Yes, I defined the cloze myself. For this workaround test, I only added a few sentences. For my other 3,500 sentences, I use Python to compute statistics on a corpus of Cantonese text that I’ve collected, and then select cloze words using that data.
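To give a rough idea of what frequency-based cloze selection can look like, here is a minimal sketch. It is an illustration only, not the actual script: it assumes the sentences are already segmented into words, and the function names, the `min_count` threshold, and the "pick the rarest word that is still common enough" rule are all hypothetical choices.

```python
from collections import Counter


def build_frequencies(sentences):
    """Count how often each word appears across all pre-segmented sentences."""
    return Counter(word for sentence in sentences for word in sentence)


def pick_cloze_word(sentence, frequencies, min_count=2):
    """Pick the rarest word in the sentence that occurs at least min_count times.

    Hypothetical selection rule: very rare words are skipped because they are
    often names or typos. Returns None if no word qualifies.
    """
    candidates = [w for w in sentence if frequencies[w] >= min_count]
    if not candidates:
        return None
    # On ties, min() keeps the first candidate in sentence order.
    return min(candidates, key=lambda w: frequencies[w])


# Tiny made-up corpus, already split into words.
corpus = [
    ["我", "食", "飯"],
    ["我", "飲", "茶"],
    ["我", "食", "麵"],
]
freqs = build_frequencies(corpus)
print(pick_cloze_word(corpus[0], freqs))  # → 食
```

A real corpus would need a word segmenter first, and the threshold would have to be tuned to the corpus size.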
Awesome! I really appreciate that! I’m going on vacation for a week starting July 27, 2018, and was hoping to be able to do lots of studying during this time. If there is any chance this change could be made before July 27, that would be so great, but I also understand if that isn’t possible.
I have a collection of 3,500 sentences. Unfortunately, 3,000 of the sentences are not in the public domain.
I plan to create at least 1,000-2,000 more sentences in the next 12 months. I would be happy to release these into the public domain.
I am part of a network of around 250 highly motivated Cantonese learners. If Clozemaster supported custom cloze collections, I would very strongly recommend Clozemaster to this group. Custom cloze collections are a killer feature for Cantonese because learning resources are scarce and tend to be plagued with errors.
Perhaps this could result in 5-10 additional Clozemaster Pro subscriptions. Additionally, this network could help contribute more public domain Cantonese sentences.
For standard Chinese (Mandarin), there are JavaScript libraries which can generate the Pinyin pronunciation for arbitrary Chinese sentences. Automatically generating Pinyin could simplify the process for adding custom Chinese sentences, since the pronunciation field could be computed automatically. I didn’t get a chance to see if you are already doing this.
For Cantonese, it is more difficult to generate pronunciation information. Pinyin is the official romanization system for Mandarin, but there are multiple non-standard romanizations for Cantonese. The most common romanization systems for Cantonese are Jyutping and Yale. This presents some challenges:
- Automatically generating pronunciation information is more difficult for Cantonese than for standard Chinese. It is still possible, though.
- If you don’t automatically generate pronunciation information for Cantonese, users will have to enter their own.
- Since there is no standard romanization for Cantonese, users will enter Jyutping or Yale, or even worse, a mixture of both! Non-standardized pronunciation information makes it harder to share custom cloze-collections.
Ideally, I think both Jyutping and Yale would be generated automatically and the user could see both romanizations.
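As a rough illustration of showing both romanizations side by side, here is a minimal dictionary-lookup sketch. The `ROMANIZATIONS` table and the `romanize` function are hypothetical and hand-made for this example. A real implementation would need a full pronunciation dictionary, word segmentation, and some way to handle characters with more than one reading.

```python
# Hypothetical, hand-made lookup table: word -> (Jyutping, Yale with tone numbers).
ROMANIZATIONS = {
    "早晨": ("zou2 san4", "jou2 san4"),
    "飲茶": ("jam2 caa4", "yam2 cha4"),
}


def romanize(words):
    """Return (jyutping, yale) strings for a pre-segmented sentence.

    Words missing from the table are passed through unchanged, so gaps in
    the dictionary stay visible instead of being silently dropped.
    """
    jyutping, yale = [], []
    for word in words:
        j, y = ROMANIZATIONS.get(word, (word, word))
        jyutping.append(j)
        yale.append(y)
    return " ".join(jyutping), " ".join(yale)


jp, yl = romanize(["早晨"])
print(jp)  # → zou2 san4
print(yl)  # → jou2 san4
```

Storing both forms per entry means users never have to type a romanization themselves, which avoids the mixed Jyutping/Yale problem.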