What happens after sentences/translations are updated?

I have submitted hundreds of error reports, and three of them have been accepted so far: two were for English translations in the Indonesian course, and one was for a Japanese lesson sentence (i.e. in the target language). I’m so glad Clozemaster is more actively handling user reports these days. :smiley:

I have three questions about the updates:

  1. Sentence discussion – I left a comment and alternative translations on the sentence discussion page (topic ID: 26366) in addition to filing the error report. As per my suggestion, Clozemaster kindly updated the unnatural Japanese sentence. Today, I got an auto email notification saying "the cloze-word was updated from {{忠実}} to {{誠実}}", but the header of the SD remains as 忠実. What if I redo the updated correct lesson and push the SD icon? Does the system A) lead me to the same previous SD page under the same topic ID, or B) create a completely new topic ID? If the answer is the latter, that means my previous “hard-core” explanation is hidden from fellow learners unless I manually link the new SD to the previous one. FYI: Duolingo also frequently updates sentences in target languages. The URLs of Duolingo SDs remain, but the headers are replaced with the new ones. And there is no original sentence information, so post-update readers of the SDs often get confused about what the previous learners were talking about.

  2. Different collection – The abovementioned update on the Japanese sentence changed the cloze-word. If the old and new cloze-words belong to different frequency ranks, what happens after the update? Let’s say… the old wrong {{忠実}} was in the Most 2,000 collection, and the new {{誠実}} is now in Most 3,000 – the numbers are just hypothetical. Does the system subtract the sentence from the total number of “played” sentences in Most 2,000 and add it to Most 3,000 as a “new” sentence?

  3. Source link – The updated lesson gives a source link to Tatoeba (Tatoeba sentence ID: 176835). But the sourced sentence remains as the wrong {{忠実}}. I’m not saying Clozemaster should ask Tatoeba to update it too, because Clozemaster cannot act on behalf of the error reporters. But the mismatch (different versions) between Tatoeba and Clozemaster may confuse fellow learners. They will wonder which one is the correct/better one. Moreover, most of the sentences are published on Tatoeba under the CC BY-SA 2.0 license, which explicitly requires other users, including Clozemaster, to indicate if changes were made to the original works. So the Clozemaster system should display a “modified” icon or something like that in order to comply with the license requirement. And the ShareAlike (SA) option means “If you remix, transform, or build upon the material, you must distribute your contributions under the same license as the original”. And the original license requires Attribution (BY), which means Clozemaster shall also display my username (MsFixer) as the author of the derivative work, in addition to the appropriate credit to the original author from Tatoeba. This may be technically challenging, especially for mobile apps with limited space on a screen. The only feasible way I can come up with is to compile all change histories in one web page and link the “modified” icon to that change history page. But I guess the Clozemaster team has a better idea.

I’m sorry to drop the three huge bombs at once.


Thanks for the post and all the reports!

  1. B - a new topic ID is created. This is not ideal for the reasons you described, though we’ve opted for it for now to avoid the potential confusion in updating the sentence but leaving the existing discussion like you mentioned.

  2. At the moment the changes are not that smart - the sentence is updated in the collection it belongs to without regard to the frequency ranking. At some point we’ll figure out redistributing sentences after updating if needed. For now we’re simply focused on resolving the reports and improving sentence/translation quality.

  3. Source link - good points here as well. A single page that lists all changes is a good idea.

Work in progress on all three. Any other questions or feedback, please feel free to let us know of course :slight_smile:


Thank you @mike for taking my questions!

  1. Got it. I’ll manually link previous SDs to the new updated ones for a while. If you like the idea of a single page for all change logs, maybe you can embed the hyperlinks to the previous SDs into the log page. This can be done by a bot in the future, so it won’t bother report handlers and will let them stay focused.

  2. Redistribution is critical because I found many user reviews highly appreciating the well-organized structure of collections based on word frequency. It’s like a great navigator for mountain climbing. At the same time, however, it won’t be that easy. One of the major complaints from Duolingo users is “My tree was fully updated and my progress is all gone! Let me go back to the old stable version!” I remember so many users complained when Duolingo completely restructured the “units” based on the CEFR framework last summer. For example, I saw a comment like this: “I was so close to finishing Unit 3, but now I have half undone. Don’t move the goalpost without my consent!”. Given these incidents, I personally think that it’s better for Clozemaster not to redistribute each updated sentence one by one, but to do so in a bundle as a “major” update. If possible, both the pre-redistribution and post-redistribution versions would be available and each Clozemaster user could choose between them, like typical version management of software.

  3. I have to apologize for my mistake. Most Tatoeba sentences are published under CC BY 2.0, not CC BY-SA 2.0. So, you don’t need to display the name of the error reporter. But still, you need to indicate if changes were made.
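
To make the change-log idea from points 1 and 3 a bit more concrete, here is a minimal Python sketch of what one log entry could look like. This is only my illustration, not how Clozemaster actually stores anything: the `SentenceChange` class, its field names, and `render_log_line` are all hypothetical. The IDs are the ones from my first post, and the new topic ID is left empty because I don’t know it.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class SentenceChange:
    """One entry on a hypothetical change-log page (all names are assumptions)."""
    source_sentence_id: int        # e.g. the Tatoeba sentence ID
    old_cloze: str                 # cloze-word before the update
    new_cloze: str                 # cloze-word after the update
    old_topic_id: Optional[int]    # sentence discussion topic before the update
    new_topic_id: Optional[int]    # topic created after the update, if known

    @property
    def modified(self) -> bool:
        # True when the entry should carry a "modified" marker.
        return self.old_cloze != self.new_cloze


def render_log_line(change: SentenceChange) -> str:
    """Build one human-readable line for the change-log page."""
    parts = [f"Source sentence {change.source_sentence_id}"]
    if change.modified:
        parts.append(f"[modified] cloze {change.old_cloze} -> {change.new_cloze}")
    if change.old_topic_id is not None:
        parts.append(f"previous discussion: topic {change.old_topic_id}")
    if change.new_topic_id is not None:
        parts.append(f"current discussion: topic {change.new_topic_id}")
    return "; ".join(parts)


change = SentenceChange(176835, "忠実", "誠実", 26366, None)
print(render_log_line(change))
```

A bot could generate one such line per accepted report, so the “modified” icon only needs to link to the page, and the old SD stays reachable from there.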

I would like Mike to send my thank-you note to all the members of the user report handling team. I believe adding a new language course with 10K+ sentences can be done in several hours by a tech-savvy person. But it may take report handlers the same amount of time to update only 10 sentences manually. And there are so many unprocessed error reports piling up.