Screencast: always-on voice input and multiple translations from Tatoeba

Hello. I wanted to share how I’ve customized the Clozemaster website to make it more convenient for me.
I study Polish.

I wanted to practice speaking, and multiple choice was too easy since Ukrainian and Polish are close. But I quickly got tired of clicking the Record button.
So I set up a Greasemonkey script to click the Record button automatically. It’s always recording my voice now.
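
For anyone curious, a userscript along these lines could look roughly like the sketch below. This is not the actual script from the video. The selector `button.record` and the `recording` class are made-up placeholders, so you would need to inspect the page and substitute whatever really identifies the Record button and its active state.

```javascript
// ==UserScript==
// @name     Clozemaster auto-record (sketch)
// @match    https://www.clozemaster.com/*
// @grant    none
// ==/UserScript==

// ASSUMPTION: hypothetical selector. Replace it with the real one
// found by inspecting the Record button on the page.
const RECORD_BUTTON_SELECTOR = 'button.record';
// ASSUMPTION: hypothetical class that marks a recording in progress.
const RECORDING_CLASS = 'recording';

// Clicks the Record button when it exists, is enabled, and is not already
// recording. Returns true if a click was sent, false otherwise.
function clickRecordIfIdle(root) {
  const button = root.querySelector(RECORD_BUTTON_SELECTOR);
  if (!button || button.disabled) {
    return false;
  }
  if (button.classList.contains(RECORDING_CLASS)) {
    return false; // already recording, so a click would stop it
  }
  button.click();
  return true;
}

// Poll once per second, but only when running inside a browser page.
if (typeof document !== 'undefined') {
  setInterval(() => clickRecordIfIdle(document), 1000);
}
```

Polling is the simplest option here. The check for an active recording matters because a Record button is often a toggle, and clicking it again would stop the recording.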

Then I noticed that about 10% of sentences have mistaken or inexact translations.
So I made it show me all the translation variants for the sentence in the languages I already know [Ukr, Rus, Eng], using other tools.
That didn’t actually solve the issue, because with gendered verbs half of the translations could still be masculine and half feminine, etc.
I was thinking of using ChatGPT to translate from the target language, so that I would have a single correct sentence and not rely on Tatoeba.
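
As a rough illustration of that idea, the sketch below only builds the prompt and sends nothing. The function name `buildTranslationMessages` is invented for this example, and the role/content message shape is an assumption about whichever chat API ends up being used.

```javascript
// Builds chat-style messages asking a model for literal translations of one
// sentence. ASSUMPTION: the { role, content } shape matches the chat API in
// use; no request is made here.
function buildTranslationMessages(sentence, sourceLanguage, targetLanguages) {
  const system =
    'You are a translator. Translate as literally as possible, keeping ' +
    'the grammatical gender, number, and tense of the original sentence.';
  const user =
    `Translate this ${sourceLanguage} sentence into ` +
    `${targetLanguages.join(', ')}. Give one translation per language.\n\n` +
    sentence;
  return [
    { role: 'system', content: system },
    { role: 'user', content: user },
  ];
}

// Example: a Polish sentence with a feminine past-tense verb.
const messages = buildTranslationMessages(
  'Byłam wczoraj w kinie.',
  'Polish',
  ['Ukrainian', 'Russian', 'English']
);
```

Translating from the sentence being studied, rather than reusing a translation somebody else wrote earlier, is what keeps the gender of the verb consistent between the two sides.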

Maybe the Clozemaster admin will decide to implement this on the website itself.

Feel free to share your own videos too.


Awesome! Thanks for sharing. And welcome!

It’s nice when there is a wide range of translations; it is much easier, as Tatoeba translations are often not precise. Is ChatGPT necessarily any more reliable? I mean, it may well have used Tatoeba for its training data…
I have also struggled to find a smooth way to deal with gendered verbs. I thought about using the “Hint” option to give a clue, or adding the other gender as a permitted alternative answer. Neither way is ideal because you have to edit the sentence manually. However, I find that I want to check each sentence myself the first time I see it because I don’t trust the TTS pronunciation to be error-free, so I can edit it then…


I implemented the ChatGPT translation after a couple of days and have been using its translations for about a week.
These translations are more accurate, and the wording is more similar between the languages. Now I make fewer mistakes and feel less frustrated, because the Tatoeba mistakes were irritating.

On these forums I’ve seen a topic where some people actually prefer translations that are not direct but have some variety. I prefer more direct translations at the moment.

ChatGPT is not perfect either. But I feel like it’s inaccurate in about 2% of sentences, while in my Polish/Russian fast track it feels like about 10% of sentences are inaccurate.


I’m impressed that you were able to set up this system, but aside from the recording part, I don’t have a clear understanding of which problem you’re trying to solve, and how this approach can help.

There are two major kinds of issues that could lead people to give correct answers that Clozemaster doesn’t consider correct:

(1) There are multiple correct possible ways the cloze in the sentence could be completed, but Clozemaster accepts only one of them, and when the player fills in a different one, it’s marked wrong. This could happen for different reasons:
(a) The player types in a synonym of the word that Clozemaster was expecting.
(b) The player types in a word form that is a valid match for the corresponding word in the translation, but Clozemaster was expecting a different form (for instance, masculine vs. feminine, or singular vs. plural) that would also match. I think this may be what you mean by “with gender verbs half sentences could be male and half female”.

(2) There is an actual mistake in the translation alignment; the sentence and the translation don’t match.

Problem (2) comes up much more frequently for sentence pairs for which Clozemaster imported indirect translations (translations of translations) from Tatoeba. Tatoeba explicitly says that indirect translations cannot be relied upon. However, for certain language pairs where the number of aligned pairs at Tatoeba is small (which is especially common when neither of the languages is as common as, say, English, French, or Spanish), Clozemaster made the decision to import indirect translations anyway, perhaps with some manual attempt at filtering/correction that nonetheless allowed a fair number of errors through. I’ve only played languages from pairs that have a lot of sentences at Tatoeba, and thus, I think all their content was imported from direct translations rather than indirect translations. I do run into errors, but not many.

In any case, problem (1) is unavoidable unless Clozemaster were to reject all sentences but those where there is a single unambiguous choice for the cloze. This means that each sentence selected would need to provide enough clues regarding the gender and number (singular vs. plural) of the missing word to rule out all other choices. Clozes for which synonyms exist would also have to be ruled out, and I’m not even sure whether this is possible on a large scale. When someone writes up a language test, they can select sentences where the gender and number are indicated by other words in the sentence or by context from surrounding sentences, and they can rule out synonyms by specifying the first letter of the cloze. But both are labor-intensive, probably beyond what Clozemaster can handle. (In addition, providing the first letter of the cloze makes the word easier to guess, which may not be what a Clozemaster player wants.)

Providing translations in multiple languages from Tatoeba would not help address problem (1). If the translation is in a language such that the gender and number of the cloze word are unknown (let’s say it’s an English verb), person A could translate the word into Russian as a feminine verb form, while person B translates it into Ukrainian as a masculine verb form. If you’re using the Ukrainian-from-English language pair, the fact that the Russian word corresponding to the cloze has a feminine verb form will actually mislead you into thinking that the Ukrainian should, too.

And ChatGPT can’t help you here, either. It has no idea whether person B translated the English word as a masculine or feminine verb form. Likewise, it has no idea which of a number of synonyms the translator used.

The thing that does help in this situation is to enable the game settings “Typing Color Hint: On” (which, as you type letters into the cloze, will cause them to be displayed green if correct or red if incorrect) and/or “Text Box Size: Changes” (which causes the size of the cloze to match the length of the text). Are you using these options?
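
To make the color hint concrete, here is a minimal sketch of the idea. It is only an illustration and not Clozemaster’s actual implementation; the function name `colorHint` and the plain position-by-position comparison are assumptions.

```javascript
// Compares the text typed so far with the expected answer, position by
// position, and returns a color for each typed character.
// ASSUMPTION: an exact, case-sensitive match per position.
function colorHint(typed, expected) {
  const expectedChars = Array.from(expected);
  return Array.from(typed).map((char, index) => ({
    char,
    color: char === expectedChars[index] ? 'green' : 'red',
  }));
}

// Typing the masculine plural "robili" when the feminine singular "robiła"
// is expected: the shared stem is green, the ending turns red.
const hint = colorHint('robili', 'robiła');
```

With feedback like this, the wrong gender or number shows up at the first letter of the ending, before the answer is submitted.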

I was trying to solve the problem of indirect translations sometimes being wrong. It was irritating, and I felt like I had to remember that a particular sentence was incorrect. I tried reporting sentences a few times, but if a sentence isn’t corrected within a day, it keeps causing trouble.
The ChatGPT translation helps avoid most of the mistakes.
Voice typing is my choice. It’s faster, and I want to believe it helps with speaking skills.
Occasionally I type the answer instead, when I can’t pronounce a word correctly after a few attempts.
I’ve disabled “Text Box Size: Changes” since I wanted a bit more challenge.