yes, but, lol
For English to less popular languages, the Clozemaster database has thousands of sentences per language. For English to more popular languages, tens of thousands apiece… with zero overlap. (Lots of the source sentences are the same across languages, but that’s immaterial here— if you translate one sentence into 23 languages, you’re still going to need 23 separate explanations, each one written individually from scratch.)
Soooo. In total, that’s at least several hundred thousand explanations, and possibly even millions, of which a skilled human could only hope to write a few per hour, with nowhere close to the word-by-word treatment that ChatGPT outputs.
Even at minimum wage levels, that’d be millions of dollars’ worth of work. Realistically, very few people are actually capable of writing clear and correct grammar explanations. Most native or otherwise fluent speakers of any given language rely completely on experience and intuition, so only a tiny minority of them will even have any explicit idea WHY each piece of a randomly chosen sentence is correct… and out of that already tiny minority, far fewer still will consistently be able to formulate written explanations that are clear and cogent enough to be understood by readers who are studying alone. So you’re looking at tens of millions of dollars, given the kind of wages you’d have to pay for work of that quality. Imagine the Pro fees we’d have to fork out to cover that 😵‍💫
… and that’d be the cost of human-produced explanations just in English, ahhahhaha. Not even touching all those other source languages.
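For anyone who wants to poke at the arithmetic, here is a rough back-of-envelope sketch. Every number in it is my own illustrative assumption (the explanation counts, the 3-per-hour writing rate, the $50/hour specialist wage), not an actual Clozemaster figure; the only real-world value is the US federal minimum wage of $7.25/hour. The function name `explanation_cost` is made up for this example.

```python
def explanation_cost(n_explanations, per_hour, hourly_wage):
    """Total labor cost in dollars for writing n_explanations by hand."""
    hours = n_explanations / per_hour
    return hours * hourly_wage


# All inputs below are illustrative assumptions, not Clozemaster figures.
# Each tuple is (number of explanations, explanations per hour, hourly wage).
SCENARIOS = {
    "low count, minimum wage": (500_000, 3, 7.25),      # 7.25 = US federal minimum wage
    "high count, minimum wage": (2_000_000, 3, 7.25),
    "low count, specialist wage": (500_000, 3, 50.0),   # 50/hour is a made-up rate
    "high count, specialist wage": (2_000_000, 3, 50.0),
}

for label, (count, rate, wage) in SCENARIOS.items():
    print(f"{label}: ${explanation_cost(count, rate, wage):,.0f}")
```

Swap in your own guesses for the counts and wages; the totals scale linearly with each input, so the order of magnitude is easy to sanity-check.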
I mean. There are other corrections I would love to see on here—especially for pairings of two non-English languages, which contain huge numbers of actual errors in the base Q&A (not just in explanations).
E.g., Russian from Italian, and Italian from Russian:
These databases (at least the vast majority of them) have clearly been constructed by just daisy-chaining data from the English-Russian and English-Italian databases. (This is about the best that can be hoped for out of an automated procedure, since direct auto-translation between non-English languages is rarely on offer: as far as I know, nearly ALL mainstream auto-translators from western tech companies route everything through English, which would introduce even more errors and ambiguities, like way way WAY more.)
The problem is, because English is the intermediate stopover point of this process, absolutely everything that’s baked into Russian or Italian but not English grammar—and absolutely every instance of distinct forms in Italian or Russian whose closest English translations look the same—is lost going TO English, and therefore has to be randomized (basically just “guessed” by the system) FROM English to Russian/Italian.
The three most common of these are:
• Russian past-tense verbs are gendered; English and Italian verbs are gender invariant.
• Third-person possessives are gendered to match the possessor in English and Russian (not in completely matching ways, but closely enough to make a go of it), but do not change with the gender of the thing possessed.
Italian (like every other Romance language) is the opposite on both counts: “her”, “his”, and “its” are all the same word in Italian—but that word takes on different forms depending on what fills the blank in “his/her/its _____”.
• English only has one “you”, whether singular or plural, formal or informal.
Russian has one word for informal singular “you”, and another that doubles as formal singular “you” and plural “you” (also how French does this).
Italian has one word for informal singular “you”, a second word for informal plural “you”, a third word for formal singular “you” (which is the same word as “she”!!) and a FOURTH word for formal plural “you” (which is the same word as “they”!).
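To make the “you” problem concrete, here is a toy sketch of what happens when Italian goes through English on its way to Russian. This is only my illustration of the information loss, under the assumption that the databases were chained via English; it is not how Clozemaster actually builds anything, and the dictionaries and function names are hypothetical.

```python
# Register and number carried by each Italian "you", as described above.
IT_FEATURES = {
    "tu": ("informal", "singular"),
    "voi": ("informal", "plural"),
    "Lei": ("formal", "singular"),
    "Loro": ("formal", "plural"),
}

# Going TO English, all four forms collapse into one word.
IT_TO_EN = {form: "you" for form in IT_FEATURES}

# Coming FROM English, Russian offers two candidates with nothing to choose by.
EN_TO_RU = {"you": ["ты", "вы"]}


def correct_russian(italian_form):
    """Russian form that actually matches the Italian register and number."""
    register, number = IT_FEATURES[italian_form]
    if register == "informal" and number == "singular":
        return "ты"
    return "вы"  # formal singular, or any plural


def pivot_candidates(italian_form):
    """What a chain through English can see: just the English word."""
    return EN_TO_RU[IT_TO_EN[italian_form]]


for form in IT_FEATURES:
    print(form, "->", pivot_candidates(form), "| correct:", correct_russian(form))
```

Every Italian form produces the same candidate list after the English stopover, so whichever Russian word ends up in the database is a guess, and the guess is wrong whenever it does not happen to match `correct_russian`.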
Just from these 3 differences alone, in either direction of Russian <—> Italian there are loads of wrong translations in the database.
• Loads of stuff ends up gendered the wrong way.
• EVERY instance of “you” is always completely randomized between formal/informal in both Russian and Italian, very often in non-matching ways (since English has nothing even remotely close to formal/informal register). Singular vs. plural “you” is also separately randomized in both RU and IT, unless it’s fixed by some other word(s) in context… and the "you"s even occasionally cross-pollinate with “she” and “they”, too.
• not even getting into the hot mess with possessives.
where the point is, if you were going English to Italian or English to Russian, then you’d constantly have to guess these things (and add all the other possible forms into the “Alternative correct answers” box)—but at least you’d know you had to guess.
In Russian to Italian or vice versa, on the other hand, plenty of the translations are unambiguously WRONG for reasons that are rlly nobody’s “fault”, and that are unfixable except by a person with professional-level competency in both of those languages.
These would be nice to fix first, before addressing peripheral features like longform explanations—but again, that’s laughably unaffordable for a small developer.