NSFW Not exactly functional - can it be merged?

I appreciate the impulse behind the attempt at separation (even if I think it’s not really needed), but there are a number of issues in implementation that are problematic.

First, and most critically: This separation means a number of completely benign CLOZE words may not be available to the student. The “NSFW” filter seems to look at the overall sentence and its general tenor, not the cloze.

Italian Example : Voglio morire, però (CLOZE) non posso.
The cloze “però” means “but” - and is a basic building block word. Even if one thinks this mild sentence is NSFW, the cloze itself is needed in any functional learning program.

Secondly, the filter seems to look more at translation, not always cloze or sentence, excluding a number of benign phrases.

Italian Example : Oggi fa un freddo cane (CLOZE) (translation: “today is f-ing cold”).
There are two issues here: The first is that this sentence simply means “it’s super cold today” - there is NO obscenity in it at all, even if the English translation uses the f-word as an amplifier. The second issue is the same as above: The cloze simply means “dog” - a benign and very common word.

Meanwhile “coglioni” plural (balls) is in Fast Track (“Lui ha i coglioni” = He has guts), while “coglione” singular is shunted to NSFW.

I really don’t want this track pulled out separately for my learning - will there be an option to merge it back?

6 Likes

Thanks for the feedback! This is all helpful to know.

We’ve received more than a few complaints about NSFW content, most of them very reasonable. That sort of content isn’t necessary to learn the language, and it makes Clozemaster less enjoyable or simply unusable for some users (teachers, for example, or people playing with their kids). Some people seem to be fine with it, however, like you mentioned, and some even seem to like the occasional NSFW sentence, so that’s why we’re moving those sentences rather than simply deleting them.

This is correct, it’s not just the cloze but the entire sentence.

This is correct as well. Trying to find NSFW content is a challenge across 60+ languages and millions of sentences. Our approach so far, being completely transparent, has been to pass the English translations through an AI/NLP content moderation filter and then have a human manually go through thousands of flagged sentences to decide whether they’re actually ok, NSFW, or should be outright deleted (especially inappropriate sentences). The rationale is to start with this approach since 1) AI filters are probably best trained on English, and 2) we just need someone who knows English to go through the flagged sentences. Attempting to find language-specific NSFW content like you mentioned will have to wait for a future iteration.
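To make the two-stage process described above concrete, here is a minimal sketch, not Clozemaster's actual code. The names `Sentence`, `Verdict`, `flag_for_review`, `apply_review`, and `toy_filter` are all hypothetical, the keyword check is only a stand-in for a real AI/NLP moderation filter, the English rendering of "Voglio morire, però non posso." is my own, and the review decisions are made up for illustration.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Callable, Dict, Iterable, List


class Verdict(Enum):
    OK = "ok"          # false positive, stays in the regular collections
    NSFW = "nsfw"      # moved to the separate NSFW collection
    DELETE = "delete"  # especially inappropriate, removed entirely


@dataclass(frozen=True)
class Sentence:
    text: str          # sentence in the language being learned
    translation: str   # English translation (the only thing the filter sees)


def flag_for_review(sentences: Iterable[Sentence],
                    is_flagged: Callable[[str], bool]) -> List[Sentence]:
    """Stage 1: run the automated filter over the English translations only."""
    return [s for s in sentences if is_flagged(s.translation)]


def apply_review(flagged: Iterable[Sentence],
                 reviewer: Callable[[Sentence], Verdict]) -> Dict[str, Verdict]:
    """Stage 2: a human decides the outcome for each flagged sentence."""
    return {s.text: reviewer(s) for s in flagged}


def toy_filter(translation: str) -> bool:
    """Hypothetical stand-in for the AI filter: naive keyword matching."""
    return any(word in translation.lower() for word in ("f-ing", "die"))


SENTENCES = [
    Sentence("Oggi fa un freddo cane", "Today is f-ing cold"),
    Sentence("Voglio morire, però non posso.", "I want to die, but I can't."),
    Sentence("Lui ha i coglioni", "He has guts"),
]

# Made-up human decisions, for illustration only.
human_decisions = {
    "Oggi fa un freddo cane": Verdict.NSFW,
    "Voglio morire, però non posso.": Verdict.NSFW,
}

flagged = flag_for_review(SENTENCES, toy_filter)
verdicts = apply_review(flagged, lambda s: human_decisions[s.text])
```

Because only the translation is inspected, "Lui ha i coglioni" is never flagged while "Oggi fa un freddo cane" is, which matches the behaviour reported earlier in the thread.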

To your point about obscenity only in the translation, we still don’t want that translation coming up, even if the sentence is ok like you mentioned.

Not at the moment. You could add the ones you’d like to your reviews and then they’ll show up if you use the “Review All” feature, or add the ones you want to a custom collection.

We’re open to figuring something else out if the general consensus is that people want the NSFW content included. Our thought, however, is that its absence would be largely unnoticed given it doesn’t affect many sentences and there’s still plenty more content (for the Italian Fast Track, for example, only a couple hundred sentences were affected, < 2%, leaving 18,000+ sentences, which a very small percentage of users will ever even fully play through), meanwhile one inappropriate sentence could turn someone off from using Clozemaster entirely.

1 Like

i definitely would miss the NSFW content, i enjoy that this site isn’t as ‘tame’ as sites like duolingo and allows real unfiltered language that people would actually use on it. besides, if you never learn those words, how do you intend to communicate in another language when you need to understand or use NSFW words? like imagine if someone learning english never learned swear words, they’d have no idea how to interact with others or understand a lot of content in english (e.g. stand-up comedians would be completely impossible for them to figure out, or even 80s action movies where people swear every few sentences).

like imagine you learned a foreign language, dated or married someone who only speaks that language, and they wanted you to talk dirty to them, and you had no idea how to do so, because you never learned NSFW words in their language.

2 Likes

Totally agreed, they’re just being moved to a separate collection so they can be avoided if you’d like. Only the especially offensive and inappropriate sentences are being deleted: swearing, for example, is being kept, while sentences that are just meant to be offensive, swearing or not, are being removed. We might also consider trying to expand the NSFW content in the future given the benefits you point out.

3 Likes

@mike

Child protection is one of the good reasons for isolating NSFW content. However, the current way of isolating it isn’t the best approach. Rather, I would suggest that the Clozemaster team implement a “tagging” system as a long-term reform.

On Tatoeba (the sentence source), for example, each sentence can be tagged by the admin (i.e. public tags) or by each user (i.e. private tags). The number of tags is unlimited. If you implement a similar tagging system in Clozemaster, each user can filter out sentences with specific tags such as NSFW from their personal settings. Or we can play only sentences with specific tags. You don’t need to “duplicate” a sentence by adding it to another personal collection.

The tagging system has four more advantages over the current “add to personal collections” approach.

  1. A sentence often belongs to more than one public collection, such as the Most Common Words Collections (MCWC), the Fast Fluency Track (FFT) and the Random Collection (RC). The copies are physically cloned and not integrated in a central repository. So, if the admin updates a sentence in MCWC, the update isn’t automatically applied to the same sentence in FFT and RC. Why don’t you just integrate them into one sentence in a central repository and tag it with MCWC, FFT and RC? If you want to additionally tag a sentence with “NSFW”, you don’t need to do so three times across MCWC, FFT and RC.

  2. I sometimes add MCWC sentences to more than one personal collection for different learning objectives. For example, a sentence is very difficult and requires additional reviews in the text input mode. Also, the same sentence is good for the full-sentence transcribe mode. In this case, I add the sentence to two collections. If I want to modify the note section, I need to edit it three times. And the number of ready-for-review items is tripled.

  3. With the tagging system, you can also organize sentences by the degree of difficulty (e.g., easy, normal and difficult). And you can customize the interval for review for each difficulty tag.

  4. I remember on a different forum discussion @rinkuhero proposed to organize collections by genre/topic (e.g., professions, food). To be honest, I don’t need this function but I do understand some people like this idea. So it’s suitable for “private” tags.

All including NSFW can be realized solely by the tagging system.
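As a rough illustration of the tagging idea above, here is a minimal sketch under stated assumptions: one central list of sentences, admin-assigned public tags, per-user private tags, and a per-user set of excluded tags. The names `Sentence`, `User`, `tags_for`, and `playable` are hypothetical, the tag assignments are made up, and nothing here reflects how Clozemaster or Tatoeba actually store their data.

```python
from __future__ import annotations

from dataclasses import dataclass, field


@dataclass
class Sentence:
    id: int
    text: str
    # Public tags set by the admin, e.g. collection names or "NSFW".
    public_tags: set[str] = field(default_factory=set)


@dataclass
class User:
    # Tags this user never wants to see (a personal setting).
    excluded_tags: set[str] = field(default_factory=set)
    # Private tags, keyed by sentence id, visible only to this user.
    private_tags: dict[int, set[str]] = field(default_factory=dict)


def tags_for(sentence: Sentence, user: User) -> set[str]:
    """All tags that apply to a sentence from this user's point of view."""
    return sentence.public_tags | user.private_tags.get(sentence.id, set())


def playable(repo: list[Sentence], user: User, tag: str) -> list[Sentence]:
    """Sentences carrying `tag`, minus anything the user has excluded."""
    return [
        s for s in repo
        if tag in tags_for(s, user) and not (tags_for(s, user) & user.excluded_tags)
    ]


# One copy of each sentence, tagged instead of cloned (made-up assignments).
repo = [
    Sentence(1, "Oggi fa un freddo cane", {"MCWC", "FFT", "NSFW"}),
    Sentence(2, "Lui ha i coglioni", {"FFT"}),
]

cautious = User(excluded_tags={"NSFW"})
adult = User(private_tags={2: {"difficult"}})

fft_for_cautious = [s.id for s in playable(repo, cautious, "FFT")]
fft_for_adult = [s.id for s in playable(repo, adult, "FFT")]
difficult_for_adult = [s.id for s in playable(repo, adult, "difficult")]
```

With a single copy per sentence, an edit or an extra tag such as "NSFW" is applied once, and the same filter covers public collections, private difficulty tags, and topic tags.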

@Dcarl1
I don’t speak Italian, but your Italian examples make your point easy for me to understand. Thank you for bringing up this issue.

5 Likes

Lately, I’ve been thinking along the same lines as you, @MsFixer, that a tagging facility would be extremely helpful. I was going to propose it myself, but you beat me to it. :slight_smile:

Maybe you could write a separate enhancement request?

4 Likes

Er aber, sag’s ihm, er kann mich im Arsch lecken.

“O Romeo, that she were! Oh, that she were
An open arse, and thou a poperin pear.”

After mulling a few days about the whole NSFW concept, I find it might be - once again - good for business, but otherwise it is … (only NSFW words would do).

You simply can’t learn a language without NSFW content. Neither Goethe nor Shakespeare (as above), not to mention Boccaccio or Catullus, is accessible in full without it.
And for those not into the classics I might add GoT and any crime series to my list.
Recently, I found that in “Petra Delicato”, a quite unremarkable Italian crime series, I had many more problems following the private-life talk than the criminal cases, due to my limited understanding of NSFW dialogue. And that has a 12-year+ rating!

As a teacher - or designer of a teaching platform - one has the obligation to teach what’s necessary and not what might fit the world view of the learner. The teacher knows better - otherwise the whole concept of teaching falls to pieces.

Additionally, the criteria are arbitrary or - worse - biased. “Suicide” seems to condemn a sentence to NSFW but “crucify” does not. Hooray for the fans of torture!

Let people sort out for themselves what they don’t like to learn, but don’t decide for me and don’t let the majority decide for me.

5 Likes

Thanks for all the input! Good points regarding a tagging system. We’ll have to give it more thought. There are additional contexts to consider - how downloads would work in the mobile app, how to handle multiple instances of the same sentence but with different clozes (for Grammar Challenges for example), how to actually migrate everything if we wanted to change the underlying architecture, etc. It would indeed be a long-term reform like you mentioned @MsFixer, but there are some worthwhile potential benefits like you mentioned as well.

My initial thought is that the NSFW collection achieves exactly what you’ve described - if you’d like to learn NSFW content, you can play the NSFW collection.

Respectfully, I think your post is an overreaction. Here are some actual Clozemaster sentences that are flagged to be moved to the NSFW collection:

NSFW
  • That is not art. That is a vagina with teeth.
  • A severed penis constitutes a solid piece of evidence for rape.
  • It’s easy for a slut to go from guy to another.
  • I’m going to find you wherever you are and kill you.
  • Did you want to cut off his head?
  • Which of the two has the bigger penis?

We want high school students to be able to use Clozemaster. We want teachers to be able to use Clozemaster. We want people who don’t feel like seeing a bunch of sentences about suicide and killing people to be able to use Clozemaster.

You really can. If I’m traveling to Spain I’m not sure why I’ll need to learn a bunch of sentences about suicide, for example. And I’ll probably do fine in my German class without “er kann mich im Arsch lecken.”; even then, I’ll probably still be able to figure it out even if I never see it on Clozemaster.

It’s best effort, there are some misses like you mentioned, but hopefully we’ll catch the biggest offenders. And nearly all of it is still available for you to learn, just moved to a separate collection. We are open to and we are considering alternative solutions of course.

4 Likes

You want high school students to be able to use clozemaster?

O please. They are very much at ease with these things. It is always the parents trying to shield their “innocents” from things they already know, understand and do.

But even so, do you really think our children are better off without the sentence in NSFW:
“Non toccarmi mai più!” (Don’t touch me again!)
It is a terrible fact but I think scores of children are in dire need of this sentence everyday and in every language in the world. And you keep them from learning it because someone “smells sex” here.

You don’t want to learn about suicide? Today there is an article in “Il Messaggero” about the suicide of Giovanna Pedretti, with political implications. Want to cut yourself off from actual political discussions?

I think you are kowtowing to a lot of ideologies here, fearing the loss of customers.
Maybe I am overreacting.
But I am sure that everybody who tries to be the “benign censor” is going to look foolish in the future.

2 Likes

I agree with you, but we don’t determine what high school or younger students learn, not even their teachers do; only their parents have that authority. Therefore, if the Clozemaster team wishes to be more accessible like Duolingo, they need to hide the NSFW content that may be alarming to some people. :sweat_smile:.

5 Likes

I believe this NSFW section is necessary, for reasons already mentioned in this thread.
However, I believe there are too many sentences that end up in this category - you really need to have a dirty mind to find them inappropriate…

Explicit sentences about having sex definitely belong in NSFW, but why simple sentences about hugging and kissing? I guess a child can appreciate a hug from its mom or dad.

I know suicide is a sensitive topic - but I believe it could be good to know how to express for example “I cut myself on a knife” (seeing a doctor after the kitchen accident)

I partly agree with @anon94972132, some languages seem to have a lot of “maybe not so nice words” and learning them is part of learning a language. Yes, you can learn a language without them but most likely you will end up in situations you don’t fully understand.

I guess I can find more examples, but anyway…
As I said, I believe the section is necessary, but the selection (filter/algorithm…) is a bit off for now. Therefore I have decided to do this selection in order not to miss a useful word.

2 Likes

@mike
Good to hear that my idea about a tagging system is worth weighing further as a possible option.

As a short-term solution, it’s still important for grown-up learners to have NSFW content merged into the other collections.

The current separation is like this: You ban zombie movies at theaters in all states except Alaska, and encourage lovers of such movies to go to Alaska. But the problem is that they have to watch ten different zombie movies in a row. One in ten is nice, but ten in ten isn’t enjoyable.

Rather, people here are asking Clozemaster to keep bundling different types of movies, including zombie ones, and then build a new kid-friendly theater separately if necessary. That’s what Duolingo does with the “Duolingo School” platform.

As other people pointed out, the NSFW filtering criteria and algorithm should also be reconsidered. Isn’t it too early to let AI pick out NSFW content?

3 Likes

What prevents children from playing that NSFW collection?

@davidculley : Nothing prevents it. And in fact pulling out “the good stuff” sets it aside for easy exploration rather than needing to wade through 20K sentences to find it…

As an analogy: In the 1920’s, when James Joyce’s Ulysses was a censored book, all those people who tried to read it for “the dirty bits” were in for a huge surprise when they had to wade through hundreds of pages of rather confusing modernist prose to find them.

But that’s an editorial decision here, therefore so be it, and let people then decide how to approach it. Me, I am including it - and yes i said yes i will yes.

Why explicit language needs to be part of Clozemaster

To become fluent in a language, you need to understand both explicit words (such as swear words) and explicit content.

Imagine taking a trip to Latin America, taking a Salsa/Tango course, meeting a lovely Spanish-speaking woman/man, one thing leads to the other, and as soon as the topic switches to having (safe consensual) sex (between two adults), you’re blanking out (because you (self-)censored all the NSFW stuff).

Why I dislike the current solution

That’s a good point.

As a grown person who wants to become fluent in a language, you need to be exposed to explicit language because, like it or not, people in the real world use explicit language. As a grown person, I myself don’t mind if, once in a while, a swear word or otherwise explicit language appears in my flashcards, amidst all the non-explicit flashcards.

But if all the explicit language is pulled out of the regular review schedule and moved into its own section, it might be too much to read sentence after sentence after sentence about all the nasty stuff, including depressing topics such as killing or suicide, and only about that (for dozens or even hundreds of sentences a day).

It’s even harmful from a pure learning perspective because it entirely removes the Interleaving aspect (and partly the Spacing aspect) from the STIC method (Spacing, Testing, Interleaving, Categorizing).


Proposal for different solution

@mike: I’d much prefer if there was a setting in the account settings, asking you whether you want to see explicit language, yes or no.

(This is better than asking whether you’re over 18, yes or no; because being over 18 does not necessarily mean that you want to see explicit content.)

  • Using a tagging system or whatever, the sentences containing explicit language would then simply not be shown if you activated the “no explicit language” setting.
  • If you want to practice the explicit language, then it would be shown, integrated into the regular collection instead of being ripped out into its own, separate collection.
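The two bullet points above could look roughly like the following sketch. It assumes each flashcard carries a boolean `explicit` marker and that the account has a single on/off preference; `Card`, `review_queue`, and the flag values are hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class Card:
    text: str
    explicit: bool = False  # hypothetical marker, e.g. derived from a tag


def review_queue(cards: List[Card], show_explicit: bool) -> List[Card]:
    """Keep the regular order; hide explicit cards only if the user opted out."""
    return [c for c in cards if show_explicit or not c.explicit]


# A regular collection with explicit cards left in place (made-up flags).
collection = [
    Card("Lui ha i coglioni"),
    Card("Oggi fa un freddo cane", explicit=True),
    Card("Voglio morire, però non posso.", explicit=True),
    Card("Non toccarmi mai più!"),
]

with_explicit = [c.text for c in review_queue(collection, show_explicit=True)]
without_explicit = [c.text for c in review_queue(collection, show_explicit=False)]
```

Because the filter runs over the regular collection rather than a separate one, explicit cards stay interleaved with everything else for users who leave the setting on.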

Which brings me back to my other question:

What would prevent minors from enabling the explicit language content?


A better name for the NSFW collection

I would prefer naming it “explicit language” rather than NSFW content, like the music industry does with albums and singles.

For a start, not everyone knows that NSFW stands for “Not Safe For Work”.

Besides, who uses Clozemaster at work?

Furthermore, a colleague of mine recently became a dad. Everyone knows that you first have to have sex before you can become a dad. So you can talk about sex even at work (in an innocent, decent manner, of course). So it’s not really whether it’s safe for work or not (which is subjective) but whether the language is explicit or not.


Violence is unacceptable

P.S.: Just to make it very clear: I am against all forms of abuse and violence, sexual or non-sexual. I do not condone anything that any decent human being would find terrible. I am simply a language learner who wants to become truly fluent in my chosen language.

Interestingly it seems that most children I know are already well-versed in most of what is contained in NSFW and more. Personally I can choose whether or not to venture into NSFW so I’m OK with that. (On a lighter note, I once let a frustrated gondoliere know that I understood his under-the-breath foul comment. We laughed and talked, and he offered us an extended ride in his beautiful gondola as an apology, half price!)

1 Like

Violence is unacceptable

The subjectivity of this being included is what has me flummoxed.

In Italian all conjugations of the verbs “to die” and “to commit suicide” are in NSFW. The former just boggles me, as it’s a core verb. But setting that aside, verbs for “to fight” are not in “NSFW.” Nor is the word for assassin. Bomb, yes. Sword, no. War, no. I guess the sentiment is still: “Dulce et decorum est pro patria mori.”