Comment by 55555

1 day ago

I disagree because of how AI is progressing and because there are tons of neglected language markets they can pick up. Obviously your approach can work too, perhaps better. But 95% of language learning tools don't support Thai (my target language), for example, so I am an eager user for that reason alone. I think they'll be able to make a generalized curriculum and have the AI use it in all languages.

Most of the generalized curriculum stuff out there is crap because languages differ from each other in substantial ways. LLMs should in principle help here, as they can use their knowledge of a language's structure to adapt the curriculum, but we're just not there yet with context windows and thinking capabilities. They will need at least a per-language (ideally per-language-pair) system prompt that contains a rough outline of the curriculum.

  • I think the curriculum areas you're referring to are for learners in the beginning and intermediate stages. In which case, fair enough, although I still think you could get pretty far by just prompting an LLM, as the LLM has read hundreds of books teaching how to learn each language. But that's not really my point; my point is that once you're an advanced learner (they claim this is their target market) who knows about 12,000 words, I think you know almost all of the grammar, and the remaining bits will get picked up along the way effortlessly via immersion. What you need help with in this stage is slogging through the next 10,000 vocab words you need to learn to get to extreme fluency or the next 25,000 you need to learn to become plausibly native-level, as well as the speaking and reading practice to make your reading faster (if it's a different character set to your native language) and make your speaking effortless.

Would you rather have a tool that teaches you accurate conversational Spanish?

Or something that tries to teach 60 languages but does so poorly?

My tool, https://nuenki.app, supports Thai if you'd like to try it. I added it at the request of a user, who seems to be happy with it.

It's a browser extension that finds English sentences in webpages and translates the ones at your difficulty level into the language you're learning.

  • Thank you, I will try it, although I'd prefer to translate entire sentences into Thai randomly. Perhaps you can add this advanced mode. Actually, I saw your app before while looking for an alternative to Toucan that supported Thai, but at that point in time you hadn't added support yet. Thanks for doing so.

    • Okay, I installed it and this is pretty great. Although I think your extension doesn't work for Thai the way you think it does. Because Thai puts spaces between sentences rather than between words, it's translating entire sentences even with the "words only" setting enabled. This is what I want anyway, but it will be too difficult for most learners. I have written various pieces of Thai-learning software, and just so you know, you should use an LLM to do word-splitting, not a software library. If you do use a library, you need to split greedily, always taking the longest possible word at each position, but it won't work well. Basically you can't tell without a brain whether a string is a lot of small words next to one another or a smaller number of compound words. IME only an LLM or a human will do a good job of this.
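To illustrate the longest-match problem described above, here is a minimal sketch in Python. The function names and the four-word toy dictionary are assumptions made for this example, not any real library's API or lexicon. It uses a commonly cited ambiguous string, "ตากลม", which can be read as ตา + กลม (roughly "round eyes") or ตาก + ลม (roughly "exposed to the wind").

```python
def longest_match_segment(text, dictionary):
    """Greedy left-to-right split: at each position take the longest dictionary word."""
    max_len = max(len(word) for word in dictionary)
    tokens = []
    i = 0
    while i < len(text):
        # Try the longest candidate first, then progressively shorter ones.
        for size in range(min(max_len, len(text) - i), 0, -1):
            candidate = text[i:i + size]
            if candidate in dictionary:
                tokens.append(candidate)
                i += size
                break
        else:
            # No dictionary word starts here: emit one character and move on.
            tokens.append(text[i])
            i += 1
    return tokens


def all_segmentations(text, dictionary):
    """Return every way to split text entirely into dictionary words."""
    if not text:
        return [[]]
    results = []
    for size in range(1, len(text) + 1):
        head = text[:size]
        if head in dictionary:
            for rest in all_segmentations(text[size:], dictionary):
                results.append([head] + rest)
    return results


# Toy dictionary, assumed for illustration only.
TOY_DICTIONARY = {"ตา", "กลม", "ตาก", "ลม"}

print(longest_match_segment("ตากลม", TOY_DICTIONARY))
print(all_segmentations("ตากลม", TOY_DICTIONARY))
```

The greedy pass commits to ตาก + ลม because ตาก is the longest word at the first position, while the exhaustive pass shows that ตา + กลม is equally valid as far as the dictionary is concerned. Choosing between the two takes context, which is the part the comment argues needs an LLM or a human.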


  • That's a pretty sick idea. Unfortunately I presume it involves sending your browsing data (e.g. page contents) to the server?

    • Yeah, though I've added lots of privacy protections to at least partially mitigate that:

      - There's a global blacklist of sites, as well as phrases in the title/URL (e.g. "bank")

      - You can blacklist sites yourself

      - Each sentence is run against filters checking for medical/legal/etc info, as well as checks for addresses, card/social security numbers, etc. All the checks are done client-side.

      - There are also some special implementations, e.g. it looks at the source code of websites to work out if they're an instance of an American health portal that I've forgotten the name of - each doctor's surgery self-hosts it.

      - Websites can add `nuenki-ignore=true` on their end, if they'd like to disable it.

      And of course it doesn't log anything, though there is an anonymous cache in order to make it economical.
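As a rough illustration of the kind of checks listed above, here is a minimal Python sketch of a page-level blocklist plus a per-sentence scan for card and US social security numbers. This is not the extension's actual code: every name, pattern, and blocklist entry below is a hypothetical stand-in, and a real browser extension would run the equivalent logic in JavaScript on the client. The `nuenki-ignore=true` opt-out is left out because the comment doesn't say where that flag lives.

```python
import re
from urllib.parse import urlparse

# Hypothetical values, for illustration only.
BLOCKED_PHRASES = ("bank",)
BLOCKED_HOSTS = {"example-health-portal.test"}

# 13 to 19 digits, optionally separated by spaces or hyphens.
CARD_CANDIDATE = re.compile(r"\b(?:\d[ -]?){13,19}\b")
# US social security number in the common 3-2-4 layout.
SSN_PATTERN = re.compile(r"\b\d{3}-\d{2}-\d{4}\b")


def luhn_valid(digits):
    """Luhn checksum, used to cut false positives on long digit runs."""
    total = 0
    for index, char in enumerate(reversed(digits)):
        value = int(char)
        if index % 2 == 1:
            value *= 2
            if value > 9:
                value -= 9
        total += value
    return total % 10 == 0


def page_is_blocked(url, title, user_blocked_hosts=frozenset()):
    """True if the host is blocklisted or the URL/title contains a blocked phrase."""
    host = urlparse(url).hostname or ""
    if host in BLOCKED_HOSTS or host in user_blocked_hosts:
        return True
    haystack = f"{url} {title}".lower()
    return any(phrase in haystack for phrase in BLOCKED_PHRASES)


def sentence_is_sensitive(sentence):
    """True if the sentence appears to contain an SSN or a card number."""
    if SSN_PATTERN.search(sentence):
        return True
    for match in CARD_CANDIDATE.finditer(sentence):
        digits = re.sub(r"\D", "", match.group())
        if luhn_valid(digits):
            return True
    return False
```

Only sentences on pages where `page_is_blocked` returns `False`, and for which `sentence_is_sensitive` returns `False`, would be sent for translation. Plain substring matching on phrases like "bank" is deliberately over-cautious here: it will also skip a page about a riverbank, trading some coverage for privacy.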
