Comment by coliveira

5 days ago

The issue is that French, Italian, African, or Japanese people shouldn't have the inconvenience of instructing the LLM tool to get the basic facts about their own culture. They should use an LLM that has already been trained like that by default. Nobody has obligation to use a tool that thinks it is talking to an American. If I go to Google, for example, I want to get facts about my own country in my own language.

9 comments

cortesoft 5 days ago

Wouldn't those people be asking the questions in their own language in the first place? The model will reply in the language you use. This thread is about people asking for information about a language that is not the one they are messaging the LLM in.

schubidubiduba 5 days ago

Even if the model will reply in my language, I often notice it searching in English. Or thinking in English. There's always something lost in translation. Sometimes it's just minor nuances. Other times it mangles the legal facts with those of other countries.
- Schlagbohrer 5 days ago
  
  This sounds like the problem of people calling "911" as the emergency number which they see in so much US-American media but which is not the emergency number in their own country.
  
numpad0 5 days ago

They always sound like an obnoxious American tourist talking through a translator. The chatbot training dataset is the same, and foundation models are always built with >50% American English data for some reason.

instagraham 5 days ago

> Nobody has obligation to use a tool that thinks it is talking to an American

Very very emphatic agree from my end, thanks.

TimTheTinker 5 days ago

> Nobody has obligation to use a tool that thinks it is talking to an American.

Then add top-level instructions saying what country you're from, what country you live in now, and which language you speak. This isn't that hard.
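
A rough sketch of what such top-level instructions could look like, assuming the tool accepts a plain-text preamble (for example in a custom-instructions field). The function name `build_locale_instructions`, the example countries, and the exact wording are made up for illustration and are not any tool's actual API:

```python
def build_locale_instructions(origin_country: str,
                              residence_country: str,
                              language: str) -> str:
    """Return a plain-text preamble describing the user's locale.

    Hypothetical helper: the wording is an assumption, adjust to taste.
    """
    lines = [
        f"I am from {origin_country}.",
        f"I currently live in {residence_country}.",
        f"Reply in {language}.",
        # Ask for local facts by default instead of US ones.
        "When laws, emergency numbers, units, or institutions differ by "
        f"country, use those of {residence_country} unless I ask otherwise.",
    ]
    return "\n".join(lines)


# Example values are placeholders, not taken from the thread.
instructions = build_locale_instructions("Brazil", "Germany", "German")
print(instructions)
```

Whether a model actually follows a preamble like this consistently is a separate question.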

schubidubiduba 5 days ago

None of that even addresses the problem described, because none of the languages you mentioned would be French in the described example.