Comment by black_puppydog

5 days ago

I task GPT/Claude daily with researching very specific cultural or legal aspects of French politics. Even though French is a far more common language globally than Norwegian, these models still haven't figured out that, no matter which language I speak to them in (German or English, depending on my mood), their web searches need to be done in French to return reasonable results. I have to remind them every time lest they come back with "uh, didn't find anything relevant, here take some hallucinations instead."

So, given the Anglocentrism of current models, my confidence in American providers giving any shits about non-American users and use cases is pretty low. And the smaller the language community, the lower it gets.

I've noticed that it also imposes American moral judgements on certain things, even though it (sometimes) reasons in the native language.

I was trying to work out how and when to use swear words, and how strong each one is relative to the others. It translated English swear words into the target language, then lectured me on not using them.

It took a bunch of prodding for it to actually think in the target language and give the (mostly) correct response.

  • Would be curious about the model and the prompt for this.

    Not kidding at all. I had a similar issue with a project where I needed to classify images into specific demographics, and Gemini, while capable, flatly refused to do the task… until I left room in the JSON response format for it to tell me why this was not a good idea and why it was culturally insensitive. Then boom… full JSON array: hair color, eye color, skin color, fitness level, likely ethnicity, likely country of origin, and about 10 other values.

    You’re probably wondering what on earth I was working on. I was matching AI-generated headshots to AI voices so that the voice picker in an app had human (AI) faces.

Aren’t you already using English in the LLM convo? Telling the model to use French for research or to find resources in French seems like a reasonable step.

If you’re doing this on a daily basis, then you should have an AGENTS.md that accumulates directional instructions like this.

This is how you use the tool correctly.
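As a rough sketch of the kind of standing instruction meant here: the helper names `search_language_rule` and `render_agents_section`, the heading text, and the wording of the rule are all assumptions for illustration, not anything a particular tool requires. A small script could generate the lines you append to AGENTS.md:

```python
def search_language_rule(topic: str, search_language: str, reply_language: str) -> str:
    """Return one standing instruction line (hypothetical wording)."""
    return (
        f"- When researching {topic}, run web searches in {search_language} "
        f"and prefer {search_language}-language sources; "
        f"reply in {reply_language}."
    )


def render_agents_section(rules: list[str]) -> str:
    """Join rule lines under a heading; the heading text is an assumption."""
    return "\n".join(["## Research language", *rules]) + "\n"


section = render_agents_section([
    search_language_rule("French politics and law", "French", "English"),
])
print(section)
```

Whether a given model actually honors such a line on every search is not guaranteed; the point is only that the reminder gets written down once instead of being retyped in every conversation.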

There’s this weird pattern I’ve noticed where people expect LLMs to require zero effort or proficiency on their part, and when the LLM isn’t perfect without it, the verdict is "of course it wasn’t; LLMs suck."

  • The issue is that French, Italian, African, Japanese people shouldn't have the inconvenience of instructing the LLM tool to get the basic facts about their own culture. They should use an LLM that has already been trained like that by default. Nobody has obligation to use a tool that thinks it is talking to an American. If I go to Google for example I want to get facts about my own country in my own language.

    • Wouldn't those people be asking the questions in their own language in the first place? The model will reply in the language you use. This thread is about people asking for information about a language that is not the one they are messaging the LLM in.

    • > Nobody has obligation to use a tool that thinks it is talking to an American

      Very very emphatic agree from my end, thanks.

    • > Nobody has obligation to use a tool that thinks it is talking to an American.

      Then add top-level instructions saying what country you're from, what country you live in now, and which language you speak. This isn't that hard.

  • > Aren’t you already using English in the LLM convo? Telling the model to use French for research or to find resources in French seems like a reasonable step.

    Most ordinary people will just use their native language and they have no way of knowing that the model always reasons in English and therefore is strongly biased toward using English search terms. So they don't know they have to remind the model to search in their local language.

If you ask in French, it searches in French, right?

I have the opposite problem, where I'll ask in English, about something in a foreign country, the results it finds will all be in that foreign language, and the LLM will switch languages and respond in that language (which I don't speak).

So then I have to ask it "can you repeat that in English please."

I keep waiting for the new GPT-Definitely-AGI-For-Real-This-Time to fix it, but it's still there.

  • > If you ask in French, it searches in French, right?

    Not necessarily. I often prompt Claude in German and then see the reasoning happening in English. Of course it will eventually reply in German, but that does not mean that the tooling in the background was using German.

  • Same for me - I mostly ask stuff in English but sometimes add specific terms or names in Japanese as needed. My Japanese is intermediate, but it will often switch immediately and reply only and entirely in Japanese. I'm pretty sure they have a system prompt with hair triggers for foreign languages BECAUSE of the overrepresentation of English in the training corpora.

> their web searches need to be done in French to return reasonable results.

I wonder how much of this is also just the search engine's region setting.

It's a big problem I regularly have with Google. I almost always want English language, US-centric results, so I have my region set to the US. But occasionally I want results relevant to my actual country, and even searching in my native language usually yields much worse results than just opening an incognito tab and letting it default to my real location.

  • I gave up on Google's language and region settings a long time ago, years before giving up on Google as a product.

    To this day they still think I'm in Sweden sometimes, in Paris other times, or in Germany, while I haven't lived in any of those places for years.

Have you tried asking it to translate the prompt to French, and then feeding it the translated prompt?
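A minimal sketch of that two-step flow, assuming a hypothetical `complete(prompt) -> str` callable that wraps whatever model API you use. It is passed in as a parameter so that nothing about a real SDK is assumed, and `fake_complete` below is only a stand-in to show the call order:

```python
from typing import Callable


def translate_then_ask(prompt: str, target_language: str,
                       complete: Callable[[str], str]) -> str:
    """Translate the prompt first, then send the translation as a fresh prompt."""
    translated = complete(
        f"Translate the following into {target_language}. "
        f"Return only the translation.\n\n{prompt}"
    )
    # Second call: the model now sees a prompt written in the target language.
    return complete(translated.strip())


# Stand-in for a real model call, used only to demonstrate the flow.
def fake_complete(prompt: str) -> str:
    if prompt.startswith("Translate the following into French"):
        return "Quelles sont les règles du cumul des mandats ?"
    return f"[answer to: {prompt}]"


answer = translate_then_ask(
    "What are the rules on holding multiple offices?", "French", fake_complete
)
print(answer)
```

Note that the answer will most likely come back in the target language too, so a third step translating it back may be needed if you don't read that language.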

I have the opposite problem. I often have to ask ChatGPT about things related to Norway and I have to constantly correct it when it keeps switching to responding in Norwegian no matter how many times I tell it to only answer in Norwegian when I request it.