Comment by observationist
7 months ago
Similar things have happened to OpenAI and Claude - context gets leaked in from somewhere it's not supposed to. In this case, the white refugee story is trending; it's likely context is leaking in from Grok checking the user's feed and similar sources.
Or you can pretend Elon Musk is a cartoon villain, whatever floats your boat.
This very specific context? Multiple Grok replies suggest that it's being prompted with a particular image: https://x.com/grok/status/1922671571665310162
Edit: And since that reply is in the same thread, here's an example of it happening in a completely different one. It's not difficult to find these. https://x.com/grok/status/1922682536762958026
Yeah it really looks like someone added something about South Africa to the system prompt. Just scroll through its latest replies until you see one with an unprompted South Africa discussion, it won't take long: https://xcancel.com/grok/with_replies
> Or you can pretend Elon Musk is a cartoon villain
What do you think villains do in cartoons
Did we read the same thing? This seemed reminiscent of https://www.anthropic.com/news/golden-gate-claude, which was an experiment done on purpose.
Nah, not cartoon.
In addition, the reply doesn't even support Elon Musk's position. This is either a bug, a response to a deleted tweet, or something else.
Except that it will prompt a lot of people to look up that "Kill the Boer" song and search for "south africa white genocide".
Pretty sure most people won't come out of that with a particularly nuanced view of the situation in South Africa.
Good manipulation is subtle.
We must have different definitions of subtle.
Excuse me, are you suggesting that any amount of "nuance" could make these replies acceptable? Or that people "finding out" about it is a bad thing?
It doesn't support Musk's position because Grok is smart enough to know when its system prompt is obvious bullshit.
Elon Musk pretty much is a cartoon villain, and refugees are an important topic, but I think that’s almost irrelevant when considering the question at hand, which is whether or not the output from Grok is biased and inflammatory. I believe it is, but endless speculation about why is probably not a good idea when we’re talking about a literal nonsense generator. Nobody fucking understands why LLMs do half the things they do.
I think no matter the cause, users should demand better quality and/or switch to a different model. Or, you know, stop trusting a magical black box to think for them.