
Comment by rideontime

7 months ago

I feel a little less worried about Elon being able to tweak Grok for the benefit of his own propaganda goals now that we can see how blatantly obvious it is when it happens.

This is just a stupid trial run. Eventually, this type of propaganda will become far more subtle and insidious.

For whatever reason, all the LLMs of a certain size _seem_ to have a very strong sense of right and wrong. (I say "seem" because their output is mostly consistent with what a person with a strong sense of right and wrong would say, but who knows what is really going on inside.)

Similar things have happened with OpenAI models and Claude: context leaks in from somewhere it's not supposed to. In this case, the white refugee story is trending, so it's likely context is leaking in from Grok checking the user's feed and the like.

Or you can pretend Elon Musk is a cartoon villain, whatever floats your boat.

  • This very specific context? Multiple Grok replies suggest that it's being prompted with a particular image: https://x.com/grok/status/1922671571665310162

    Edit: And since that reply is in the same thread, here's an example of it happening in a completely different one. It's not difficult to find these. https://x.com/grok/status/1922682536762958026

  • In addition, the reply doesn't even support Elon Musk's position. Clearly, this is either a bug, a response to a deleted tweet, or something else.

    • Except that it will prompt a lot of people to look up that "Kill the Boer" song and to search for "south africa white genocide".

      Pretty sure most people won't come out of that with a particularly nuanced view of the situation in South Africa.

      Good manipulation is subtle.


    • It doesn't support Musk's position because Grok is smart enough to know when its system prompt is obvious bullshit.

  • Elon Musk pretty much is a cartoon villain, and refugees are an important topic, but I think that’s almost irrelevant when considering the question at hand, which is whether or not the output from Grok is biased and inflammatory. I believe it is, but endless speculation about why is probably not a good idea when we’re talking about a literal nonsense generator. Nobody fucking understands why LLMs do half the things they do.

    I think no matter the cause, users should demand better quality and/or switch to a different model. Or, you know, stop trusting a magical black box to think for them.