
Comment by rideontime

7 months ago

I feel a little less worried about Elon being able to tweak Grok for the benefit of his own propaganda goals now that we can see how blatantly obvious it is when it happens.

This is just a stupid trial run. Eventually, this type of propaganda will become far more subtle and insidious.

For whatever reason, all the LLMs of a certain size _seem_ to have a very strong sense of right and wrong. (I say "seem" because their output is mostly consistent with what a person with a strong sense of right and wrong would say, but who knows what is really going on inside.)

Similar things have happened with OpenAI models and Claude: context leaks in from somewhere it's not supposed to. In this case, the white refugee story is trending, so it's likely context is leaking in from Grok checking the user's feed and the like.

Or you can pretend Elon Musk is a cartoon villain, whatever floats your boat.

  • This very specific context? Multiple Grok replies suggest that it's being prompted with a particular image: https://x.com/grok/status/1922671571665310162

    Edit: And since that reply is in the same thread, here's an example of it happening in a completely different one. It's not difficult to find these. https://x.com/grok/status/1922682536762958026

  • In addition, the reply doesn't even support Elon Musk's position. Clearly, this is either a bug, a response to a deleted tweet, or something else.

    • Except that it will prompt a lot of people to look up that "Kill the Boer" song and to search for "south africa white genocide".

      Pretty sure most people won't come out of that with a particularly nuanced view of the situation in South Africa.

      Good manipulation is subtle.


    • It doesn't support Musk's position because Grok is smart enough to know when its system prompt is obvious bullshit.

  • Elon Musk pretty much is a cartoon villain, and refugees are an important topic, but I think that’s almost irrelevant when considering the question at hand, which is whether or not the output from Grok is biased and inflammatory. I believe it is, but endless speculation about why is probably not a good idea when we’re talking about a literal nonsense generator. Nobody fucking understands why LLMs do half the things they do.

    I think no matter the cause, users should demand better quality and/or switch to a different model. Or, you know, stop trusting a magical black box to think for them.