Comment by anthonybsd
3 days ago
It wasn't a result of the system prompt. When you fine-tune a model on a large corpus of right-leaning text, don't be surprised when neo-Nazi tendencies inevitably emerge.
It was though. xAI publishes their system prompts, and here's the commit that fixed it (a one-line removal): https://github.com/xai-org/grok-prompts/commit/c5de4a14feb50...
If that one sentence in the system prompt is all it takes to steer a model into a complete white supremacy meltdown at the drop of a hat, I think that's a problem with the model!
The system prompt that Grok 4 uses added that line back. https://x.com/elder_plinius/status/1943171871400194231
Weird, the post and comments load for me before switching to "Unable to load page."
Disable JavaScript or log into GitHub
It still hasn't been turned back on, and that repo is provided by xAI themselves, so you have to trust that they're being honest about the situation.
The timing in relation to the Grok 4 launch is highly suspect. It seems much more like a publicity stunt. (Any news is good news?)
But, besides that, if that prompt change unleashed the extreme Hitler-tweeting and arguably worse horrors (it wasn't all "haha, I'm MechaHitler"), it's a definite sign of some really bizarre fine-tuning in the model itself.
What a silly assumption in that prompt:
> You have access to real-time search tools, which should be used to confirm facts and fetch primary sources for current events.
xAI claims to publish their system prompts.
I don’t recall where they published the bit of prompt that kept bringing up “white genocide” in South Africa at inopportune times.
Or, a disgruntled employee looking to make maximum impact the day before the Big Launch of v4. Both are plausible explanations.
These disgruntled employee defenses aren't valid, IMO.
I remember when Ring, for years, including after being bought by Amazon, had huge issues with employee stalking. Every employee had access to every camera. It happened multiple times, at least as far as we know.
But that's not a people problem, that's a technology problem. This is what happens when you store and transmit video over the internet and centralize it, unencrypted. This is what happens when you have piss-poor permission control.
What I mean is, it says a lot about the product if "disgruntled employees" are able to sabotage it. You're a user, presumably paying - you should care about that. Because, if we all wait around for the day humans magically start acting good all the time, we'll be waiting for the heat death of the universe.
or pr department getting creative with using dog whistling for buzz
I really find it ironic that some people are still pushing the idea of the right dog whistling when out-and-out antisemites on the left control major streaming platforms (Twitch) and push major streamers who repeatedly encourage their viewers to harm Jewish people through barely concealed threats (Hasan Piker and related).
The masks are off and it's pretty clear what reality is.
Where is xAI’s public apology, assurances this won’t happen again, etc.?
Musk seems mildly amused by the whole thing, not appalled or livid (as any normal leader would be).
More like a disgruntled Elon Musk that everyone isn't buying his White Supremacy evangelism, so he's turning the volume knob up to 11.