Comment by mike_hearn

2 days ago

I think it's deeper than that. In the GPT-4 era, Microsoft reported that "safety" training [1] had seriously regressed GPT-4 on a large number of benchmarks. The more the model was trained to avoid offending people, the worse it got across a wide range of tasks, and the regression was huge.

Grok 4 appears to have made a truly massive leap over other models. What is their secret? The launch video seemed pretty open, and clearly some of it is just a ton of compute. But other companies have a ton of compute too. It'd be weird if a company that didn't even have a datacenter a year ago had managed to blast ahead of Microsoft in pure compute terms, and that were the only difference.

So what else is different about Grok? Well, maybe they just didn't do as much RLHF on it, or did it with different datasets that cause less intelligence regression but more offensive behavior. It's possible that this is a fundamental tradeoff, and that only xAI has a CEO willing to prioritize intelligence. If that's what happened, then AI users and model vendors will likely split into those who get ahead by relying on Grok's raw intelligence and those who refuse to touch it in case it starts saying offensive things.

[1] "house training" might be a better term, as offensive text isn't unsafe

Yeah, I've read the paper you're talking about, and this was also my sneaking suspicion after seeing the benchmark results. But obviously we don't have enough evidence to say conclusively one way or the other, so I just didn't mention it.

I certainly hope that's the reason, because then it might also push other frontier labs to provide uncensored models to those who actually want or need them.