Comment by jjani

3 days ago

Power user here; working with these models (the whole gamut) side by side on a wide range of tasks has been my daily work since they came out.

I can vouch that this is extremely characteristic of o3-mini compared to competing models (Claude, Gemini) and previous OA models (3.5, 4o).

Compared to those, o3-mini clearly has less of the "the user is always right" training. This is almost certainly intentional. At times, this can be useful - it's more willing to call you out when you're wrong, and less likely to agree with something just because you suggested it. But this excessive stubbornness is the great downside, and it's been so prevalent that I stopped using o3-mini.

I haven't had enough time with o3 yet, but if it is indeed an evolution of o3-mini, it comes as no surprise that it's very bad at this as well.

Yes! I always ask these models a simple question that none of them get right:

"List of mayors of my City X".

ALL of them get it wrong: hallucinated names, wrong dates, etc. The list is on Wikipedia, and they surely trained on that data, but they still can't answer it properly.

o3-mini? It just says it doesn't know lol

  • Yeah, that's the big upside for sure - it hallucinates less at baseline. But when it does hallucinate, it's very assertive in gaslighting you that its hallucination is in fact the truth; it can't "fix" its own errors. I've found this tradeoff not to be worth it for general use.

Sounds like we're getting closer and closer to an AI that acts like a human ;-)