Comment by fnord123
4 days ago
> the AA-Omniscience Hallucination Rate Benchmark which puts 3.0 Pro among the higher hallucinating models. 3.1 seems to be a noticeable improvement though.
As sibling comment says, AA-Omniscience Hallucination Rate Benchmark puts Gemini 3.0 as the best performing aside from Gemini 3.1 preview.
You are misreading the benchmark.
https://artificialanalysis.ai/#aa-omniscience-hallucination-...
If you look at the results 3.0 hallucinates an awful lot, when it's wrong.
It's just not wrong that often.
(And it looks like 3.1 does better on both fronts)