Comment by gslepak

2 hours ago

> The non-hallucination rate in AA-omniscience is SOTA

Note that a perfect "non-hallucination rate" is rather meaningless, since such tests can themselves contain human hallucinations.

A perfect score only means the model aligns with the possibly-true, possibly-false beliefs of the group that made the test.

2 comments

gslepak

Here are some examples of the questions in the benchmark. If these are representative, they seem pretty cut and dried. https://artificialanalysis.ai/evaluations/omniscience#exampl...

Well, yes: garbage in, garbage out. That's a given, and it's not what's meant by "hallucination" in this context.