throawayonthe 13 hours ago well there is https://artificialanalysis.ai/evaluations/omniscience
goldenarm 12 hours ago It's a gibberish input detection benchmark, and does not measure output hallucinations.