Comment by disgruntledphd2

6 months ago

> Except when they "extract" something that wasn't in the source. And now what, assuming you can even detect the tainted data at all?

I mean, this is much less common than people make it out to be. Assuming the relevant context is in the prompt, you can run a bunch of calls and take the majority vote. It's not trivial, but it's definitely doable.
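
For what it's worth, here's a minimal sketch of the voting loop I mean, assuming a hypothetical `call_model` wrapper around whatever API you're using:

```python
from collections import Counter

def call_model(prompt: str) -> str:
    # Hypothetical wrapper around whatever LLM API you're using;
    # swap in your real client call here.
    raise NotImplementedError

def majority_vote(prompt: str, n_calls: int = 5) -> str:
    # Run the same extraction prompt several times and return the
    # answer that appears most often across the samples.
    answers = [call_model(prompt).strip() for _ in range(n_calls)]
    winner, _count = Counter(answers).most_common(1)[0]
    return winner
```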

I really don’t think that’s doable, because why would you assume the majority output is correct? It’s just as likely to be a hallucination.

The problem is that the system has no concept of correctness or world model.

  • Assuming that hallucinations are relatively random, that's true (see the sketch below). I do believe they happen less often when you feed the model decent context, though.
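
To make that independence assumption concrete: if each call is right with probability p and the wrong answers scatter rather than repeat, a quick binomial sketch (illustrative numbers only) shows how fast the majority converges:

```python
from math import comb

def majority_correct_prob(p: float, n: int) -> float:
    # Probability that a strict majority of n independent calls
    # is correct, given per-call accuracy p (binomial tail sum).
    return sum(comb(n, k) * p**k * (1 - p)**(n - k)
               for k in range(n // 2 + 1, n + 1))

# With p = 0.8 the majority is right ~94% of the time at n = 5
# and ~99.6% at n = 15, but only if the errors are independent.
for n in (1, 5, 15):
    print(n, round(majority_correct_prob(0.8, n), 4))
```

If the errors are correlated (the same misleading context fools every call the same way), the votes stop being independent and this math no longer helps.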

I mean, it's obvious to a human inspecting one specific input/output pair, but how do you do this at scale? (Spoiler: cross your fingers and hope, that's how.)