Comment by root_axis

18 hours ago

> I think the solution in harder-to-verify cases is to provide AI (sub-)agents a really good set of instructions on a detailed set of guidelines of what it should do and in what ways it should think and explore and break down problems

Using LLMs to validate LLMs isn't a solution to this problem. If the system can't self-verify, there's no signal to tell the LLM that it's wrong. The LLM is fundamentally unreliable; that's why you need a self-verifying system to guide and constrain the token generation.
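
To make the distinction concrete, here's a minimal sketch of what such a loop could look like, assuming a code-generation task: the signal comes from deterministically executing the model's output against fixed test cases, not from a second LLM's judgment. Every name here (`call_llm`, the expected `add` function, the test values) is hypothetical, and `call_llm` is just a placeholder for whatever completion API you use.

```python
def call_llm(prompt: str) -> str:
    """Placeholder for an LLM completion call (hypothetical)."""
    raise NotImplementedError

def verify(candidate: str) -> bool:
    """Deterministic check: does the generated code pass fixed test cases?
    This is the external signal the LLM itself cannot provide."""
    namespace: dict = {}
    try:
        exec(candidate, namespace)   # run the generated definition
        f = namespace["add"]         # expect a function named `add`
        return f(2, 3) == 5 and f(-1, 1) == 0
    except Exception:
        return False

def generate_until_verified(prompt: str, max_attempts: int = 5) -> str | None:
    """Generation is constrained by the verifier: unverified output is
    rejected and retried, so reliability comes from the checker, not the model."""
    for _ in range(max_attempts):
        candidate = call_llm(prompt)
        if verify(candidate):
            return candidate
    return None
```

The point is that `verify` closes the loop with ground truth the model can't talk its way around; replacing it with another `call_llm` judgment would remove the external signal entirely, which is exactly the failure mode described above.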