It's exhaust. Retrospectively, chat is essentially worthless. You're going to chase hallucinations down conversations that maybe didn't even impact code.
I have difficulty believing chat is worthless.
And I think that not everyone will entertain or chase the hallucinations down. Or maybe enough non-hallucinations are chased that it is valuable.
You seem to be thinking like it is 2020 and humans will be the ones reading the chat. It is just context-bloat for whatever agent ends up reading it.
The problem is “hallucination” might mean this:
You tell the agent “go do thing A,” the agent replies “sure thing buddy, I’ll do that,” noodles, then reports “I’ve done that thing!” MEANWHILE, in reality, the agent has done something totally different: maybe it did a subset, failed completely, or made an unrelated change.
Later, you find and FIX the problem, but the chat has no record of it because there is *genuinely no point* in telling an agent “you screwed that up” unless you want that agent to fix it.
Now that session has a completely fictitious story that seems to correspond with reality only because of out-of-band action. It’s worse than worthless!
Session chat has only a tenuous and poorly marked match to reality; there is no reason to preserve it.