← Back to context

Comment by hebejebelus

14 hours ago

Seems like a postprocess step on the initial output would fix that kind of thing - maybe a small 'thinking' step that transforms the initial output to match style.

Yeah, that's how it would be implemented after a filter fail, but it's important that the filter itself be separate from the agent, so it can be deterministic. Some problems, like "genuine," are so baked in to the models that they will persist even if instructed not to, so a dumb filter, a la a pre-commit hook, is the only way to stop it consistently.