Comment by mtrifonov

13 days ago

The interesting thing about agent tool use is how binary it is: the agent either calls the tool or it doesn't. The harder problem is social agency, where the AI has to decide whether to participate at all. We built a pre-filter for this (a cheap model reads the room before the expensive model runs), and the failure modes are fascinating. The model would reason correctly in its chain of thought ("this person is left hanging, I should respond") and then output the opposite boolean. It turned out Haiku has a systematic bias toward `false` on boolean tool outputs, so we had to invert the schema.
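
To make the inversion concrete, here's a minimal sketch of what flipping a boolean tool schema might look like. The tool name, field names, and descriptions are my own illustration, not the commenter's actual definitions; the idea is just that if the cheap model skews toward emitting `false`, you rephrase the question so `false` maps to the behavior you want by default, then flip it back in code.

```python
# Hypothetical original schema: the pre-filter model answers
# "should the agent respond?" directly. (Illustrative names only.)
ORIGINAL_SCHEMA = {
    "name": "decide_participation",
    "input_schema": {
        "type": "object",
        "properties": {
            "should_respond": {
                "type": "boolean",
                "description": "True if the agent should join the conversation.",
            }
        },
        "required": ["should_respond"],
    },
}

# Inverted schema: the question is negated, so a model biased toward
# `false` now lands on "don't stay silent" — i.e., respond.
INVERTED_SCHEMA = {
    "name": "decide_participation",
    "input_schema": {
        "type": "object",
        "properties": {
            "should_stay_silent": {
                "type": "boolean",
                "description": "True if the agent should NOT join the conversation.",
            }
        },
        "required": ["should_stay_silent"],
    },
}

def should_respond(tool_input: dict) -> bool:
    """Map the inverted tool output back to the original decision."""
    return not tool_input["should_stay_silent"]
```

The same trick applies anywhere a model shows a polarity bias on structured outputs: you can't always fix the model, but you can often pick the schema orientation whose failure mode is the one you'd rather live with.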