Comment by afavour
5 hours ago
While I agree with what you’re saying, the typical AI agent doesn’t say “I’m not totally sure about this, should I search the web?” It often just spits out a reply based on its knowledge.
That was true a year ago, but I don't think it's true today. I can't remember the last time I saw Claude or ChatGPT confidently answer a question that they should have searched for instead.
If you watch their reasoning traces, they often say things like "this is a well-known historical fact so I don't need to search for it", or more frequently they fire off a bunch of searches.
Anecdotally, it still happens a ton to me. They also still make super simple logic errors that they immediately reverse when pressed. For example, I asked Opus 4.7 last night how to cool off my room without making it too humid inside (indoor temp 78°F, humidity 45%; outdoor temp 64°F, humidity 99%). It suggested opening a window and assured me that the humidity would not rise above around 60%, which would still be comfortable. I asked it to justify that and it said:
>You're absolutely right about the humidity — I was sloppy with that aside. If you ventilate enough to meaningfully cool the room, you're replacing indoor air with outdoor air wholesale, and you'd converge on outdoor conditions: 64°F and near-100% RH. That's miserable. The 55-60% figure I tossed out was hand-wavy nonsense — it would only hold if you barely cracked the window and mixed a tiny fraction of outdoor air in. At any ventilation rate that actually cools, you're just moving outside air inside.
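For anyone who wants to sanity-check the numbers in that exchange, here is a rough sketch. It assumes the Magnus approximation for saturation vapor pressure (with the Alduchov and Eskridge coefficients), a complete swap of indoor air for outdoor air, and no other moisture sources or sinks in the room. The function names are made up for illustration, and this is a back-of-the-envelope estimate rather than a proper psychrometric calculation.

```python
import math


def f_to_c(temp_f):
    """Convert degrees Fahrenheit to degrees Celsius."""
    return (temp_f - 32.0) * 5.0 / 9.0


def saturation_vapor_pressure_hpa(temp_c):
    """Approximate saturation vapor pressure over liquid water, in hPa.

    Magnus-type formula; the coefficients are an assumption of this sketch.
    """
    return 6.1094 * math.exp(17.625 * temp_c / (temp_c + 243.04))


def vapor_pressure_hpa(temp_f, rh_percent):
    """Actual water vapor pressure of air at a given temperature and RH."""
    return saturation_vapor_pressure_hpa(f_to_c(temp_f)) * rh_percent / 100.0


def rh_at_temperature(vapor_hpa, temp_f):
    """Relative humidity (%) of air holding vapor_hpa of vapor at temp_f.

    Capped at 100% because any excess would condense.
    """
    saturation = saturation_vapor_pressure_hpa(f_to_c(temp_f))
    return min(100.0, 100.0 * vapor_hpa / saturation)


# Conditions quoted in the comment above.
indoor_vapor = vapor_pressure_hpa(78.0, 45.0)
outdoor_vapor = vapor_pressure_hpa(64.0, 99.0)

print(f"indoor vapor pressure:  {indoor_vapor:.1f} hPa")
print(f"outdoor vapor pressure: {outdoor_vapor:.1f} hPa")

# Assume the room ends up full of outdoor air, then ask what its relative
# humidity is at various final room temperatures.
for final_temp_f in (78, 74, 70, 66, 64):
    rh = rh_at_temperature(outdoor_vapor, final_temp_f)
    print(f"room at {final_temp_f}°F -> about {rh:.0f}% RH")
```

Under these assumptions the outdoor air carries more water vapor than the indoor air, so the result hinges on how far the room actually cools: roughly 60% RH if the incoming air is warmed back to 78°F, climbing to the outdoor 99% as the room approaches 64°F.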