Comment by joshvm

1 day ago

I've found that for any sort of reasonable task, the free models are garbage and the low-tier paid models aren't much better. I'm not talking about coding, just general "help me" usage. It makes me very wary of using these models for anything that I don't fully understand, because I continually get easily falsifiable hallucinations.

Today I asked Gemini 3 to find me a power supply to a spec: AC/DC, +/- 15V at 3A. It did a good job of extracting specs from the PDF datasheets I provided, including looking up how the device's performance would degrade with a linear vs switch-mode PSU. But then it came back with two models from Traco that don't exist, complete with broken URLs to Mouser. It did suggest running two Mean Well power supplies in series (valid), but 2 of the 3 suggestions were BS. This sort of failure is particularly frustrating because the task should be easy and the outputs are very easy to check.

Perhaps this is where you need a second agent to verify and report back, so a human doesn't waste the time?
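
For what it's worth, the crudest version of that check doesn't even need a second model. A rough sketch in Python, assuming the model's answer has already been parsed into part number / URL pairs (the `Suggestion` type, `http_status` and `verify` are names I made up for illustration, not any real tool's API): anything whose product page doesn't resolve gets thrown out before a human looks at it.

```python
import urllib.error
import urllib.request
from dataclasses import dataclass
from typing import Callable, Iterable


@dataclass
class Suggestion:
    part_number: str  # part number as claimed by the model (unverified)
    url: str          # product page URL as claimed by the model (unverified)


def http_status(url: str, timeout: float = 10.0) -> int:
    """Return the HTTP status code for url, or 0 if the request fails outright."""
    req = urllib.request.Request(
        url, method="HEAD", headers={"User-Agent": "link-check/0.1"}
    )
    try:
        with urllib.request.urlopen(req, timeout=timeout) as resp:
            return resp.status
    except urllib.error.HTTPError as err:
        # 4xx/5xx responses are raised as exceptions; keep the code.
        return err.code
    except (OSError, ValueError):
        # DNS failure, timeout, refused connection, or a malformed URL.
        return 0


def verify(
    suggestions: Iterable[Suggestion],
    status_of: Callable[[str], int] = http_status,
) -> tuple[list[Suggestion], list[Suggestion]]:
    """Split suggestions into (plausible, rejected) by whether the URL resolves."""
    plausible: list[Suggestion] = []
    rejected: list[Suggestion] = []
    for s in suggestions:
        (plausible if status_of(s.url) == 200 else rejected).append(s)
    return plausible, rejected
```

A 200 only shows that a page exists at that URL, not that the part matches the spec, and some distributor sites reject scripted requests, so a real page can still land in the rejected pile. It would still have caught the broken links in my case.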