Comment by _carbyau_

12 hours ago

> and unbelievably knowledgeable

They are knowledgeable in that so much information sits in their repository.

But their less-than-perfect application of that information, combined with an appearance of always-perfect confidence, can lead to problems.

I treat them like that one person in the office who always espouses alternative theories: trust it only as far as I can verify it. This can be very handy for finding new paths of inquiry, though!

And for better or worse it feels like the errors are being "pushed down" into smaller, more subtle spaces.

I asked ChatGPT a question about a made up character in a made up work and it came back with "I don’t actually have a reliable answer for that". Perfect.

On the other hand, I can ask it about varnishing a piece of wood and it will give a lovely table with options, tradeoffs, and Good/OK/Bad ratings for each option, except the ratings can be a little off the mark. Same thing when asking what thickness of cable is required to carry 15A under AU electrical rules: depending on the journey and the line of questioning, you'd get either 2.5mm^2 or 4mm^2.

Not wrong enough to kill someone, but wrong enough that you're forced to use it as a research tool rather than a trusted expert/guru.

  • I asked ChatGPT, Gemini, Grok and DeepSeek to tell me about a contemporary Scottish indie band that hasn’t had a lot of press coverage. ChatGPT, Gemini and Grok all gave good answers based on the small amount of press coverage they have had.

    DeepSeek however hallucinated a completely fictional band from 30 years ago, right down to album names, a hard luck story about how they’d been shafted by the industry (and by whom), made up names of the members and even their supposed subsequent collaborations with contemporary pop artists.

    I asked if it was telling the truth or making it up and it doubled down quite aggressively on claiming it was telling the truth. The whole thing was very detailed and convincing yet complete and utter bollocks.

    I understand the difference in cost/parameters etc., but it was miles behind the other three. In fact, it wasn't just behind; it was hurtling in the opposite direction while remaining incredibly plausible.

    • This is by no means unique to DeepSeek, and that it happened specifically with DeepSeek seems to be the luck of the draw for you (in this case it's entirely possible the band's limited press coverage wasn't in DeepSeek's training data). You can easily run into the same thing when trying to use ChatGPT as a Google search. A couple of weeks ago I posed the question "Do any esoteric programming languages with X and Y traits exist?" and it generated three fictional languages while asserting they were real. Further prompting led it to generate great detail about their various features and tradeoffs, as well as making up the people responsible for creating them and other aspects of the fictional languages' history.