Comment by simoncion
6 hours ago
> If it bullshits so much, you wouldn't have a problem giving me an example of it bullshitting on ChatGPT (paid version)?
There's an entire paragraph in the essay about apyhr's direct experience with ChatGPT failures and sustained bullshitting that we'd never expect from a moderately-skilled human who possesses at least two functioning braincells. That paragraph begins "I have recently argued for forty-five minutes with ChatGPT". Do notice that there are six sentences in the paragraph. I encourage you to read all of them (make sure to check out the footnote... it's pretty good).
The exact text of the ChatGPT session is irrelevant; even if you reported that you were unable to reproduce the issue, it would only reinforce one of the underlying points -namely- that these systems are unreliable. aphyr has a pretty extensive body of published work that indicates that he'd not likely fabricate a story of an LLM repeatedly failing to accomplish a task that any moderately-skilled human could accomplish when equipped with the proper tools. So, I believe that his report is true and accurate.
There's also this seven-week-old example [0] (linked in the essay) of ChatGPT very confidently recommending a asinine course of action because it was unable to understand what the hell it was being told.
Listening to the audio is not required, as there's a reasonably accurate on-screen transcript, but it is valuable to listen to just how very hard they've worked to make this tool sound both confident and capable, even in situations where it's soul-crushingly incorrect. Those of us who have worked in Blasted Corporate Hellscapes may recognize how this manner of speaking can be very, very compelling to a certain sort of person (who -as it turns out- is frequently found in a management position).
[0] <https://www.instagram.com/reel/DUylL79kvub/>
This is classic case of not using the proper version. Use the thinking version gpt5.4 (text) and tell me if it bullshits.
Surely you must be able to find at least one example no?
To be clear, is your assertion that apyhr was also not using the proper version? If that is your assertion, do tell me how you've come by that information.
(You did notice that the author of the essay and the author of the video I linked to are not the same person, and that neither of them share a nym with me, yes?)
16 replies →