Comment by simianwords

11 hours ago

Let's stick to my challenge, please: thinking version, find bullshit. If you can't, that's OK. Do you accept, then, that under those constraints the thinking version doesn't produce bullshit?

13 comments


simoncion 11 hours ago

Given aphyr's vocation (and how very lucrative it is), and how years and years of his writing indicates that he's very devoted to getting a correct and complete answer when investigating a question, I find it hard to believe that he's not using a paid version of the LLMs. If I knew him, I'd ask and verify, but I don't, so I won't.

> Let's stick to my challenge please...

I did. Your challenge was literally:

  If it bullshits so much, you wouldn't have a problem giving me an example of it bullshitting on ChatGPT (paid version)? Let's take any example of a text prompt fitting a few pages - it may be a question in science or math or any domain. Can you get it to bullshit?

father_phi's two-sentence question about whether one can use a cup that's closed at the top and open at the bottom definitely counts. Given what I've mentioned about aphyr above, I expect he has already run your challenge on the fanciest-available version and reported on the results in the essay under discussion.

simianwords 11 hours ago

> Use the thinking version gpt5.4 (text) and tell me if it bullshits

This was what I said. Text! Despite me specifically asking for text, you've shown a voice example. Not sure why?

I believe you and I agree that GPT 5.4 thinking on text that fits < 4 pages never bullshits? Then we are good!

If we agree on this, I think the post doesn't capture this in spirit.

- simoncion 11 hours ago
  
  > This was what I said. Text!
  No, that's what you said after I provided an example of paid ChatGPT emitting complete bullshit from a two-sentence prompt.
  The challenge you issued is at [0].
  [0] <https://news.ycombinator.com/item?id=47692592>
  