Comment by simianwords

8 hours ago

Let's stick to my challenge please: thinking version, find bullshit. If you can't, that's OK. Do you then accept that, under those constraints, the thinking version doesn't produce bullshit?

13 comments

simoncion 7 hours ago

Given aphyr's vocation (and how very lucrative it is), and how years and years of his writing indicates that he's very devoted to getting a correct and complete answer when investigating a question, I find it hard to believe that he's not using a paid version of the LLMs. If I knew him, I'd ask and verify, but I don't, so I won't.

> Let's stick to my challenge please...

I did. Your challenge was literally:

  If it bullshits so much, you wouldn't have a problem giving me an example of it bullshitting on ChatGPT (paid version)? Lets take any example of a text prompt fitting a few pages - it may be a question in science or math or any domain. Can you get it to bullshit?

father_phi's two-sentence question about whether one can use a cup that's closed at the top and open at the bottom definitely counts. Given what I've mentioned about aphyr above, I expect he has already run your challenge on the fanciest available version and reported the results in the essay under discussion.

simianwords 7 hours ago
> Use the thinking version gpt5.4 (text) and tell me if it bullshits
This was what I said. Text! Despite me specifically asking for text, you've shown a voice example. Not sure why?
I believe you and I agree that GPT 5.4 thinking, given a text prompt that fits in under 4 pages, never bullshits? Then we are good!
If we agree on this, I think the post doesn't capture this in spirit.
- simoncion 7 hours ago
  
  > This was what I said. Text!
  No, that's what you said after I provided an example of paid ChatGPT emitting complete bullshit from a two-sentence prompt.
  The challenge you issued is at [0].
  [0] <https://news.ycombinator.com/item?id=47692592>
  