Comment by simianwords

2 hours ago

This was my original point

>I don't think calling AI a bullshit machine is correct. In spirit.

That was always my goalpost, and I posed the challenge of getting it to bullshit to drive the point home. You yourself said it is trivial.

1. You came up with the horns question - I tried with the thinking model and it clearly understood that it was a joke and replied appropriately

2. You came up with the assembly question - I tried it again with the thinking model and it gave the right answer again

3. Now you gave up trying to make prompts by yourself because you realised that it's in fact not trivial

4. Then you started looking for benchmarks to show that it bullshits

5. You picked a benchmark that doesn't allow tools (which was not my constraint)

6. Then you picked a benchmark that does allow tools, and it turns out that it performs much better than humans

7. Upon hearing this, you shifted the goalposts to say that "models don't know how to say I don't know and I can teach models etc etc"

On the last part: There's a benchmark called SimpleQA which doesn't allow tools and allows for "I don't know" as an answer and GPT 5 still beats humans.

I think you should reconsider this: "I don't think calling AI a bullshit machine is correct".