Comment by tptacek
13 hours ago
What would I do to demonstrate that they are bad at math? If by "maths" we mean things like working out a double integral for a joint probability problem, or anything simpler than that, GPT5 has been flawless.
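To make "working out a double integral for a joint probability problem" concrete: suppose X and Y have joint density f(x, y) = x + y on the unit square (a toy example of my own, not one from the thread). Then P(X + Y ≤ 1) = ∫₀¹ ∫₀^(1−x) (x + y) dy dx = 1/3, and a crude midpoint Riemann sum is enough to check a model's answer mechanically:

```python
# Midpoint Riemann sum for P(X + Y <= 1), where the joint density
# is f(x, y) = x + y on [0, 1] x [0, 1]. Exact answer: 1/3.

def p_sum_le_one(n: int = 1000) -> float:
    h = 1.0 / n
    total = 0.0
    for i in range(n):
        x = (i + 0.5) * h
        for j in range(n):
            y = (j + 0.5) * h
            if x + y <= 1.0:
                total += (x + y) * h * h
    return total

print(p_sum_le_one())  # ~0.3338, close to the exact 1/3
```

The discretization error at the diagonal boundary shrinks like 1/n, so a modest grid already pins the answer down well enough to grade a model's work.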
Search the topic. It is historically documented. It might no longer be true though.
A way to test might be running an open model locally, directly (without a harness), where you could be sure it's not going through a translation layer. These days the tool-call behavior may be built in, but back in the day it was treated more like a magic trick. Without it, the model fumbled simple math the same way it fumbled "how many r's are in strawberry".
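For what it's worth, both of those failure modes have trivial ground truths you can script, which is what makes the strawberry comparison apt (the multiplication below is an arbitrary example of mine, not a documented failure case):

```python
# Ground truth for two classic raw-model stumbles: character counting
# (largely a tokenization artifact) and multi-digit arithmetic.

def count_letter(word: str, letter: str) -> int:
    return word.lower().count(letter.lower())

print(count_letter("strawberry", "r"))  # prints 3
print(345678 * 987654)                  # exact product, no tool call involved
```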
It is wildly not true.
The request is for some reasonable math problem a model like GPT or Claude will fail at. I'm not going to set up a local model or some harness for it; I'm just going to copy/paste it into ChatGPT and watch it solve it.
Propose a problem, if you think I'm wrong about this. Seems simple.