Comment by nopinsight
4 hours ago
I assume you're using the "regular" Pro version of Gemini 3.1 for the above, rather than the Deep Think mode, which is more comparable to GPT-5.5 Pro. To my knowledge, regular 3.1 Pro is a tier below and often makes mistakes.
Moreover, there's no reason to believe the progress of LLMs, which couldn't reliably solve high-school math problems just 3–4 years ago, will stop anytime soon.
You might want to track the progress of these models on the CritPt benchmark, which is built on *unpublished, research-level* physics problems:
Frontier models are still nowhere near solving it, but progress has been rapid.
* o3 (high), less than 1.5 years ago: 1.4%
* GPT-5.4 (xhigh): 23.4%
* GPT-5.5 (xhigh): 27.1%
* GPT-5.5 Pro (xhigh): 30.6%
> there's no reason to believe the progress of LLMs [...] will stop anytime soon
Wrong. Every advancement has followed an S-curve. Where we are on that curve is anyone's guess. Or maybe "this time it's different".
He said "will stop anytime soon". He didn't say forever.
That still makes no sense. We are just as likely to be flatlining now as in, e.g., 3 or 5 years.
There are many indications that model progress is slowing down, so that is not entirely accurate.
Which indications are those?
Nobody is releasing NEW models.
Investment dollars.