Comment by littlestymaar

4 days ago

ChatGPT was released two and a half years ago though. Pretty sure that at some point Sam Altman had promised us AGI by now.

The person you're responding to is correct that OpenAI feels a lot more stagnant than other players (like Google, which was nowhere to be seen even one year and a half ago and now has the leading model on pretty much every metric, but also DeepSeek, who built a competitive model in a year that runs for much cheaper).

Google has the leading model on pretty much every metric

Correction: Google had the leading model for three weeks. Today it’s back to the second place.

  • press X to doubt

    o3-mini wasn't even the second place for non-STEM tasks, and in today's announcement they don't even publish benchmarks for those. What's impressive about Gemini 2.5 pro (and was also really impressive with R1) is how good the model is for a very broad range of tasks, not just benchmaxing on AIME.

    • I had a philosophical discussion with o3 model earlier today. It was much better than 2.5 pro. In fact it was pretty much what I would expect from a professional philosopher.

      4 replies →