Comment by logicprog

6 days ago

Is it me or is the rate of model releases accelerating to an absurd degree? Today we have Gemini 3 Deep Think and GPT 5.3 Codex Spark. Yesterday we had GLM5 and MiniMax M2.5. Five days before that we had Opus 4.6 and GPT 5.3. Then maybe two weeks before that, I think, we had Kimi K2.5.

79 comments

i5heu 6 days ago

I think it is because of the Chinese New Year. The Chinese labs like to publish their models around the Chinese New Year, and the US labs do not want to let a DeepSeek R1 (20 January 2025) impact event happen again, so I guess they publish models that are more capable than what they imagine Chinese labs are yet capable of producing.

woah 6 days ago
Singularity or just Chinese New Year?
- syndacks 5 days ago
  
  The Singularity will occur on a Tuesday, during Chinese New Year
kristopolous 5 days ago
I guess. DeepSeek V3 was released on Boxing Day, a month prior.
https://api-docs.deepseek.com/news/news1226
- littlestymaar 5 days ago
  
  And made almost zero impact. It was just a bigger version of DeepSeek V2 and went mostly unnoticed because its performance wasn't particularly notable, especially for its size.
  It was R1 with its RL training that made the news and crashed the stock market.
dboreham 5 days ago
Aren't we saying "lunar new year" now?
- Cyphase 5 days ago
  
  I don't think so; there are different lunar calendars.
  
r2vcap 5 days ago
[flagged]
- rfoo 5 days ago
  
  For another example, Singapore, one of the "many Asian countries" you mentioned, lists "Chinese New Year" as the official name on government websites. [0] Also note that neither California nor New York is located in Asia.
  And don't get me started with "Lunar New Year? What Lunar New Year? Islamic Lunar New Year? Jewish Lunar New Year? CHINESE Lunar New Year?".
  [0] https://www.mom.gov.sg/employment-practices/public-holidays
- janalsncm 5 days ago
  
  “Lunar New Year” is vague when referring to the holiday as observed by Chinese labs in China. Chinese people don’t call it Lunar New Year or Chinese New Year anyways. They call it Spring Festival (春节).
  As it turns out, people in China don’t name their holidays based off of what the laws of New York or California say.
- triceratops 5 days ago
  
  Please don't, because "Lunar New Year" is ambiguous. Many other Asian cultures also have traditional lunar calendars but a different New Year's Day. It's a bit presumptuous to claim that this is the sole "Lunar New Year" celebration.
  https://en.wikipedia.org/wiki/Indian_New_Year%27s_days#Calen...
  https://en.wikipedia.org/wiki/Islamic_New_Year
  https://en.wikipedia.org/wiki/Nowruz
- zzrush 5 days ago
  
  I didn't expect language policing to have reached such a level. This is specifically related to China and DeepSeek, who celebrate Chinese New Year. Do you demand that all Chinese people say "happy lunar new year" to each other?
- phainopepla2 5 days ago
  
  "Happy Holidays" comes to the diaspora
  
- jfengel 5 days ago
  
  "Lunar New Year" is perhaps over-general, since there are non-Asian lunar calendars, such as the Hebrew and Islamic calendars.
  That said, "Lunar New Year" is probably as good a compromise as any, since we have other names for the Hebrew and Islamic New Years.
  
- 0x3f 5 days ago
  
  But they're Chinese companies specifically, in this case
- saubeidl 5 days ago
  
  Where do all of those Asian countries have that tradition from?
  Have you ever had a Polish Sausage? Did it make you Polish?

aliston 6 days ago

I'm having trouble just keeping track of all these different types of models.

Is "Gemini 3 Deep Think" even technically a model? From what I've gathered, it is built on top of Gemini 3 Pro, and appears to be adding specific thinking capabilities, more akin to adding subagents than a truly new foundational model like Opus 4.6.

Also, I don't understand the comments about Google being behind in agentic workflows. I know that the typical use of, say, Claude Code feels agentic, but also a lot of folks are using separate agent harnesses like OpenClaw anyway. You could just as easily plug Gemini 3 Pro into OpenClaw as you can Opus, right?

Can someone help me understand these distinctions? Very confused, especially regarding the agent terminology. Much appreciated!

janalsncm 5 days ago
The term “model” is one of those super overloaded terms. Depending on the conversation it can mean:
- a product (most accurate here imo)
- a specific set of weights in a neural net
- a general architecture or family of architectures (BERT models)
So while you could argue this is a “model” in the broadest sense of the term, it’s probably more descriptive to call it a product. Similarly we call LLMs “language” models even if they can do a lot more than that, for example draw images.
- cubefox 5 days ago
  
  I'm pretty sure only the second is properly called a model, and "BERT models" are simply models with the BERT architecture.
  
logicprog 6 days ago

> Also, I don't understand the comments about Google being behind in agentic workflows.
It has to do with how the model is RL'd. It's not that Gemini can't be used with various agentic harnesses, like OpenCode or OpenClaw or theoretically even Claude Code. It's just that the model is trained less effectively to work with those harnesses, so it produces worse results.
re-thc 6 days ago

There are hints that this is a preview of Gemini 3.1.
manmal 4 days ago

I have no proof, but these deep thinking modes feel to me like an orchestrator agent + sub agents, the former being RL'd to just keep going instead of being conditioned to stop ASAP.

_heimdall 5 days ago

More focus has been put on post-training recently. Where a full model training run can take a month and often requires multiple tries because it can collapse and fail, post-training is done on the order of 5 or 6 days.

My assumption is that they're all either pretty happy with their base models or unwilling to do those larger runs, and post-training is turning out good results that they release quickly.

sanderjd 5 days ago

So, yes, for the past couple of weeks it has felt that way to me. But it seems to come in fits and starts. Maybe that will stop being the case, but that's how it's felt to me for a while.

bpodgursky 6 days ago

Anthropic took the day off to do a $30B raise at a $380B valuation.

IhateAI 6 days ago
Most ridiculous valuation in the history of markets. Can't wait to watch these companies crash and burn when people give up on the slot machine.
- andxor 6 days ago
  
  As usual don't take financial advice from HN folks!
  
- kgwgk 6 days ago
  
  WeWork almost IPO'd at $50bn. It was also a nice crash and burn.
- jascha_eng 6 days ago
  
  Why? They had a $10+ billion ARR run rate in 2025, tripled from 2024. I mean, 30x is a lot, but also not insane at that growth rate, right?
  

rogerkirkness 6 days ago

Fast takeoff.

brokencode 6 days ago

They are using the current models to help develop even smarter models. Each generation of model can help even more for the next generation.

I don’t think it’s hyperbolic to say that we may be only a single digit number of years away from the singularity.

lm28469 6 days ago
I must be holding these things wrong, because I'm not seeing any of these godlike superpowers everyone seems to enjoy.
- brokencode 6 days ago
  
  Who said they’re godlike today?
  And yes, you are probably using them wrong if you don’t find them useful or don’t see the rapid improvement.
  
mrandish 5 days ago

> using the current models to help develop even smarter models.
That statement is plausible. However, extrapolating that to assert all the very different things which must be true to enable any form of 'singularity' would be a profound category error. There are many ways in which your first two sentences can be entirely true, while your third sentence requires a bunch of fundamental and extraordinary things to be true for which there is currently zero evidence.
Things like LLMs improving themselves in meaningful and novel ways and then iterating that self-improvement over multiple unattended generations in exponential runaway positive feedback loops resulting in tangible, real-world utility. All the impressive and rapid achievements in LLMs to date can still be true while major elements required for Foom-ish exponential take-off are still missing.
sekai 6 days ago
> I don’t think it’s hyperbolic to say that we may be only a single digit number of years away from the singularity.
We're back to singularity hype, but let's be real: benchmark gains are meaningless in the real world when the primary focus has shifted to gaming the metrics
- brokencode 5 days ago
  
  Ok, here I am living in the real world finding these models have advanced incredibly over the past year for coding.
  Benchmaxxing exists, but that’s not the only data point. It’s pretty clear that models are improving quickly in many domains in real world usage.
  

redox99 6 days ago

There's more compute now than before.

killingtime74 5 days ago

They are spending literal trillions. It may even accelerate

bytesandbits 5 days ago

It's because of a chain of events.

Next week is Chinese New Year -> Chinese labs release all the models at once before it starts -> US labs respond with what they have already prepared.

Also note that even in US labs a large proportion of researchers and engineers are Chinese, and many celebrate Chinese New Year too.

TLDR: Chinese New Year. Happy Horse year everybody!