Comment by nl

3 days ago

> what Alibaba spends their time doing?

Not most of the time (pre-training takes a long time), but post-training is where most of the value is, yes.

Famously, it is all that OpenAI did between GPT 4o and GPT 5.3 (or 5.2?) - they didn't manage to complete a pre-training run[1], and all their progress came from post-training (!)

Post-training is what Cursor spends their time doing, and that has built a model that is competitive with the best coding models out there.

It isn't limited at all.

If you want to complain about something not being open source, complain about the lack of good open source RL environments (Prime Intellect excepted).

[1] https://newsletter.semianalysis.com/p/tpuv7-google-takes-a-s...

> It isn't limited at all.

Your very message is already showing that indeed it is limited, so dunno where you get that "that's where most of the value is". It is definitely not, and your very own link is showing that the limit is there. This is not to say that there is _no_ value whatsoever, but that the value is negligible compared to what someone with _the real source_ could do. See the Rio model for another example.

> If you want to complain about something not being open source, complain about the lack of good open source RL environments (Prime Intellect excepted).

This was implicitly included when I emphasized "AND the software used to build the model", which I did for a reason.

  • > See the Rio model for another example

    The Rio model (assuming you mean https://huggingface.co/prefeitura-rio/Rio-3.5-Open-397B) is a merge of Nex-N2_pro and Qwen: https://github.com/nex-agi/Nex-N2/issues/4

    Neither of these ship training data.

    > This is implicitly included when I was emphasizing "AND the software used to build the model", which I did for a reason.

    If that's what you meant then sure, I agree.

    Most providers do this though (at least to some extent) - the problem is that they can't, e.g., ship Excel in their RL environment, and AFAIK there aren't any with alternatives.