Comment by yahoozoo

3 days ago

How are they doing this? Does it just make heavy use of web searches? A continuously updated RAG store? Why don’t other companies do it?

Nothing stops you from continuously training a foundation model and serving checkpoints, but historically there were weird cliffs and instabilities where more training would make things worse rather than better. The trick is to introduce more data into the pre-training mix and keep training in ways that don't cause the model to regress. Presumably they've figured that out.
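
To make the "mix in new data, don't regress" idea concrete, here's a rough sketch of what I mean. This is purely illustrative and not anything xAI has described: `sample_batch`, `should_promote`, `new_fraction` and `tolerance` are names I made up, and a real pipeline would be far more involved. The idea is just to blend fresh examples with the existing corpus, then only serve a new checkpoint if it doesn't drop on any benchmark you already track.

```python
import random


def sample_batch(old_data, new_data, new_fraction, batch_size, rng):
    """Draw a batch where roughly `new_fraction` of examples come from new data.

    Hypothetical helper: the rest is replayed from the existing corpus so the
    model keeps seeing the distribution it was originally trained on.
    """
    batch = []
    for _ in range(batch_size):
        source = new_data if rng.random() < new_fraction else old_data
        batch.append(rng.choice(source))
    return batch


def should_promote(baseline_scores, candidate_scores, tolerance=0.01):
    """Return True only if no tracked benchmark drops by more than `tolerance`.

    Hypothetical gate: scores are dicts of benchmark name -> higher-is-better
    metric, and every benchmark in the baseline must appear in the candidate.
    """
    return all(
        candidate_scores[name] >= baseline_scores[name] - tolerance
        for name in baseline_scores
    )


if __name__ == "__main__":
    rng = random.Random(0)
    batch = sample_batch(["old doc"], ["new doc"], new_fraction=0.2, batch_size=8, rng=rng)
    print(batch)
    print(should_promote({"qa": 0.80, "code": 0.50}, {"qa": 0.82, "code": 0.40}))
```

The gate is the important part: a checkpoint that improves on recent data but falls off a cliff elsewhere just never gets served.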

It's probably enabled by the huge datacenter xAI has. Most AI labs haven't built their own datacenter, and have to choose between doing experiments on new architectures, serving live traffic and doing more training on their existing models. Perhaps xAI can do all three simultaneously.