Comment by vanuatu

13 days ago

I don't think it's much of an issue

- RL envs + synthetic data + human-annotated data

- Usage data from Codex/Claude Code/Cursor

Most of a model's coding ability comes from post-training, not pretraining

A better question is what's left for those who don't have access to that. We went from publicly available data to data vacuumed up from private users