Comment by reissbaker

1 month ago

The GPT-5 series is a new model, based on the o1/o3 series. It's very much inaccurate to say that it's a routing system and prompt chain built on top of 4o. 4o was not a reasoning model and reasoning prompts are very weak compared to actual RLVR training.

No one knows whether the base model has changed, but 4o was not a base model, and neither is 5.x. Although I would be kind of surprised if the base model hadn't also changed, FWIW: they've significantly advanced their synthetic data generation pipeline (as made obvious via their gpt-oss-120b release, which allegedly was entirely generated from their synthetic data pipelines), which is a little silly if they're not using it to augment pretraining/midtraining for the models they actually make money from. But either way, 5.x isn't just a prompt chain and routing on top of 4o.

2 comments

reissbaker

doug_durham 1 month ago

Prior to 5.2 you couldn’t expect to get good answers to questions prior to March 2024. It was arguing with me that Bruno Mars did not have two hit songs in the last year. It’s clear that in 2025 OpenAI used the old 4.0 base model and tried to supercharge it using RLVR. That had very mixed results.

brokencode 1 month ago

That just means their pretraining data set was older. You can train as many models as you want on the same data.
I’m sure all these AI labs have extensive data gathering, cleanup, and validation processes for new data they train the model on.
Or at least I hope they don’t just download the current state of the web on the day they need to start training the new model and cross their fingers.