Comment by beering

6 days ago

What do you mean by “pure language model”? The reasoning step is still just the LLM spitting out tokens, and this was confirmed by DeepSeek replicating the o models. According to the OpenAI researchers, there isn’t a proof verifier or anything similar running alongside it.

If you mean pure as in there’s no additional training beyond the pretraining, I don’t think any model has been pure since GPT-3.5.

With local models you can get just the pretrained versions, with no RLHF. IIRC both Llama and Gemma make them available.