Comment by astrange
1 month ago
Note that RLHF can only perform selection on existing model outputs; adding new data is SFT or else just more pretraining.
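A toy sketch of the selection point, not how any lab actually trains. It uses the standard closed-form optimum of a KL-regularized reward objective, pi*(y) ∝ pi_ref(y) * exp(r(y) / beta), over a made-up three-response "model". The response names, probabilities, rewards, and the function name `kl_regularized_optimum` are all hypothetical. Real models put some nonzero probability on almost everything, so in practice "selection only" is approximate rather than exact.

```python
import math

def kl_regularized_optimum(ref_probs, rewards, beta=1.0):
    """Reweight a reference distribution by exp(reward / beta) and renormalize.

    This is the closed-form maximizer of E[r(y)] - beta * KL(pi || pi_ref).
    Anything with zero reference probability stays at zero, whatever its reward.
    """
    weights = {y: p * math.exp(rewards[y] / beta) for y, p in ref_probs.items()}
    total = sum(weights.values())
    return {y: w / total for y, w in weights.items()}

# Hypothetical reference model: it never produces "unseen_answer".
ref_probs = {"answer_a": 0.7, "answer_b": 0.3, "unseen_answer": 0.0}
# Hypothetical rewards: the unseen answer would score best if it ever appeared.
rewards = {"answer_a": 0.0, "answer_b": 2.0, "unseen_answer": 10.0}

tuned = kl_regularized_optimum(ref_probs, rewards, beta=1.0)
for response, prob in tuned.items():
    print(f"{response}: {ref_probs[response]:.2f} -> {prob:.2f}")
```

Mass moves from `answer_a` to `answer_b`, but `unseen_answer` stays at zero despite the highest reward. Getting it into the distribution takes new training data, i.e. SFT or more pretraining.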
ChatGPT speaking African English was mostly just 3.5. 4o speaks like a TikTok user from LA. 5 seems kind of generic.