Comment by golol
1 year ago
Because it's a straightforward stochastic sequence modelling task, and I've seen GPT-3.5-turbo-instruct play at a high amateur level myself. But it seems like all the RLHF and distillation done on newer models destroys that ability.