Comment by lyu07282
6 days ago
What I was referring to is similar in concept, but I've seen both described in papers as distillation. What I meant was that you take the output of a large model like GPT-4 and use it as training data to fine-tune a smaller model.
Yes, that does sound very similar. To my knowledge, isn’t that (effectively) how the latest DeepSeek breakthroughs were made? (i.e. by leveraging ChatGPT outputs as a training signal for the likes of R1)
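
Not speaking to what DeepSeek actually did internally, but the "fine-tune a smaller model on a larger model's outputs" idea described above looks roughly like the sketch below. This is only a minimal illustration under stated assumptions: it uses Hugging Face Transformers, and the model names and prompt list are placeholders (in practice the teacher would usually be an API model like GPT-4 and the prompts would number in the thousands or millions).

```python
# Minimal sketch of sequence-level distillation: generate text with a large
# "teacher" model, then fine-tune a smaller "student" model on that text.
# Model names and prompts below are hypothetical placeholders.
import torch
from transformers import (AutoModelForCausalLM, AutoTokenizer,
                          Trainer, TrainingArguments)

teacher_name = "large-teacher-model"   # placeholder; stands in for e.g. GPT-4
student_name = "small-student-model"   # placeholder smaller model
prompts = [
    "Explain knowledge distillation in one sentence.",
    "Summarise the difference between fine-tuning and pre-training.",
]

# 1. Generate synthetic training data with the teacher.
teacher_tok = AutoTokenizer.from_pretrained(teacher_name)
teacher = AutoModelForCausalLM.from_pretrained(teacher_name)

synthetic_texts = []
for p in prompts:
    inputs = teacher_tok(p, return_tensors="pt")
    out = teacher.generate(**inputs, max_new_tokens=128)
    synthetic_texts.append(teacher_tok.decode(out[0], skip_special_tokens=True))

# 2. Fine-tune the student on the teacher's outputs (standard causal-LM loss).
student_tok = AutoTokenizer.from_pretrained(student_name)
if student_tok.pad_token is None:
    student_tok.pad_token = student_tok.eos_token  # some tokenizers lack a pad token
student = AutoModelForCausalLM.from_pretrained(student_name)

class TextDataset(torch.utils.data.Dataset):
    """Wraps the generated texts as tokenized examples with labels = inputs."""
    def __init__(self, texts, tokenizer):
        self.enc = tokenizer(texts, truncation=True, padding=True, return_tensors="pt")
    def __len__(self):
        return self.enc["input_ids"].size(0)
    def __getitem__(self, i):
        item = {k: v[i] for k, v in self.enc.items()}
        item["labels"] = item["input_ids"].clone()
        return item

trainer = Trainer(
    model=student,
    args=TrainingArguments(output_dir="student-distilled", num_train_epochs=1),
    train_dataset=TextDataset(synthetic_texts, student_tok),
)
trainer.train()
```

Note this trains on the teacher's sampled text (hard targets); the term "distillation" in the literature also covers training on the teacher's full output distribution (soft targets/logits), which isn't possible when the teacher is only reachable through an API.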