Comment by nl
16 hours ago
Model distillation is very useful!
Put it like this: Reinforcement Learning from Human Feedback (RLHF) is useful with hundreds of examples, and LLM distillation is basically the same thing.
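For concreteness, classic knowledge distillation (in the Hinton sense) trains a student to match a teacher's softened output distribution by minimizing a KL-divergence loss. This is a minimal numpy sketch of that loss, not of any particular LLM pipeline; the function names and the temperature value are illustrative assumptions:

```python
import numpy as np

def softmax(logits, temperature=1.0):
    # Temperature > 1 "softens" the distribution, exposing the
    # teacher's relative preferences among non-top classes.
    z = logits / temperature
    z = z - z.max(axis=-1, keepdims=True)  # numerical stability
    e = np.exp(z)
    return e / e.sum(axis=-1, keepdims=True)

def distill_loss(student_logits, teacher_logits, temperature=2.0):
    # KL(teacher || student) over the softened distributions:
    # the core objective of knowledge distillation.
    p = softmax(teacher_logits, temperature)
    q = softmax(student_logits, temperature)
    return float(np.sum(p * (np.log(p) - np.log(q))))

teacher = np.array([2.0, 1.0, 0.1])
# A student that matches the teacher exactly incurs zero loss;
# a mismatched student incurs a positive loss.
print(distill_loss(teacher, teacher))
print(distill_loss(np.array([0.1, 1.0, 2.0]), teacher))
```

In LLM practice, "distillation" often means the simpler sequence-level variant (fine-tuning the student on teacher-generated outputs with an ordinary cross-entropy loss), which is where the analogy to supervised fine-tuning on a few hundred examples comes from.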