Comment by danielhanchen
1 day ago
Oh yep! The DeepSeek paper also mentioned that large enough LLMs inherently have responding capabilities, and that the goal of GRPO is to accentuate those latent skills!