Comment by danielhanchen
4 months ago
Yep, GRPO is much more memory efficient than PPO, but other RL-type algorithms can work fine as well!
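For anyone wondering where the memory saving comes from: as far as I understand it, PPO usually trains a separate value (critic) model to estimate advantages, while GRPO scores a group of sampled completions for the same prompt and uses the group's own statistics as the baseline. Here is a minimal sketch of just that advantage step. The function name, the `eps` value, and the choice of population standard deviation are my own assumptions for illustration, not taken from any particular library.

```python
import statistics

def group_relative_advantages(rewards, eps=1e-6):
    """Normalize each reward against its own group (hypothetical helper).

    rewards: scalar rewards for several completions of the SAME prompt.
    The group mean acts as the baseline, so no learned value model is used.
    """
    mean = statistics.fmean(rewards)
    # Assumption: population std; real implementations may use sample std.
    std = statistics.pstdev(rewards)
    # eps avoids division by zero when every completion gets the same reward.
    return [(r - mean) / (std + eps) for r in rewards]

# Four sampled completions for one prompt, scored 1.0 (correct) or 0.0 (wrong).
rewards = [1.0, 0.0, 0.0, 1.0]
advantages = group_relative_advantages(rewards)
```

Completions that beat the group average get a positive advantage and the rest get a negative one. The full algorithm also involves a clipped policy ratio and a KL term, which this sketch leaves out.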