Comment by danielhanchen
7 months ago
Yep, so GRPO is much more memory-efficient than PPO, but other RL-type algorithms can work fine as well!
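
For context, a rough sketch of where the memory saving comes from: PPO usually trains a separate value model (critic) to estimate advantages, while GRPO scores each sampled completion against the other completions for the same prompt, so no critic has to be held in memory. The function name, the `eps` constant, the use of population standard deviation, and the reward values below are assumptions for illustration, not taken from any particular library.

```python
from statistics import mean, pstdev


def group_relative_advantages(rewards, eps=1e-6):
    """Normalize each reward against the mean and std of its own group.

    `rewards` holds one scalar reward per completion sampled for a single
    prompt. No learned value model is involved: the group itself is the
    baseline. `eps` (an assumed constant) guards against division by zero
    when every completion in the group gets the same reward.
    """
    mu = mean(rewards)
    sigma = pstdev(rewards)  # population std; an assumption for this sketch
    return [(r - mu) / (sigma + eps) for r in rewards]


# Hypothetical rewards for 4 completions sampled from one prompt.
rewards = [1.0, 0.0, 0.0, 1.0]
advantages = group_relative_advantages(rewards)
```

Completions that beat their group's average get a positive advantage and the rest get a negative one; these values then weight the policy-gradient update in place of the critic-based advantages PPO would use.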