Comment by danielhanchen
1 day ago
Yep, so GRPO is much more memory efficient than PPO, but other RL-type algorithms can work fine as well!
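To make the memory point a bit more concrete: the saving comes largely from GRPO not needing a separate value model (critic) the way PPO does. Instead, it samples a group of completions per prompt and uses the group's own reward statistics as the baseline. Here is a rough sketch of that group-relative advantage step. The function name, the `eps` constant, the use of the population standard deviation, and the reward values are all illustrative assumptions, not taken from any particular library.

```python
from statistics import mean, pstdev

def group_relative_advantages(rewards, eps=1e-8):
    """Score each completion relative to its own sampling group.

    No learned value network is involved: the group mean acts as the
    baseline, which is why no critic has to be held in memory.
    """
    mu = mean(rewards)
    sigma = pstdev(rewards)  # population std dev (an assumption of this sketch)
    return [(r - mu) / (sigma + eps) for r in rewards]

# Hypothetical rewards for 4 completions sampled from the same prompt.
rewards = [1.0, 0.0, 0.0, 1.0]
advantages = group_relative_advantages(rewards)
print(advantages)
```

Completions that beat the group average get a positive advantage and the rest get a negative one, so the policy update can proceed without a critic's value estimates.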