Comment by danielhanchen
10 months ago
Yep, GRPO is much more memory-efficient than PPO, but other RL-type algorithms can work fine as well!
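For context, here is a minimal sketch of where the memory saving comes from. PPO trains a separate value (critic) model to estimate advantages, while GRPO samples a group of completions for the same prompt and scores each one against the group's own reward statistics, so no critic has to be held in memory. The function name, the `eps` term, the use of population standard deviation, and the reward values are all assumptions for illustration, not taken from any library.

```python
from statistics import mean, pstdev


def group_relative_advantages(rewards, eps=1e-8):
    """Score each completion against its own group (hypothetical helper).

    No learned value network is involved: the baseline is just the
    mean reward of the group, scaled by the group's standard deviation.
    """
    mu = mean(rewards)
    sigma = pstdev(rewards)  # population std; an assumption of this sketch
    return [(r - mu) / (sigma + eps) for r in rewards]


# Made-up rewards for 4 completions sampled from the same prompt.
advantages = group_relative_advantages([1.0, 0.0, 0.0, 1.0])
```

Completions that beat the group average get a positive advantage and the rest get a negative one, which is the signal a critic would otherwise have to provide.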