Comment by danielhanchen
17 hours ago
Yep, so GRPO is much more memory efficient than PPO, but other RL-type algorithms can work fine as well!
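For context on the memory point: as GRPO is usually described, it drops the separate value (critic) model that PPO trains alongside the policy, and instead scores each sampled completion relative to the other completions for the same prompt. A minimal sketch of that group-relative advantage is below. The function name, the `eps` term, and the use of population standard deviation are assumptions made for illustration, not any particular library's implementation.

```python
from statistics import mean, pstdev


def group_relative_advantages(rewards, eps=1e-8):
    """Normalize each reward against the mean and std of its own group.

    `rewards` holds one scalar reward per completion sampled for a single
    prompt. No learned value model is involved; the group mean acts as the
    baseline. (Hypothetical helper; `eps` guards against a zero std.)
    """
    mu = mean(rewards)
    sigma = pstdev(rewards)  # assumption: population std; implementations vary
    return [(r - mu) / (sigma + eps) for r in rewards]


# Hypothetical rewards for four completions sampled from one prompt.
advantages = group_relative_advantages([1.0, 0.0, 0.0, 1.0])
```

Completions that beat their group's average get a positive advantage and the rest get a negative one, so the baseline comes from the samples themselves rather than from a second network held in memory.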