Comment by danoandco
6 hours ago
Similar but reusing lab-native CLIs like Claude Code or Codex, which they perform RL on. And so in the long-run, we believe this approach wins over custom harnesses.
6 hours ago
Similar but reusing lab-native CLIs like Claude Code or Codex, which they perform RL on. And so in the long-run, we believe this approach wins over custom harnesses.
No comments yet
Contribute on Hacker News ↗