Comment by danoandco
9 hours ago
Similar but reusing lab-native CLIs like Claude Code or Codex, which they perform RL on. And so in the long-run, we believe this approach wins over custom harnesses.
9 hours ago
Similar but reusing lab-native CLIs like Claude Code or Codex, which they perform RL on. And so in the long-run, we believe this approach wins over custom harnesses.
No comments yet
Contribute on Hacker News ↗