Comment by djfergus
3 days ago
Reminds me of the terminus agent/harness on the terminal-bench coding benchmark - they just send send keystrokes to a tmux session. They score pretty well.
3 days ago
Reminds me of the terminus agent/harness on the terminal-bench coding benchmark - they just send send keystrokes to a tmux session. They score pretty well.
No comments yet
Contribute on Hacker News ↗