Comment by Palmik
9 hours ago
All evals on Terminal Bench require some harness. :) Or "Agent", as Terminal Bench calls it. Presumably the Gemini 3 are using Gemini CLI.
9 hours ago
All evals on Terminal Bench require some harness. :) Or "Agent", as Terminal Bench calls it. Presumably the Gemini 3 are using Gemini CLI.
No comments yet
Contribute on Hacker News ↗