Comment by CAP_NET_ADMIN

13 hours ago

1. Best harness? It ranks the worst with Opus in terminalbench: https://www.tbench.ai/leaderboard/terminal-bench/2.0?models=...

2. Mixed for the entire bun ecosystem, especially with the Rust, Anthropic-focused rewrite

3. Good, because Anthropic's SDK was one of the worst ones to use.

4. A deal with the guy who has a shit ton of compute sitting around wasting money because no one uses Grok, and who frequently called Anthropic "Misanthropic".

https://i.redd.it/kp4uy1egspjg1.png

5. Glorified marketer whose greatest achievement in pushing AI forward was probably teaching CS 231n and coining the term "vibe coding".

Yeah, on a roll.


porphyra 13 hours ago

You're not entirely wrong but your snide tone is annoying and unsuitable for this platform. Anyway,

1. Claude Code is widely used and beloved despite not benchmaxxing on Terminal-Bench like these harnesses that nobody has ever heard of or uses.

5. Karpathy's contributions are way more than CS 231n and coining "vibe coding". In terms of pedagogy, his "zero to hero" videos, nanoGPT, etc., are all great. For actual work, he also built a great org at Tesla.

sunaookami 12 hours ago

NTA but Claude Code is anything but beloved. It's incredibly meh, very buggy (to the extent that customers were literally losing money), heavily vibecoded and all around just... bad. I appreciate it for kickstarting the whole terminal agent thing and I would still use it, but only because Anthropic mandates it for using Claude with your subscription.
- CAP_NET_ADMIN 11 hours ago
  
  Yep, sadly that's the case. I've been using Codex CLI a lot lately and it somehow feels more... refined. Gemini is just tragic.
  

jim33442 8 hours ago

I've never heard of any of those other harnesses