Comment by qsort

2 months ago

Obviously modern harnesses have better features but I wouldn't say it invalidates the mental model. Simpler agents aren't that far behind in performance if the underlying model is the same, including very minimal ones with basic tools.

I'd say it's similar to how a "make your own relational DB" article might feature a basic B-tree with merge-joins. Yeah, obviously real engines have sophisticated planners, multiple join methods, bloom filters, etc., but the underlying mental model is still accurate.

11 comments

qsort

prodigycorp 2 months ago

You’re not wrong but I still think that the harness matters a lot when trying to accurately describe Claude Code.

Here’s a reframing:

If you asked people “what would you rather work with, today’s Claude Code harness with sonnet 3.7, or the 200 line agentic loop in the article with Opus 4.5, which would you choose?”

I suspect many people would choose 3.7 with the harness. Moreover, that is true, then I’d say the article is no longer useful for a modern understanding of Claude Code.

aszen 2 months ago
I don't think so, model improvements far outweigh any harness or tooling.
Look at https://github.com/SWE-agent/mini-swe-agent for proof
- prodigycorp 2 months ago
  
  Yes but people aren’t choosing CC because they are necessarily performance maximalists. They choose it because it has features that make it behave much more nicely as a pair programming assistant than mini-swe-agent.
  There’s a reason Cursor poached Boris Cherney and Cat Wu and Anthropic hired them back!
  
  2 replies →
rfw300 2 months ago
Any person who would choose 3.7 with a fancy harness has a very poor memory about how dramatically the model capabilities have improved between then and now.
- prodigycorp 2 months ago
  
  I’d be very interested in the performance of 3.7 decked out with web search, context7, a full suite of skills, and code quality hooks against opus 4.5 with none of those. I suspect it’s closer than you think!
  
  3 replies →
nl 2 months ago

This is SO wrong.
I actually wrote my own simple agent (with some twists) in part so I could compare models.
Opus 4.5 is in a completely different league to Sonnet 4.5, and 3.7 isn't even on the same planet.
I happily use my agent with Opus but there is no world in which I'd use a Sonnet 3.7 level model for anything beyond simple code completion.