Comment by joquarky

7 days ago

You seem to have a lot of theoretical knowledge on this, but have you tried Claude or codex in the past month or two?

Hands-on experience is better than reading articles.

I've been coding for 40 years, and after a few months of getting familiar with these tools, this feels really big. Like how the internet felt in 1994.

I've been developing an AI coding harness (https://github.com/dlants/magenta.nvim) for over a year now, and I use it (along with cursor and claude code) daily at work.

Fun observation - almost every coding harness (claude code, cursor, codex) uses a find/replace tool as the primary way of interacting with code. This requires the agent to fully type out the code it's trying to edit, including several lines of context around the edit. This is really inefficient, token-wise! Why does it work this way? Because LLMs are really bad at counting lines, or at using other ways of describing a unique location in the file.
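To make the find/replace mechanism concrete, here is a minimal sketch of how such an edit tool might behave. This is not the implementation of any particular harness; the function name `apply_search_replace` and the "must match exactly once" rule are assumptions for illustration. It shows why the agent ends up typing out surrounding context: a short snippet is often ambiguous, so it has to be extended until it is unique.

```python
def apply_search_replace(source: str, find: str, replace: str) -> str:
    """Hypothetical edit tool: replace `find` only if it occurs exactly once."""
    if not find:
        raise ValueError("find text must not be empty")
    count = source.count(find)  # non-overlapping occurrences
    if count == 0:
        raise ValueError("find text not present in file")
    if count > 1:
        # The agent has to retry with more surrounding context (more tokens).
        raise ValueError(f"find text is ambiguous ({count} matches)")
    return source.replace(find, replace, 1)


source = "x = 1\nprint(x)\ny = 1\nprint(y)\n"

# "= 1" alone matches twice, so the agent must include enough context
# ("y = 1\n") to pin down a single location.
edited = apply_search_replace(source, "y = 1\n", "y = 2\n")
```

The cost is visible in the call itself: the old text has to be reproduced verbatim, in addition to the new text.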

I've experimented with providing a more robust DSL for text manipulation (https://github.com/dlants/magenta.nvim/blob/main/node/tools/...), and I do think it's an improvement over plain search/replace, but the agents still tend to struggle a lot (editing the wrong line, messing up the selection state, etc.), which is probably why the major players haven't adopted something like this yet.

So I feel pretty confident in my assessment of where these models are at!

And also, I fully believe it's big. It's a huge deal! My work is unrecognizable from what it was even 2 years ago. But that's an impact / productivity argument, not an argument about intelligence. Modern programming languages, IDEs, spreadsheets, etc. also fundamentally changed what being a software engineer was like, but they were not generally intelligent.

  • > Fun observation - almost every coding harness (claude code, cursor, codex) uses a find/replace tool as the primary way of interacting with code. This requires the agent to fully type out the code it's trying to edit, including several lines of context around the edit. This is really inefficient, token-wise! Why does it work this way? Because LLMs are really bad at counting lines, or at using other ways of describing a unique location in the file.

    Incidentally, I saw an interesting article on exactly this subject a little while back. It uses line numbers + hashes instead of typing out the full search/replace, writing patches, or using a DSL, and it seemed to work really well:

    https://blog.can.ac/2026/02/12/the-harness-problem/
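
For readers who want a feel for the "line numbers + hashes" idea, here is a rough sketch. It does not reproduce the linked article's actual format; the 4-character SHA-1 tag, the `N:tag|` rendering, and the function names `line_tag`, `render_with_tags`, and `replace_line` are all assumptions made for illustration. The point is that the agent names a line by number plus a short content hash, and the harness refuses the edit if the hash no longer matches.

```python
import hashlib


def line_tag(text: str) -> str:
    """Short content hash for one line (4 hex chars; length is an assumption)."""
    return hashlib.sha1(text.encode("utf-8")).hexdigest()[:4]


def render_with_tags(source: str) -> str:
    """Show the file to the agent with a 'number:tag|' prefix on every line."""
    return "\n".join(
        f"{i}:{line_tag(line)}|{line}"
        for i, line in enumerate(source.splitlines(), start=1)
    )


def replace_line(source: str, line_no: int, expected_tag: str, new_text: str) -> str:
    """Replace one line, but only if its current content still matches the tag."""
    lines = source.splitlines()
    if not 1 <= line_no <= len(lines):
        raise ValueError(f"line {line_no} out of range")
    if line_tag(lines[line_no - 1]) != expected_tag:
        # Wrong line number or stale view of the file: reject instead of guessing.
        raise ValueError(f"hash mismatch at line {line_no}")
    lines[line_no - 1] = new_text
    trailing = "\n" if source.endswith("\n") else ""
    return "\n".join(lines) + trailing


source = "a = 1\nb = 2\n"
tag = line_tag("b = 2")  # the agent would copy this from the tagged view
edited = replace_line(source, 2, tag, "b = 3")
```

Compared with search/replace, the agent only emits a line number, a few hash characters, and the new text, and a miscounted line shows up as a hash mismatch rather than a silent edit to the wrong place.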