Comment by thenaturalist
1 day ago
While I tend to agree with your premise that the linked article seems to be extrapolating to the extreme from a very small code snippet, I think the core critique the author wants to make stands:
AI agents alone, unbounded, currently cannot provide huge value.
> try asking cursor for potential refactors or patterns that could be useful for a given text.
You, the developer, will be selecting this text.
> try giving them an abstract jira ticket or asking what it thinks about certain naming, with enough context
You still selected a JIRA ticket and provided context.
> ask any engineer that saves time with everything from test scaffolding to run-and-forget scripts.
Yes, that is true, but again, what you are offering as counterexamples are very bounded, i.e. easy, contexts.
In any case, the industry (the LLM providers as well as tooling builders and devs) is clearly moving in the direction of constantly etching out small improvements by refining which context is deemed relevant for a given problem and the most efficient ways to feed it to LLMs.
And let's not kid ourselves: Microsoft, OpenAI, hell, Anthropic all have 2027-2029 plans where these things will be significantly more powerful.
Here's an experience I've had with Claude Code several times:
1. I'll tell Claude Code to fix a bug.
2. Claude Code will fail, and after a few rounds of explaining the error and asking it to try again, I'll conclude the issue is outside the AI's ability to handle and resign myself to fixing it the old-fashioned way.
3. I'll start actually looking into the bug on my own, and develop a slightly deeper understanding of the problem on a technical level. I still don't understand every layer to the point where I could easily code a solution.
4. I'll once again ask Claude Code to fix the bug, this time including the little bit I learned in #3. Claude Code succeeds in one round.
I'd thought I'd discovered a limit to what the AI could do, but just the smallest bit of digging was enough to un-stick the AI, and I still didn't have to actually write the code myself.
(Note that I'm not a professional programmer and all of this is happening on hobby projects.)
> I'll once again ask Claude Code to fix the bug, this time including the little bit I learned in #3. Claude Code succeeds in one round.
Context is king, which makes sense since LLM output is based on probability. The more context you can provide, the more aligned the output will be. It's not like it magically learned something new. Depending on the problem, you may have to explain exactly what you want. If the problem is well understood, a sentence would most likely suffice.
> If the problem is well understood, a sentence would most likely suffice.
I feel this falls flat for the rather well-bounded use case I really want: a universal IDE that can set up my environment with a buildable/runnable boilerplate "hello world" for arbitrary project targets. I tried vibe coding an NES 6502 "hello world" program with Cursor and it took way more steps (and missteps) than me finding an existing project on GitHub and cloning that.
Absolutely! What surprises me is how rarely I actually have to get all the way down to writing the code myself.
I had Claude go into a loop because I have cat aliased as bat
It wanted to check a JSON config file, noticed that it had missing commas between items (because bat prettifies the JSON), and went into a forever loop: changing the JSON to add the commas (which were already there), checking the result by 'cat'ing the file (but actually with bat), and again finding they weren't there. GOTO 10.
The actual issue was that Claude had left two overlapping configuration-parsing methods in the code: one with Viper (the correct one) and one 1000% idiotic string-search system it had decided to use instead of actually unmarshaling the JSON :)
I had to use pretty explicit language to get it to stop fucking with the config file and look for the issue elsewhere. It did remember that, but forgot on the next task, of course. I should've added the fact to the rule file.
(This was a vibe coding experiment, I was being purposefully obtuse about not understanding the code)
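For anyone wanting to guard against this class of problem, here's a minimal sketch of the failure mode and the escape hatches, in bash. A `sed` prefix stands in for bat's decoration (bat itself usually emits plain text when piped, but the anecdote shows an agent can still see decorated output); the filename and alias are invented for illustration.

```shell
#!/usr/bin/env bash
shopt -s expand_aliases   # non-interactive bash ignores aliases unless this is set

printf '{"a": 1}\n' > /tmp/sample.json

# Stand-in for `alias cat=bat`: the alias decorates output, so a tool
# reading the file "through cat" no longer sees the raw bytes.
alias cat='sed s/^/DECORATED:/'

cat /tmp/sample.json          # alias fires: not the real file contents
command cat /tmp/sample.json  # `command` bypasses the alias: raw contents
\cat /tmp/sample.json         # a leading backslash also skips alias expansion
```

So a rule-file line like "always read files with `command cat`" (or just having the agent use its built-in file reader instead of the shell) would have broken the loop.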
Why does it matter that you're doing the thinking? Isn't that good news? What we're not doing any more is any of the rote recitation that takes up most of the day when building stuff.
I think "AI as a dumb agent for speeding up code editing" is kind of a different angle and not the one I wrote the article to address.
But if it's editing that's taking most of your time, what part of your workflow are you spending the most time in? If you're typing at 60 WPM for an hour, that's over 300 lines of code without any copy and paste, which is pretty solid output if it's all correct.
But that's just it: 300 good lines of reasonably complex working code in an hour, versus o4-mini, which can churn out 600 lines of perfectly compilable code in under 2 minutes, including the time it takes me to assemble the context with a tool such as repomix (run locally) or by pulling markdown docs with Jina Reader.
The reality is, we humans just moved one level up the chain. We will continue to move up until there isn’t anywhere for us to go.
In lots of jobs, the person doing work is not the one selecting text or the JIRA ticket. There's lots of "this is what you're working on next" coding positions that are fully managed.
But even if we ignore those, this feels like goalpost moving. They're not selecting the text? OK, ask the LLM what needs refactoring and why. They're not selecting the JIRA ticket with context? OK, provide MCP access to JIRA, git, and comms, ask it to select a ticket, then iterate on context until it's solvable. "But someone else does the step above" applies to almost everyone's job as well.
Could you explain what you mean by etching out small improvements? I've never seen the phrase "etching out" before.
Not OP, but might be an autocorrection for "eking out"
I think maybe you have unrealistic expectations.
Yesterday I needed to import a 1GB CSV into ClickHouse. I copied the first 500 lines into Claude and asked it for a CREATE TABLE and CLI to import the file. Previous day I was running into a bug with some throw-away code so I pasted the error and code into Claude and it found the non-obvious mistake instantly. Week prior it saved me hours converting some early prototype code from React to Vue.
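The CREATE TABLE plus CLI that Claude produces for a task like this typically looks something like the following sketch. The table name, column names, and types here are hypothetical (invented for illustration; in practice they'd be inferred from the pasted CSV header), not from the original comment.

```shell
# Hypothetical schema guessed from the CSV's first lines.
clickhouse-client --query "
  CREATE TABLE events (
      ts      DateTime,
      user_id UInt64,
      event   String
  ) ENGINE = MergeTree
  ORDER BY ts
"

# CSVWithNames reads the header row and maps CSV columns to table columns by name.
clickhouse-client --query "INSERT INTO events FORMAT CSVWithNames" < data.csv
```

For a 1GB file this streams in one pass; clickhouse-client handles the batching.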
I do this probably half a dozen times a day, maybe more if I'm working on something unfamiliar. It saves me at a minimum an hour a day by pointing me in the right direction, toward an answer I would have reached myself, but slower.
Over a month, a quarter, a year... this adds up. I don't need "big wins" from my LLM to feel happy and productive with the many little wins it's giving me today. And this is the worst it's ever going to be.