Comment by taylorlunt
7 days ago
These seem like a lot of great ways to work around the limitations of LLMs. But I'm curious what people here think. Do any career software engineers here see more than a 10% boost to their coding productivity with LLMs?
I see how, if you can't really code or you're new to a domain, it can make a huge difference getting you started. But if you know what you're doing, I find you hit a wall pretty quickly trying to get it to actually do stuff. Sometimes things can go smoothly for a while, but you end up having to micromanage the agent's output too much for it to be worth it. Or you sacrifice code quality.
They're so nice for prototyping ideas and not becoming attached to the code due to sunk cost. I was playing around with generating intelligent diffs for changelogs for a game. I wasn't sure which approach to highlighting changes I wanted to take without being able to see the results.
Prior to vibe-coding, it would've been an arduous enough task that I would've done one implementation, looked at the time it took me and the output, and decided it was probably good enough. With vibe-coding, I was able to prototype three different approaches, each requiring some heavy lifting I really didn't want to work out myself, and get a feel for whether any of the results were more compelling than the others. Then I felt fine throwing away a couple of the approaches because I'd only spent a handful of minutes getting them working rather than a couple of hours.
I agree, prototyping seems like a great use-case.
For stuff that I’m good at? Not even 10%.
For stuff that I’m bad at? Probably more than 1000%. I’ve used it to make a web app, write some shader code, and set up some RTC streaming from Unreal Engine to the browser. I doubt I would have done them at all otherwise, tbh. I just don’t have the energy and interest to conclude that those particular ventures were good uses of my time.
Yeah I couldn't put it better myself. It's obscene how much more productive you become in new domains. And sure, you eventually hit a wall where you gotta understand it for real. But now you have a working example of your project, plus a genius who will answer unlimited questions and clarifications.
And you can do this for anything
> And you can do this for anything
Anything that's been done before. Otherwise we'd probably start with making nuclear fusion work, then head off into the stars...
You've always been able to read books. What you're talking about is skipping the slow learning step and instead generating a mashup of tons of prior art. I don't think it helps you learn. It sounds like it's for things you specifically don't want to learn.
Congrats, you now have a job similar to a factory worker turning a handle every day. Gone is that feeling of growth, that feeling of "getting it" and seeing new realms of possibility in front of you. Now all you can do is beg for more grease on your handle.
Yeah, it's like a GPS navigation system. Useless and annoying on home turf. Invaluable in unfamiliar territory.
Maybe that's an apt analogy in more ways than one, given the recent research out of MIT on AI's impact on the brain, and previous findings about GPS use deteriorating navigation skills:
> The narrative synthesis presented negative associations between GPS use and performance in environmental knowledge and self-reported sense of direction measures and a positive association with wayfinding. When considering quantitative data, results revealed a negative effect of GPS use on environmental knowledge (r = −.18 [95% CI: −.28, −.08]) and sense of direction (r = −.25 [95% CI: −.39, −.12]) and a positive yet not significant effect on wayfinding (r = .07 [95% CI: −.28, .41]).
https://www.sciencedirect.com/science/article/pii/S027249442...
Keeping the analogy going: I'm worried we will soon have a world of developers who need GPS to drive literally anywhere.
I would say I get more (I've been coding 40+ years). I get pretty good results; I find a lot of it has to do with crafting your prompts well. I think knowing what the outcome should be, technically, makes a big difference. It's getting rarer and rarer that I have to argue with the AI or do it myself. Not to mention the number of little productivity / quality-of-life scripts I get it to create; they really smooth out a lot of things. I feel like it's heading more towards "solution engineering" rather than coding, where I'm getting a lot more time to think about the solution and play with different ideas.
My experience is it often generates code that is subtly incorrect. And I'll waste time debugging it.
But if I give it a code example that was written by humans and ask it to explain the code, it gives pretty good explanations.
It's also good for questions like "I'm trying to accomplish complicated task XYZ that I've never done before, what should I do?", and it will give code samples that get me on the right path.
Or it'll help me debug my code and point out things I've missed.
It's like a pair programmer that's good for bouncing ideas, but I wouldn't trust it to write code unsupervised.
> My experience is it often generates code that is subtly incorrect. And I'll waste time debugging it.
> […]
> Or it'll help me debug my code and point out things I've missed.
I made both of these statements myself and later wondered why I had never connected them.
In the beginning, I used AI a lot to help me debug my own code, mostly through ChatGPT.
Later, I started using an AI agent that generated code, but it often didn’t work perfectly. I spent a lot of time trying to steer the AI to improve the output. Sometimes it worked, but other times it was just frustrating and felt like a waste of time.
At some point, I combined these two approaches: I cleared the context, told the AI that there was some code that wasn’t working as expected, and asked it to perform a root cause analysis, starting by trying to reproduce the issue. I was very surprised by how much better the agent became at finding and eventually fixing problems when I framed the task from this different perspective.
Now, I have commands in Claude Code for this and other due diligence tasks, and it’s been a long time since I last felt like I was wasting my time.
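If you haven't set one up before: a custom command in Claude Code is just a Markdown prompt file under `.claude/commands/`. A minimal sketch of what a root-cause-analysis command could look like (the file name and exact wording here are illustrative, not my actual setup):

```sh
# Sketch only: a hypothetical /rca command for Claude Code.
# Project commands live as Markdown files in .claude/commands/;
# $ARGUMENTS is replaced with whatever you type after the command name.
mkdir -p .claude/commands
cat > .claude/commands/rca.md <<'EOF'
Some code in this project is not working as expected: $ARGUMENTS

Before changing anything, do a root cause analysis:
1. Try to reproduce the issue first.
2. Trace the failure back to where behavior first diverges from what is expected.
3. Report the root cause and a proposed fix, and wait for confirmation before editing code.
EOF
```

Then something like `/rca <description of the symptom>` kicks off the reproduce-first workflow instead of the agent diving straight into edits.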
> My experience is it often generates code that is subtly incorrect.
Have you isolated whether you're properly homing in on the right breadth of context for the planned implementation?
Aah, he must be prompting it wrong
> Do any career software engineers here see more than a 10% boost to their coding productivity with LLMs?
I know it'll be touted as rhetoric, but I have seen an order-of-magnitude difference in my ability to ship things. Thankfully I don't work for a large enterprise, so I don't have a multi-million-line codebase to contend with or anything like that. I also, thankfully, ship projects using languages and libs that are very well represented in LLM corpuses, like TypeScript, NextJS, Postgres, though I have also found a lot of success in less popular things like Neo4j's Cypher.
I have also been massively enabled to do lots more 'ops' stuff. Being a pretty average full-stack eng means I have no experience running sys/ops monitoring systems, but LLMs recently helped me with a bunch of Docker routing issues I was having, teaching me about Traefik, which I'd never heard of before.
Side-point: I have felt so grateful to these LLMs for freeing up a bunch of my brain space, enabling me to think more laterally and not rely so much on my working memory, which is now severely limited due to a historic brain injury. Often people forget how massively enabling these tools can be for disabled people.
I can definitely see the 10% boost being accurate. Keep in mind, it's not about doing everything 10% faster, it's about being able to put out 10% more results by leveraging agentic coding when it makes sense.
This week I was able to tackle two long-standing bug fixes I've been noodling on. I had a rough idea of what I needed to do, but competing priorities and a lack of time kept me from sitting down and really internalizing the system to figure them out. I brain-dumped the issue and my current thoughts and had Claude formulate a plan. It solved each in less than 30 minutes of very light effort on my part. I was able to tack these onto larger work I'm doing basically seamlessly.
The other thing that I've found to be an insane benefit is filesystem-backed context switching. If your agentic workflow involves dumping your plan and progress to files in the filesystem, you can pause and restart work at any time by pointing at those files and saying "continue where you last left off". You can even take a `git diff > that-one-bug.patch` of edits made up to that point, copy that alongside the other files, and have a nice-and-neat folder of a unit of work that is ready to pick back up in the future as time permits.
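Concretely, the park-and-resume part of that is only a couple of commands (the folder and file names here are just examples; the plan/progress files are whatever your agent has been writing to):

```sh
# Park a unit of work (names are illustrative).
mkdir -p work/that-one-bug
cp PLAN.md PROGRESS.md work/that-one-bug/          # the agent's plan/progress files
git diff > work/that-one-bug/that-one-bug.patch    # snapshot of the tracked-file edits so far
git checkout -- .                                  # optionally clear the working tree until later

# Pick it back up when time permits.
git apply work/that-one-bug/that-one-bug.patch
# then tell the agent: "read work/that-one-bug/PROGRESS.md and continue where you last left off"
```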
Yes, most days I’m 2x as productive. I’m using Claude Code to produce extremely high quality code that closely follows my coding standards and the architecture of my app.
> Do any career software engineers here see more than a 10% boost to their coding productivity with LLMs?
No, I just put in less effort to arrive at the same point and do no more.
I don’t think people are good at self-reporting the “boost” it gives them.
We need more empirical evidence. And historically we’re really bad at running such studies and they’re usually incredibly expensive. And the people with the money aren’t interested in engineering. They generally have other motives for allowing FUD and hype about productivity to spread.
Personally I don’t see these tools going much further than where they are now. They choke on anything that isn’t a greenfield project and consistently produce unwanted results. I don’t know what magic incantations and combinations of agents people have got set up, but if that’s what they call “engineering” these days, I’m not sure that word has any meaning anymore.
Maybe these tools will get there one day but don’t go holding your breath.
> They choke on anything that isn’t a greenfield project and consistently produce unwanted results.
That was true 8 months ago. It's not true today, because of the one-two punch of modern longer-context "reasoning" models (Claude 4+, GPT-5+) and terminal-based coding agents (Claude Code, Codex CLI).
Setting those loose on an existing large project is a very different experience from previous LLM tools.
I've watched Claude Code use grep to find potential candidates for a change I want to make, then read the related code, follow back the chain of function calls, track down the relevant tests, make a quick detour to fetch the source code of a dependency directly from GitHub (by guessing the URL to the raw file) in order to confirm a detail, make the change, test the change with an ad-hoc "python -c ..." script, add a new automated test, run the tests and declare victory.
That's a different class entirely from what GPT-4o was able to do.
I think the thing people have to understand is how fast the value proposition is changing. There is a lot of conversation about "plateauing" model performance, but the actual experience from the combination of model and tooling changes has been night and day over the last 3 months. It was beginning to be very useful with Claude 3.7 in the spring of this year, but we have just gone through a step-function change.
I was decommissioning some code and I made the mistake of asking for an "exhaustive" analysis of the areas I needed to remove. Sonnet 4.5 took 30 minutes looking around and compiling a detailed report on exactly what needed to be removed from this very, very brownfield project, and after I reviewed the report, it one-shot the decommissioning of the code (in this case I was using Claude in the Cursor tooling at work). It was overkill, but impressive how well it mapped all the ramifications in the code base by grepping around.
Indeed, Codex CLI is quite useful even for demanding tasks. The current problem is that it might gather context for 20 minutes before doing the actual thing. The question is whether this will be sped up significantly.
I guess we just have to take your word for this, which is somewhat odd considering most of your comments link back to some artifact of yours. Are you paid by any of these companies?
All I've found is the LLM just makes me work more. It's hard to talk about a % boost when you're simply working more hours.
It's like having a faster car with a bigger engine. Big deal. I want a faster car with a smaller engine. My ideal is to actually go home and stop working at the end of the day.
I also don't want to use it for my day job because I'm afraid my brain will atrophy. You don't really need to think when something is already done for you. I don't want to become someone who can only join together LLM output. I don't feel like I'll miss out on anything by not jumping on now, but I do feel like I'll lose something.
At this point I'd say that I'm 1000% more productive in the aspects that I use it for. I rarely hit any walls, and if I do it's absolutely always down to an unclear or incomplete thought process or a lack of clarity in prompting.
There's a lot of annoying stuff it can do fairly well without many guardrails. It's a minor productivity boost but it's nice not to have to do.
Doc comments, for example. Today I had it generate doc comments for a class I wrote. I had to go and fix every single one of them because it did some dumb shit, but it put all the scaffolding in place and got the basics there, so it was a lot quicker.
I also used it to generate JSON schemas from a couple of Python classes the other day. Highly structured inputs, highly structured output, so there wasn't much for it to fuck up. Took care of the annoying busy work I didn't want to do (not that schemas are busy work, but this particular case was).
Still haven't seen a use case that justifies the massive cost, or all the blatant theft and copyright infringement, or the damage to the environment...
LLMs have been useful for years now and people still say stuff like “but is it really useful or are all these brilliant people just deluded”…