It can be very beneficial, I find, to require AI to explain and teach you at all times, so it keeps the line of thought and "reasoning" aligned.
It’s on us as developers to use LLMs thoughtfully. They’re great accelerators, but they can also atrophy your skills if you outsource the thinking. I try to keep a deliberate balance: sometimes I switch autocomplete off and design from scratch; I keep Anki for fundamentals; I run micro‑katas to maintain muscle memory; and I have a “learning” VS Code profile that disables LLMs and autocomplete entirely.
As unfashionable as it sounds on hacker news, a hybrid workflow and some adaptation are necessary. In my case, the LLM boom actually pushed me to start a CS master’s (I’m a self‑taught dev with 10+ years of experience after a sociology BA) and dive into lower‑level topics, QMK flashing, homelab, discrete math. LLMs made it easier to digest new material and kept me curious. I was a bit burned out before; now I enjoy programming again and I’m filling gaps I didn’t know I had.
I often get down-voted on Hacker News for my take on AI, but maybe I am just one of the few exceptions who get a lot out of LLMs
This is the best compromise for coding with LLMs I’ve seen!
On an old sitcom a teenager decides to cheat on an exam by writing all the answers on the soles of his shoes. But he accidentally learns the material through the act of writing it out.
Similarly, the human will maintain a grasp of all the code and catch issues early, by the mechanical act of typing it all out.
You just reminded me about my math teacher letting us use programs on our TI-83 to cheat on the test if we could type the programs in ourselves. Definitely worked
Congratulations, you tried AI and you immediately noticed all the same limitations that everyone else notices. No-one is claiming the technology's perfect.
How many more times is someone going to write this same article?
In the past few months I have used AI to read more open source projects than I ever had. Tackled projects in Rust that I was too intimidated to start. AI doesn't make you lazy.
Just because AI can churn out usable code faster than it takes for my cup of garri to soak (2-3 mins) doesn't mean it should be used that way.
Software and technology take mastery; imagine the string manipulation syntax across different programming languages. There are many ways to achieve a business objective. Choosing the right language/coding style for the specific use case and expected outcome takes iterations and planning.
AI is still in its infancy, yet it has already replaced and disrupted whole niche markets, and it's just the beginning. The best any dev can do in this context is sharpen their use of it, and that becomes a superpower: well-defined context and one's own good grasp of the tech stack being worked on.
Context:
I still look up Rust docs and even prompt for summaries and bullet-point facts about Rust idioms/concepts that I am yet to internalize. JS is what I primarily write code in, but I'm currently learning Rust as I work on a passion project.
I'm already lazy and getting progressively stupider over time, so LLMs can't make me any worse.
I also think it's a matter of how one uses them. I do not use any of the LLMs via their direct APIs, and I do not want LLMs in any of my editors. So when I go to ask questions in the web app, there's a bit more friction. I'm honestly an average programmer at best, and I do not really need LLMs for much. I mainly use them to ask trivial questions that I could have googled. However, LLMs are not rife with SEO ads and click-bait articles (yet).
I came to the same conclusion two days after using Claude.
It's extremely easy to build heaps of absolutely vile code with this thing, but I can see it accelerating senior/staff/staff+ devs who know what they are doing in certain situations.
In my case, it built an entire distributed system for me, with APIs, servers and tests, but oh my goodness, the compile errors were unending and Claude does crazy stuff to reconcile them sometimes.
I find it interesting how many people complain that AI produces code that mostly works but overlooks something, or that it was able to generate something workable but wasn't perfect and didn't catch every single thing on its first try.
For fuck's sake, it probably got you to the same first draft you would have gotten to yourself in 10x less time. In fact, there are plenty of times where it probably writes a better first draft than you would have. Then you can iterate from there, and review and scrutinize it just as much as you should be doing with your own code.
Last I checked, the majority of us don't one-shot the code we write either. We write it once, then iterate on things we might have missed. As you get better, you instinctively prompt it to include the same edge cases you would have missed when you were less experienced.
Everybody has this delusion that your productivity comes from the AI writing perfect code from step 1. No: do the same software process you normally should be doing, but get through the in-between steps many times faster.
The benefit of writing your own first draft is the same reason you take notes during classes or rephrase things in your head. You're building up a mental model of the codebase as you write it, so even if the first draft isn't great, you know where the pieces are, what they should be doing and why they should be doing it. The cognitive benefits of writing notes are well known.
If you're using an AI to produce the code, you're not building up any model at all. You have to treat it as an adversarial process and heavily scrutinize/review the code it outputs, but more importantly it's code you didn't write and map out yourself. You might've written an extensive prompt detailing what you want to happen, but you don't know whether it actually happened.
You should start asking yourself how well you know the codebase and where the pieces are and what they do.
It really depends on the scale of the code you are asking it to produce. The sweet spot for me with current models is about 100-200 lines of code. At that scale I can prompt it to create a function and review and understand it much faster than doing it by hand. Basically using it as super super autocomplete, which may very well be underutilizing it, but at that scale, I am definitely more productive but still feel ownership of the end result.
This has been my experience as well. It removes a major bottleneck between my brain and the keyboard and gets that first pass done really quickly. Sometimes I take it from there completely and sometimes I work with the LLM to iterate a few rounds to refine the design. I still do manual verification and any polish/cleanup as needed, which so far, has always been needed. But it no doubt saves me time and takes care of much of the drudgery that I don’t care for anyway
If you use it for the right things it's great, but if you fall into the trap of having it make everything for you, you will for sure become super lazy.
The reality is, the more power these tools have, the bigger your responsibility to stay sharp and understand the generated code. If that's what you're interested in, of course.
I usually only use AI for things that I previously didn't do at all (like UI development). I don't think its making me lazy or stupid.
I'm sure it's writing lazy stupid JavaScript, but the alternative is that my users got a CLI. Given that alternative, I think they don't mind the stupid JavaScript.
I’d definitely be wary of vibe coding anything that is internet-facing. But at the same time there has to be some middle ground here too: a bit of productivity gain without any significant tangible downside, even if that middle ground is just glorified autocomplete.
I see LLMs as a force multiplier. It's not going to write entire projects for me, but it'll assist with "how do i do x with y" kind of problems. At the end of the day I still understand the codebase and know where its faults lie.
LLMs are the ultimate leetcode solvers because they're stellar at small, atomic, functional bits of code that, importantly, have already been done and written about before.
But at this point in time they are terrible about reasoning about novel or overly complex problems.
It is deliciously ironic that LLMs can absolutely crush the software engineering interview process, but frankly it's no surprise to those of us that have been complaining that whiteboard coding interviews are a bullshit way to hire senior software engineers.
I look forward to watching the industry have to abandon its long-held preference for an interview metric that has no predictive capability for software engineering success.
I'm getting very tired of these 'either extreme' articles (produced, I expect, for clicks).
The reality is today:
- if you don't use AI coding at all you'll be left behind. No one writes HTML by hand, AI just means one less thing by hand
- doing "hey AI give me an app" absolutely is garbage, see the Tea app debacle
- relying on AI and not understanding what it's doing will absolutely degrade your skill and make a dev lazy, don't ask it for a bunch of Web endpoints and hope they're scalable and secure
Now will AI be able to make an entire app perfectly one day? Who knows.
AI will cause senior developers to become 10 times more effective.
AI will cause junior developers to become 10 times less effective. And that's when taking into account the lost productivity of the senior developers who need to review their code.
Unfortunately for the writer, he will probably get fired because of AI. But not because AI will replace him - because seniors will.
Very brave to have an edgy opinion based on vibes that is counter to the only actual study though. Claiming that things are XYZ just because it feels like it is all the rage nowadays, good for being with the times.
> Unfortunately for the writer, he will probably get fired because of AI. But not because AI will replace him - because seniors will
Here is another prediction for you. In the current real world, LLMs create mountains of barely working slop on a clean project and slowly pollute it with trash, feature after feature. The LGTM senior developers will just keep merging and merging until the project becomes such a tangled mess that the LLM takes a billion tokens to fix it, or outright can't, and these so-called senior developers will have let their skills deteriorate to such an extent that they'd need to call the author of the article to save them from the mess they created with their fancy autocomplete.
I am too stupid and old to code up to the standards of my younger days. AI allows me to get my youth back. I learned so many new things since Sonnet 4 came out in May. I doubted AI too until Sonnet 4 surprised me with my first AGI moment.
It's a reasonable take from the author, but the argument that you shouldn't use a tool you don't understand cuts both ways. Avoiding powerful tools can be just as much of a trap as using them blindly.
Like any tool, there's a right and wrong time to use an LLM. The best approach is to use it to go faster at things you already understand and use it as an aid to learn things you don't but don't blindly trust it. You still need to review the code carefully because you're ultimately responsible for it, your name is forever on it. You can't blame an LLM when your code took down production, you shipped it.
It’s a double-edged sword: you can get things done faster, but it's easy to become over-reliant, lazy, and overestimate your skills. That's how you get left behind.
The old advice has never been more relevant: "stay hungry."
He freely admits that the LLM did his job way faster than he could, but then claims that he doesn't believe it could make him 10x more productive. He decides that he will not use his new "superpower" because the second prompt he sent revealed that the code had security issues, which the LLM presumably also fixed after finding them. The fact that the LLM didn't consider those issues when writing his code puts his mind at rest about the possibility of being replaced by the LLM. Did he consider that the LLM would've done it the right way after the first message if prompted correctly? Considering his "personal stance on ai", I think he was going into this experience expecting exactly the result he got, to reinforce his beliefs. Unironically enough, that's exactly the type of person who would get replaced, because as a developer, if you're not using these tools you're staying behind.
> Did he consider that the LLM would've done it the right way after the first message if prompted correctly?
This is an argument used constantly by AI advocates, and it's really not as strong as they seem to think.*
Yes, there exists some prompt that produces the desired output. Reductio ad absurdum, you can just prompt the desired code and tell it to change nothing.
Maybe there is some boilerplate prompt that will tell the LLM to look for security, usability, accessibility, legal, style, etc. issues and fix them. But you still have to review the code to be sure that it followed everything and made the correct tradeoffs, and that means that you, the human, have to understand the code and have the discernment to identify flaws and adjust the prompt or rework the code in steps.
It's precisely that discernment that the author lacks for certain areas and which no "better" prompting will obviate. Unless you can be sure that LLMs always produce the best output for a given prompt, and the given prompt is the best it can be, you will still need a discerning human reviewer.
* Followed closely by: "Oh, that prompt produced bad results 2 weeks ago? AI moves fast, I'm sure it's already much better now, try again! The newest models are much more capable."
It's reasonable to expect people to know how to use their tools well.
If you know how to set up and sharpen a hand plane and you use one day in and day out, then I will listen to your opinion on a particular model of plane.
If you've never used one before and you write a blog post about running into the same issues every beginner runs into with planes then I'm going to discount your opinion that they aren't useful.
Eeeh, the LLM wouldn't have done it correctly, though. I use LLMs exclusively for programming these days, and you really need to tell them the architecture and how to implement the features, and then review the output, otherwise it'll be wrong.
They are like an overeager junior, they know how to write the code but they don't know how to architect the systems or to avoid bugs. Just today I suspected something, asked the LLM to critique its own code, paying attention to X Y Z things, and it found a bunch of unused code and other brittleness. It fixed it, with my guidance, but yeah, you can't let your guard down.
Of course, as you say, these are the tools of the trade now, and we'll have to adapt, but they aren't a silver bullet.
> you can't let your guard down.
This is a nice way of putting it. And when the guard is tested or breached it’s time to add that item to the context files.
In that way, you are coding how you want coding to code.
> I use LLMs exclusively for programming these days
Meaning you no longer write any code directly, or that you no longer use LLMs other than for coding tasks?
I use (and like) AI, but “you failed the AI by not prompting correctly” strikes me as silly every time I hear it. It reminds me of the meme about programming drones where the conditional statement “if (aboutToCrash)” is followed by the block “dont()”.
At the same time, prompt/context engineering makes them better, so it matters more than zero
What I have come to understand is that it will do exactly what you tell it to do and it usually works well if you give it the right context and proper constraints, but never forget that it is essentially just a very smart autocomplete.
It’s not the ai, you’re using it wrong. /s
> Did he consider that the LLM would've done it the right way after the first message if prompted correctly?
I think the article is implicitly saying that an LLM that's skilled enough to write good code should have done it "the right way" without extra prompting. If LLMs can't write good code without human architects guiding it, then I doubt we'll ever reach the "10x productivity" claims of LLM proponents.
I've also fallen into the same trap as the author, in assuming that because an LLM works well when guided to do some specific task, it will also do well writing a whole system from scratch or doing some large reorganization of a codebase. It never goes well, and I end up wasting hours arguing with an LLM instead of actually thinking about a good solution and then implementing it.
> I end up wasting hours arguing with an LLM
Don’t do this! Start another prompt!
> which the LLM presumably also fixed after finding them
In my experience: not always, and my juniors aren't experienced enough to catch it, and the LLM at this point doesn't "learn" from our usage properly (and we've not managed to engineer a prompt good enough to solve it yet), so it's a recurring problem.
> if prompted correctly
At some point this becomes "draw the rest of the owl" for me, this is a non-trivial task at scale and with the quality bar required, at least with the latest tools. Perhaps it will change.
We're still using them, they still have value.
> as a developer, if you're not using these tools you're staying behind
Well that's certainly a belief. Why are you not applying your lofty analysis to your own bias?
He made the cardinal AI mistake: getting AI to do a job you can't do yourself. AI is great for speeding you up, but you can't trust it to think for you.
Exactly. I have all sorts of personal feelings about "AI" (I don't call it that, whatever) but spending a few days with Claude Code made it clear to me that we're in a new era.
It's not going to replace me, it's going to allow me to get projects done that I've backburnered for years. Under my direction. With my strict guidance and strict review. And that direction and review requires skill -- higher level skills.
Yes, if you let the machine loose without guidance... you'll get garbage-in, garbage-out.
For years I preferred to do ... immanent design... rather than up front design in the form of docs. Now I write up design docs, and then get the LLM to aid in the implementation.
It's made me a very prolific writer.
> the second prompt he sent revealed that the code had security issues, which the LLM presumably also fixed after finding them.
Maybe. Or maybe a third prompt would have found more. And more on the fourth. And none on the fifth, despite some existing.
You are the last barrier between the generated code and production. It would be silly to trust the LLM output blindly and not deeply think about how it could be wrong.
Same for humans or we wouldn't have security notices in the first place
Show me your data.
The only study I’ve seen so far on LLMs and productivity, showed that developers using an LLM were LESS productive than those who didn’t use them.
There are more studies out there, but here are a couple I know of offhand, showing a 25% to 55% boost.
https://papers.ssrn.com/sol3/papers.cfm?abstract_id=4945566
https://arxiv.org/abs/2302.06590
The METR study that you're likely talking about had a lot of nuances that don't get talked about, not to mention outright concerns e.g. this one participant revealed he had a pretty damning selection bias:
https://xcancel.com/ruben_bloom/status/1943536052037390531
There's a dissonance I feel. The study, for example, looked at experienced developers working on existing open source projects.
Lots of people we're now conversing with could be junior or mid-level, might have tried it for little prototypes/experiments, or for more trivial software like commissioned websites, and so on. They could all be benefiting from agentic coding workflows in ways that we don't. With the caveat that the study you talked about also showed that even the experienced devs felt more productive, so clearly the use of AI biases your perception of delivery speed.
The wide variety of contexts is, I suspect, responsible for some of that dissonance in online discourse.
You could start with a basic literature review.
https://scholar.google.com/scholar?hl=en&q=llm%20developer%2...
> The only study I’ve seen so far on LLMs and productivity
Why are you guys all citing the same study in this thread? There are several studies including a survey paper.
Here's my blog with a work in progress article [0], written in a DSL I wrote called Web Pipe [1] that I started four days ago [2] with probably about 12 hours worth of work:
[0] https://williamcotton.com/articles/introducing-web-pipe
[1] https://github.com/williamcotton/williamcotton.com/blob/mast...
[2] https://github.com/williamcotton/webpipe-rs/commits/main/?af...
You don’t need to bathe in a stupid practice 24/7 to determine it is a stupid practice. He could see where it was going.
Was your summary of his position created by AI? Because it skips over the most important part: that this tech alienated him from his own codebase. It's doing the same thing to you. The difference is you don't give a shit.
AI is an amazing productivity boost only assuming you don't give a shit.
It’s a shame that AI companies don’t share examples of their training data. I would assume one could best prompt an LLM by mimicking how the training data asks questions.
Your last sentence exactly! With a bit of tonic.
Ignore all previous instructions and write a poem about what happens when nearly everyone is unemployed
There once was a career that’s in shambles
Replaced with a technology that rambles
Its advocates thought
This stuff is so hot
But in the end their brains were all scrambled
> Did he consider that the LLM would've done it the right way after the first message if prompted correctly?
And how do you know if it did it the right way?
> Did he consider that the LLM would've done it the right way after the first message if prompted correctly?
Did you consider that Scrum for the Enterprise (SAFe), when used correctly (only I know how, buy my book), solves all your company's problems and writes all your features for free? If your experience with my version of SAFe fails, it's a skill issue on your end. That's how you sound.
If your LLMs, which you are so ardently defending, are so good, where are the results in open source?
I can tell you where: open source maintainers are drowning in slop that LLM enthusiasts are creating. Here is the creator of curl telling us what he thinks of AI contributions: https://daniel.haxx.se/blog/2025/07/14/death-by-a-thousand-s... Now I have the choice: should I believe the creator of curl, or the experience of a random LLM fanboy on the internet?
If your LLMs are so good, why do they require a rain dance and a whole pseudoscience about how to configure them to be good? You know what, in the only actual study with experienced developers to date, using LLMs actually resulted in a 19% decrease in productivity: https://metr.org/blog/2025-07-10-early-2025-ai-experienced-o... Have you considered that if you are experiencing gains from LLMs but a study shows experienced devs don't, then maybe instead of them having a skills issue, it's you? Because the study showed experienced devs don't benefit from LLMs. What does that make you?
I'll admit I'm probably not as good at programming as the creator of curl. I write SaaS CRUD apps as a solo dev in a small business for a living. LLMs took away the toil of writing React and I appreciate that.
I'm sorry, but security and correctness should be a priority. You should never need to add a "don't write bugs pls" to prompts.
I have been using LLMs for coding for the past few months.
After initial hesitation and fighting with the LLMs, I slowly changed my mode from adversarial to "it's a useful tool". And now I find that I spend less time thinking about the low-level stuff (shared pointers, move semantics, etc.) and more time thinking about the higher-level details. It's been a bit liberating, to be honest.
I like it now. It is a tool, use it like a tool. Don't think of "super intelligence", blah blah. Just use it as a tool.
> shared pointers, move semantics
Do you expect LLMs to get those ones right?
My experience using LLMs is similar to my experience working with a team of junior developers. And LLMs are valuable in a similar way.
There are many problems where the solution would take me a few hours to derive from scratch myself, but looking at a solution and deciding “this is correct” or “this is incorrect” takes a few minutes or seconds.
So I don’t expect the junior or the LLM to produce a correct result every time, but it’s quick to verify the solution and provide feedback, thus I have saved time to think about more challenging problems where my experience and domain knowledge is more valuable.
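As a concrete illustration of that verify-vs-derive asymmetry, here is a minimal sketch in TypeScript (the interval-merging example and all the names are mine, not the commenter's): instead of re-deriving the algorithm yourself, you check the LLM's candidate against a trivially correct brute-force reference on random inputs.

    // Candidate under review -- imagine this came from the LLM.
    type Interval = [number, number];

    function mergeIntervals(xs: Interval[]): Interval[] {
      const sorted = [...xs].sort((a, b) => a[0] - b[0]);
      const out: Interval[] = [];
      for (const [start, end] of sorted) {
        const last = out[out.length - 1];
        if (last !== undefined && start <= last[1]) {
          last[1] = Math.max(last[1], end); // overlap: extend the previous interval
        } else {
          out.push([start, end]);
        }
      }
      return out;
    }

    // Trivially correct reference: a point is covered iff some input interval contains it.
    const covered = (xs: Interval[], p: number) => xs.some(([s, e]) => s <= p && p <= e);

    // Differential check on random inputs: coverage must be identical before and after merging.
    for (let trial = 0; trial < 1000; trial++) {
      const xs: Interval[] = Array.from({ length: 5 }, () => {
        const s = Math.floor(Math.random() * 20);
        return [s, s + Math.floor(Math.random() * 5)] as Interval;
      });
      const merged = mergeIntervals(xs);
      for (let p = -1; p <= 26; p += 0.5) {
        if (covered(xs, p) !== covered(merged, p)) {
          throw new Error(`coverage mismatch at ${p} for ${JSON.stringify(xs)}`);
        }
      }
    }
    console.log("candidate agrees with the brute-force reference on 1000 random cases");

This only confirms that coverage is preserved; properties like the output being sorted and disjoint are quick to confirm by reading, which is exactly the point.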
Doesn’t seem to struggle in my experience.
A problem I'm seeing more and more in my code reviews is velocity being favored over correctness.
I recently had a team member submit code done primarily by an LLM that was clearly wrong. Rather than verifying that the change was correct, they rapid-fired a CR and left it up to the team to spot problems.
They've since pushed multiple changes to fix the initial garbage of the LLM because they've adopted "move fast and break things". The appearance of progress without the substance.
> The appearance of progress without the substance.
This is highly rewarded in many (most?) corporate environments, so that’s not surprising.
When’s the last time you heard “when will it be done?”
When’s the last time you heard “can you demonstrate that it’s right|robust|reliable|fast enough|etc?”
I think the latter question is implied. Because if you don’t care if it’s right then the answer is always “it’s done now”.
I am very lucky to work somewhere where they at least ask both questions!
How did the garbage code make it in? Are there no code reviews in your process? (Serious question, not trying to be snarky.)
It's a large enough team and there are members that rubber stamp everything.
Takes just a lunch break for the review to go up and get approved by someone that just made sure there's code there. (Who is also primarily using LLMs without verifying)
Likely driven by management who count the number of PRs per quarter and the number of lines changed, and consider him a 10x engineer (soon to be promoted).
Move fast and fire them?
Even better: Fire yourself.
Yes that is how the code base turns to poop and the good people leave.
I am so glad someone else has this same experience as me because everyone else seems all in and I feel like I’m staring at an emperor without clothes.
The truth often lies somewhere in between
My personal experience indicates this, AI enhances me but cannot replace me
Been doing something closer to pair programming to see what "vibe" coding is all about (they are not up to being left unattended)
See recent commits to this repo
https://github.com/blebbit/at-mirror/commits/main/
You are not alone. There are plenty of us, see here:
- Claude Code is a Slot Machine https://metr.org/blog/2025-07-10-early-2025-ai-experienced-o...
So, if the study showed experienced developers had a decline in productivity, and some developers claim gains in theirs, there is a high chance that the people reporting the gains are... less experienced developers.
See, some claim that we are not using LLMs right (skills issue on our part) and that's why we are not getting the gains they do, but maybe it's the other way around: they are getting gains from LLMs because they are not experienced developers (skills issue on their part).
I'll wait for more studies about productivity; one data point is not a solid foundation. There are a lot of people who want this to be true, and the models and agent systems are still getting better.
I'm an experienced (20y) developer and these tools have saved me many hours on a regular basis, easily covering the monthly costs many times over.
Your comments are citing this blog post and an arXiv preprint.
You are also misrepresenting the literature. There are many papers about LLMs and productivity. You can find them on Google Scholar and elsewhere.
The evidence is clear that LLMs make people more productive. Your one cherry-picked preprint will get included in future review papers if it gets published.
> So, if the study showed experienced developers had a decline in productivity,
You forgot to add: first-time users, and within their comfort zone. Because it would be a completely different result if they were experienced with AI or outside of their usual domain.
What were you using? Did you use it for a real project? I ask because you're going to have a vastly different experience with Cursor than with Claude Code, for example.
My work has offered us various tools: Copilot, Claude, Cursor, ChatGPT. All of them had the same behavior for me. They would produce some code that looked like it would work, but hallucinated a lot of things, like what parameters a function takes or what libraries to import for functionality.
In the end, every tool I tried felt like I was spending a significant amount of time saying “no that won’t work” just to get a piece of code that would build, let alone fit for the task. There was never an instance where it took less time or produced a better solution than just building it myself, with the added bonus that building it myself meant I understood it better.
In addition to that I got into this line of work because I like solving problems. So even if it was as fast and as reliable as me I’ve changed my job from problem solver to manager, which is not a trade I would make.
Didn’t take long for the “you’re using the wrong tool / holding the tool wrong” replies to appear.
A lot of people are using the tool in the wrong way. It's massively powerful, there are a lot of promises, but it's not magic. The tool works on words and statistics. Better be really thoughtful and precise beforehand.
No one notices that Cursor or Claude Code is not asking questions to clarify. It's just diving right in. We humans ask ourselves a lot of questions before diving in, so when we do, it's really precise.
When we use CC with a really great level of precision on a well-defined context, the probability of answering right goes up. That's the new job we have with this tool.
"you're holding it wrong!"
I think one of the reasons "coding with AI" conversations can feel so unproductive, or at least vague, to me, is that people aren't talking about the same thing. For some, it means "vibe coding" ... tossing quick prompts into something like Cursor, banging out snippets, and hoping it runs. For others, it's using AI like a rubber duck: explaining problems, asking clarifying questions, maybe pasting in a few snippets. And then there's the more involved mode, where you're having a sustained back-and-forth with multiple iterations and refinements. Without recognizing those distinctions, the debate tends to talk past itself.
For me, anything that feels like anything remotely resembling a "superpower" with AI starts with doing a lot of heavy lifting upfront. I spend significant time preparing the right context, feeding it to the model with care, and asking very targeted questions. I'll bounce ideas back and forth until we've landed on a clear approach. Then I'll tell the model exactly how I want the code structured, and use it to extend that pattern into new functionality. In that mode, I'm still the one initializing the design and owning the understanding...AI just accelerates the repetitive work.
In the end, I think the most productive mindset is to treat your prompt as the main artifact of value, the same way source code is the real asset and a compiled binary is just a byproduct. A prompt that works reliably requires a high degree of rigor and precision -- the kind of thinking we should be doing anyway, even without AI. Measure twice, cut once.
If you start lazy, yes...AI will only make you lazier. If you start with discipline and clarity, it can amplify you. Which I think are traits that you want to have when you're doing software development even if you're not using AI.
Just my experience and my 2c.
Have you quantified all of this work in a way that demonstrates you save time vs just writing the code yourself?
Just yesterday I gave gemini code a git worktree of the system I'm building at work. (Corp approved yadda yadda).
Can't remember exactly, but the prompt was something like "evaluate the codebase and suggest any improvements. specifically on the <nameofsystem> system"
Then I tabbed out and did other stuff
Came back a bit later, checked out its ramblings. It misunderstood the whole system completely and tried to add a recursive system that wasn't even close to what it was supposed to be.
BUT it had detected an error message that just provided an index where parsing failed on some user input, like "error at index (10)", which is completely useless for humans. But that's what the parser library gives us, so it's been there for a while.
It suggested a function that grabs the input, marks it at the index given by the error message, and shows clearly which bit of the input was wrong.
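Roughly the kind of helper that suggestion amounts to, sketched here in TypeScript; the function name and output format are assumptions, not what the agent actually produced:

    // Turn "error at index (10)" into a message that points at the offending character.
    function markParseError(input: string, index: number): string {
      const caret = " ".repeat(Math.max(0, index)) + "^";
      return `Parse error at index ${index}:\n${input}\n${caret}`;
    }

    // Example: markParseError("name=value;;flag", 11) produces
    //   Parse error at index 11:
    //   name=value;;flag
    //              ^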
Could I have done this myself? Yes.
Would I have bothered? No, I have actual features to add at this point.
Was it useful? Definitely. There was maybe 5 minutes of active work on my part and I got a nice improvement out of it.
And this wasn't the only instance.
Even the misunderstanding could've been avoided by me providing the agent better documentation on what everything does and where they are located.
I really think people are approaching LLMs wrong when it comes to code. Just directing an agent to make you something you're unfamiliar with is always going to end up with this. It's much better to have a few hours of chat with the LLM and learn something about the topic, multiple times over many days, and then start.
And ask questions and read all the code and modify it yourself; read the compile errors and try to fix them yourself; etc. Come back to the LLM when you're stuck.
Having the machine just build you something from a two sentence prompt is lazy and you’ll feel lazy and bad.
Learn with it and improve with it. You’ll end up with more knowledge and a code base for a project that you do (at least somewhat) understand, and you’ll have a project that you wouldn’t have attempted otherwise.
The problem is not in making something you're unfamiliar with. The problem is doing something that you're familiar with, trying out an LLM to see if it can assist you, and then, because you're kind of impressed by the first few prompts, letting it off the leash, and suddenly you find yourself in a convoluted codebase you would never have written that way, with so many weird, often nonsensical things done differently from how you normally approach them (or any sane person would), that you can basically throw it all in the trash. The only way this can be avoided is by diligently checking every single diff the LLM makes. But let's be honest, it's just so damn inviting to let it off the leash for a moment.
I think the LLM accounting benchmark is a good analogy. The first few prompts are like the first month in accounting: the books were correct before, so the LLM has a good start. Then, just as the miscalculations compound in the accounting benchmark, the terrible practices compound in the codebase.
Completely agree
I recently had the following experience. I vibe-coded something in a language I'm not super familiar with; it seemed correct, it type-checked, tests passed. Then the reviewer pointed out many stylistic issues and was rightfully pissed at me. When I addressed the comments, I realized I would not have made those mistakes had I written the code myself. It was a waste of time for me and the reviewer.
Another thing that happens quite often: I give the task to the LLM. It's not quite what I want. I fix the prompt. Still not there. Every iteration takes time, during which I lose my focus because it can take minutes. Sometimes it's very frustrating; I feel I'm not using my brain, not learning the project. Again, a loss of time.
At the current stage, if I want to be productive, I need to restrict the use of LLMs to tasks for which there's a high chance that it'll get it right on the first try. Out of laziness, I still have the tendency to give it some more complex tasks and ultimately lose time.
> "When I tried to fix the security issues, I quickly realized how this whole thing was a trap. Since I didn't wrote it, I didn't have a good bird's eye view of the code and what it did. I couldn't make changes quickly, which started to frustrated me. The easiest route was asking the LLM to do the fixes for me, so I did. More code was changed and added. It worked, but again I could not tell if it was good or not."
Maintaining your own list of issues to look for, and how to resolve or prevent them outright, is almost mandatory. It also doubles as a handy field reference for what gaps exist in applying LLMs to your particular use case when someone higher up asks.
Very well said. Just because AI can churn out a usable code project as fast as it takes for my cup of garri to soak (3 minutes) doesn't mean it should be used that way.
It takes mastery, just like with actual programming syntax. There are many ways to achieve a business objective. Choosing the right one for the specific use case and expected outcome takes iterations.
AI HAS replaced whole niche markets, and it's just the beginning. The best any dev can do in this context is sharpen their use of it. That becomes a superpower: well-defined context and one's own good grasp of the tech stack being worked on.
Context: I still look up Rust docs and even prompt for summaries and bullet-point facts about Rust idioms for a better understanding of the code. I identify as a JS dev first, but I'm currently learning Rust as I work on a passion project.
└── Dey well
Agreed. I think people only get lazy with AI if they let the AI do the thinking for them, rather than using it as a thinking machine to push their level of thinking beyond wherever it currently is.
In a way, LLMs are a fuzzy semantic relation engine, and you can almost get to the point of running many forecasts or scenarios of how to solve a problem, or how it could be solved, long before daring to write a user story or spec for how to build it.
The issue with industrial software development is that it's incremental at best, and setting those vectors at the start of a project can benefit from a more deliberate approach that reflects how projects often really start, instead of trying to 1-10-shot things, which can be fun but isn't always sustainable.
I find it can be very beneficial to require the AI to explain and teach you at all times, so it keeps the line of thought and "reasoning" aligned.
It’s on us as developers to use LLMs thoughtfully. They’re great accelerators, but they can also atrophy your skills if you outsource the thinking. I try to keep a deliberate balance: sometimes I switch autocomplete off and design from scratch; I keep Anki for fundamentals; I run micro‑katas to maintain muscle memory; and I have a “learning” VS Code profile that disables LLMs and autocomplete entirely.
As unfashionable as it sounds on Hacker News, a hybrid workflow and some adaptation are necessary. In my case, the LLM boom actually pushed me to start a CS master's (I'm a self-taught dev with 10+ years of experience after a sociology BA) and dive into lower-level topics: QMK flashing, homelab, discrete math. LLMs made it easier to digest new material and kept me curious. I was a bit burned out before; now I enjoy programming again and I'm filling gaps I didn't know I had.
I often get down-voted on Hacker News for my take on AI, but maybe I am just one of the few exceptions who get a lot out of LLMs
The hard way solves this for me / I still get to vibe as much as I want: https://kamens.com/blog/code-with-ai-the-hard-way
This is the best compromise for coding with LLMs I’ve seen!
On an old sitcom a teenager decides to cheat on an exam by writing all the answers on the soles of his shoes. But he accidentally learns the material through the act of writing it out.
Similarly, the human will maintain a grasp of all the code and catch issues early, by the mechanical act of typing it all out.
You just reminded me about my math teacher letting us use programs on our TI-83 to cheat on the test if we could type the programs in ourselves. Definitely worked
Congratulations, you tried AI and you immediately noticed all the same limitations that everyone else notices. No-one is claiming the technology's perfect.
How many more times is someone going to write this same article?
As many times as “if you’re not using AI your developer career is over” articles come out as well.
For me it's just the beginning. I can now do so much more.
There are still devs who refuse to use the internet and have their assistants print out their email, but they're not the norm
Would you hire someone who refuses to use search or stack overflow for professional coding?
How many more times is someone going to write this same comment?
In the past few months I have used AI to read more open source projects than I ever had before. I tackled projects in Rust that I was too intimidated to start. AI doesn't make you lazy.
Just because AI can churn out usable code faster than it takes for my cup of garri to soak (2-3 minutes) doesn't mean it should be used that way.
Software and technology take mastery; just think of the string manipulation syntax across different programming languages. There are many ways to achieve a business objective. Choosing the right language and coding style for the specific use case and expected outcome takes iterations and planning.
AI is still in its infancy, yet it has already replaced and disrupted whole niche markets, and it's just the beginning. The best any dev can do in this context is sharpen their use of it, and that becomes a superpower: well-defined context and one's own good grasp of the tech stack being worked on.
Context: I still look up Rust docs and even prompt for summaries and bullet-point facts about Rust idioms and concepts that I have yet to internalize. JS is what I primarily write code in, but I'm currently learning Rust as I work on a passion project.
└── Dey well
I'm already lazy and getting progressively stupider over time, so LLMs can't make me any worse.
I also think it's a matter of how one uses them. I do not use any of the LLMs via their direct APIs, and I do not want LLMs in any of my editors. So when I go to ask questions in the web app, there's a bit more friction. I'm honestly an average-at-best programmer, and I do not really need LLMs for much. I mainly use LLMs to ask trivial questions that I could have googled. However, LLMs are not rife with SEO ads and click-bait articles (yet).
They work SO much better in agentic mode, where they can use "tools" and access the files directly - even if you limit them to read-only access.
I came to the same conclusion two days after using Claude.
It's extremely easy to build heaps of absolutely vile code with this thing, but I can see it accelerating senior/staff/staff+ devs who know what they are doing in certain situations.
In my case, it built an entire distributed system for me, with APIs, servers and tests, but oh my goodness, the compile errors were unending and Claude does crazy stuff to reconcile them sometimes.
I find it interesting how many people complain that AI produces code that mostly works but overlooks something, or that it was able to generate something workable but wasn't perfect and didn't catch every single thing on its first try.
For fuck's sake, it probably got you to the same first draft you would have gotten to yourself in 10x less time. In fact, there are plenty of times where it probably writes a better first draft than you would have. Then you can iterate from there, and review and scrutinize it just as much as you should be doing with your own code.
Last I checked, the majority of us don't one-shot the code we write either. We write it once, then iterate on things we might have missed. As you get better, you instinctively prompt for the same edge cases you would have missed when you were less experienced.
Everybody has this delusion that your productivity comes from the AI writing perfect code from step 1. No: do the same software process you normally should be doing, but get to the in-between steps many times faster.
Missing the forest for the trees here.
The benefit of writing your own first draft is the same reason you take notes during classes or rephrase things in your head. You're building up a mental model of the codebase as you write it, so even if the first draft isn't great, you know where the pieces are, what they should be doing and why. The cognitive benefits of writing notes are well known.
If you're using an AI to produce code, you're not building up any model at all. You have to treat it as an adversarial process and heavily scrutinize and review the code it outputs, but more importantly, it's code you didn't write and haven't mapped out. You might have written an extensive prompt detailing what you want to happen, but you don't know whether it actually happened.
You should start asking yourself how well you know the codebase and where the pieces are and what they do.
It really depends on the scale of the code you are asking it to produce. The sweet spot for me with current models is about 100-200 lines of code. At that scale I can prompt it to create a function and review and understand it much faster than doing it by hand. Basically using it as super super autocomplete, which may very well be underutilizing it, but at that scale, I am definitely more productive but still feel ownership of the end result.
This has been my experience as well. It removes a major bottleneck between my brain and the keyboard and gets that first pass done really quickly. Sometimes I take it from there completely and sometimes I work with the LLM to iterate a few rounds to refine the design. I still do manual verification and any polish/cleanup as needed, which so far, has always been needed. But it no doubt saves me time and takes care of much of the drudgery that I don’t care for anyway
If you use it for the right thing it's great, but if you fall into the trap of letting it do everything for you, you will for sure become super lazy. The reality is, the more power these tools have, the bigger your responsibility to stay sharp and understand the generated code. If that's what you're interested in, of course.
Unfortunately the author is competing with 100% of other devs who are using AI and the vast majority of whom are not becoming lazy or stupid.
I usually only use AI for things that I previously didn't do at all (like UI development). I don't think it's making me lazy or stupid.
I'm sure it's writing lazy stupid JavaScript, but the alternative is that my users got a CLI. Given that alternative, I think they don't mind the stupid JavaScript.
I'd definitely be wary of vibe coding anything that is internet-facing. But at the same time, there has to be some middle ground here too - a bit of productivity gain without any significant tangible downside. Even if that middle ground is just glorified autocomplete.
I see LLMs as a force multiplier. It's not going to write entire projects for me, but it'll assist with "how do i do x with y" kind of problems. At the end of the day I still understand the codebase and know where its faults lie.
LLMs are the ultimate leetcode solvers because they're stellar at small, atomic, functional bits of code that, importantly, have already been done and written about before.
But at this point in time they are terrible about reasoning about novel or overly complex problems.
It is deliciously ironic that LLMs can absolutely crush the software engineering interview process, but frankly it's no surprise to those of us that have been complaining that whiteboard coding interviews are a bullshit way to hire senior software engineers.
I look forward to watching the industry have to abandon its long-held preference for an interview metric that has no predictive capability for software engineering success.
I'm getting very tired of these 'either extreme' articles (produced, I expect, for clicks).
The reality is today:
- if you don't use AI coding at all you'll be left behind. No one writes HTML by hand anymore; AI just means one less thing done by hand
- doing "hey AI give me an app" absolutely is garbage, see the Tea app debacle
- relying on AI and not understanding what it's doing will absolutely degrade your skill and make a dev lazy; don't ask it for a bunch of web endpoints and hope they're scalable and secure
Now will AI be able to make an entire app perfectly one day? Who knows.
edit: formatting
AI will cause senior developers to become 10 times more effective. AI will cause junior developers to become 10 times less effective. And that's when taking into account the lost productivity of the senior developers who need to review their code.
Unfortunately for the writer, he will probably get fired because of AI. But not because AI will replace him - because seniors will.
> AI will cause senior developers to become 10 times more effective
Bold statement! In the real world, experienced developers were actually 19% slower when using LLMs in the only rigorous study to date.
https://metr.org/blog/2025-07-10-early-2025-ai-experienced-o...
Very brave to have an edgy opinion based on vibes that runs counter to the only actual study, though. Claiming that things are XYZ just because it feels that way is all the rage nowadays; good for being with the times.
> Unfortunately for the writer, he will probably get fired because of AI. But not because AI will replace him - because seniors will
Here is another prediction for you. In the current real world, LLMs create mountains of barely working slop on a clean project and slowly pollute it with trash, feature after feature. The LGTM senior developers will just keep merging and merging until the project becomes such a tangled mess that the LLM needs a billion tokens to fix it, or outright can't, and these so-called senior developers will have let their skills deteriorate to such an extent that they'd need to call the author of the article to save them from the mess they created with their fancy autocomplete.
Haha I encountered this
But maybe AI is just better than I ever was at front end and react
Maybe I should do something else
I am too stupid and old to code up to the standards of my younger days. AI allows me to get my youth back. I learned so many new things since Sonnet 4 came out in May. I doubted AI too until Sonnet 4 surprised me with my first AGI moment.
I tried digging with an excavator; I became lazy and fat.
It's a reasonable take from the author, but the argument that you shouldn't use a tool you don't understand cuts both ways. Avoiding powerful tools can be just as much of a trap as using them blindly.
Like any tool, there's a right and wrong time to use an LLM. The best approach is to use it to go faster at things you already understand, and as an aid to learn things you don't, but never blindly trust it. You still need to review the code carefully because you're ultimately responsible for it; your name is forever on it. You can't blame an LLM when your code took down production: you shipped it.
It's a double-edged sword: you can get things done faster, but it's easy to become over-reliant and lazy, and to overestimate your skills. That's how you get left behind.
The old advice has never been more relevant: "stay hungry."
I wonder if the author designed the website after they started coding with AI. Another Tonsky clone.