Comment by 63stack

19 hours ago

This reads like shilling/advertisement. Coding AIs struggle with anything remotely complex, make up crap and present it as research, write tests that are just "return true", and won't ever question a decision you make.

Those twenty engineers must not have produced much.

I think part of what is happening here is that different developers on HN have very different jobs and skill levels. If you are just writing a large volume of code over and over again to do the same sort of things, then LLMs probably could take your job. A lot of people have joined the industry over time, and the intelligence bar seems to have moved lower and lower, particularly for people churning out large volumes of boilerplate code. If you are doing relatively novel stuff, at least in the sense that your abstractions are novel and the shape of the abstraction set is different from the standard things that exist in tutorials etc. online, then the LLM will probably not work well with your style.

So some people are panicking and they are probably right, and some other people are rolling their eyes and they are probably right too. I think the real risk is that dumping out loads of boilerplate becomes so cheap and reliable that people who can actually fluently design coherent abstractions are no longer as needed. I am skeptical this will happen though, as there doesn’t seem to be a way around the problem of the giant indigestible hairball (i.e. as you accumulate more and more boilerplate, it becomes harder to remain coherent).

  • Indeed, discussions on LLMs for coding sound like what you would expect if you asked a room full of people to snatch up a 20 kg dumbbell once and then tell you if it's heavy.

    > I think the real risk is that dumping out loads of boilerplate becomes so cheap and reliable that people who can actually fluently design coherent abstractions are no longer as needed.

    Cough front-end cough web cough development. Admittedly, original patterns can still be invented, but many (most?) of us don't need that level of creativity in our projects.

  • That’s a very good point I hadn’t heard explained that way before. Makes a lot of sense and explains a lot of the circular debates about AI that happen here daily.

  • > If you are just writing a large volume of code over and over again

    But why would you do that? Wouldn't you eventually have your own library of code that you sell again and again with little tweaks? Same money for far less work.

    • People, at least novice developers, tend to prefer quick boilerplate that makes them look effective over spending an hour just thinking and designing, then implementing some simple abstraction. This is true today, and has been true for as long as I've been in programming.

      Besides, not all programming work can be abstracted into a library and reused across projects, not because it's technically infeasible, but because the client doesn't want to, can't for legal reasons, or the development process at the client's organization simply doesn't support that workflow. Those are just the reasons off the top of my head that I've encountered before, and I'm sure there are more.

  • > different developers on HN have very different jobs and skill levels.

    Definitely this. When I use AIs for web development they do an ok job most of the time. Definitely on par with a junior dev.

    For anything outside of that they're still pretty bad. Not useless by any stretch, but it's still a fantasy to think you could replace even a good junior dev with AI in most domains.

    I am slightly worried for my job... but only because AI will keep improving and there is a chance it will be as good as me one day. Today it's not a threat at all.

    • Yea, LLMs produce results on par with what I would expect out of a solid junior developer. They take direction, their models act as the “do the research” part, and they output lots of code: code that has to be carefully scrutinized and refined. They are like very ambitious interns who never get tired and want to please, but often just produce crap that has to be totally redone or refactored heavily in order to go into production.

      If you think LLMs are “better programmers than you,” well, I have some disappointing news for you that might take you a while to accept.

      7 replies →

  • Absolutely this, and TFA touches on the point about natural language being insufficiently precise:

    AI can write you an entire CRUD app in minutes, and with some back-and-forth you can have an actually-good CRUD app in a few hours.

    But AI is not very good (anecdotally, based on my experience) at writing fintech-type code. It's also not very good at writing intricate security stuff like heap overflows. I've never tried, but would certainly never trust it to write cryptography correctly, based on my experience with the latter two topics.

    All of the above is "coding", but AI is only good at a subset of it.

    • Generating CRUD is like curing cancer in mice: we already have a dizzying array of effective solutions… Ruby on Rails, Access 97, model-first ORMs with GUI mappers. SharePoint lets anyone do all the things easily.

      The issue is and always has been maintenance and evolution. Early missteps cause limitations, customer volume creates momentum, and suddenly real engineering is needed.

      I’d be a lot more worried about our jobs if these systems were explaining to people how to solve all their problems with a little Emacs scripting. As it is, they’re like hyper-aggressive tech salespeople, happy just to see entanglements, not thinking about the whole business cycle.

      1 reply →

    • > and with some back-and-forth you can have an actually-good CRUD app in a few hours

      Perhaps the debate is on what constitutes "actually-good". Depends where the bar is I suppose.

  • >at least in the sense that your abstractions are novel and the shape of the abstraction set is different from the standard things that exist

    People shouldn't be doing this in the first place. Existing abstractions are sufficient for building any software you want.

    • > Existing abstractions are sufficient for building any software you want.

      Software that doesn't need new abstractions also already exists. Everything you would need already exists and can be bought much more cheaply than you could build it yourself. Accounting software exists, Unreal Engine exists and many games use it, so why would you ever write something new?

      1 reply →

    • I'm supposing that nobody who has a job is producing abstractions that are always novel, but there may be people who find abstractions that are novel for their particular field, because most people in that field are not familiar with them, or who (infrequently) come up with novel abstractions that improve on existing ones.

    • The new abstraction is “this corporation owns this IP and has engineers who can fix and extend it at will”. You can’t git clone that.

      But if there is something off the shelf that you can use for the task at hand? Great! The stakeholders want it to do these other 3000 things before next summer.

No, it doesn’t read like shilling or advertisement. It’s tiring hearing people continually dismiss coding agents as if they have not massively improved and are not driving real value despite their limitations, when they are only just getting started. I’ve done things with Claude I never thought possible for myself, and I’ve done things where Claude made the whole effort take twice as long and 3x more of my time. It’s not that people are ignoring the limitations; it’s that people can see how powerful they already are and how much more headroom there is even with existing paradigms, not to mention the compute scaling happening in ’26-’27 and the idea pipeline from the massive hoarding of talent.

  • When prices go down or product velocity goes up, we'll start believing in the new 20x developer. Until then, it doesn't align with most experiences and just reads like fiction.

    You'll notice no one ever seems to talk about the products they're making 20x faster or cheaper.

  • The paradigm shift hit the world like a wall. I know entire teams where the manager thinks AI is bullshit and the entire team is not allowed to use AI.

    I love coding. But reality is reality and these fools just aren’t keeping pace with how fast the world is changing.

    • Or we're in another hype cycle and billions of dollars are being pumped in to sustain the current bubble with a lot of promises about how fast the world is changing. Doesn't mean AI can't be a useful tool.

  • > I’ve done things with Claude I never thought possible for myself to do,

    That's the point, champ. They seem great to people when they apply them to some domain they are not competent in; that's because they cannot evaluate the issues. So you've never programmed but can now scaffold a React application and basic backend in a couple of hours? Good for you, but for the love of god have someone more experienced check it before you push into production. Once you apply them to any area where you have at least moderate competence, you will see all sorts of issues that you just cannot unsee. Security and performance are often problems, not to mention the quality of the code...

    • > So you've never programmed but can now scaffold a React application and basic backend in a couple of hours?

      Ahaha, weren’t you the guy who wrote an opus about planes? Is this your baseline for “stuff where LLMs break and real engineering comes into the room”? There’s a harsh wake-up call for you around the corner.

      5 replies →

    • What you wrote here was relevant about 9 months ago. It’s now outdated. The pace of improvement of AI can only be described as violent. It is so fast that there are many people like you who don’t get it.

      20 replies →

    • Seems fine, works, is better than if you had me go off and write it on my own. You realize you can check the results? You can use Claude to help you understand the changes as you read through them? I just don’t get this weird “it makes mistakes and it’s horrible if you understand the domain that it is generating over” attitude. I mean, yes, definitely sometimes, and definitely not other times. What happens if I DON’T have someone more experienced to consult, or they ignore me because they are busy, or they are wrong because they are also imperfect and not focused? It’s really hard to be convinced that this point of view is not just some knee-jerk reaction justified post hoc.

      6 replies →

    • This is remarkably dismissive and comes across as arrogant. In reality, they help many people with expert skills get things done in their areas of competence, without getting bogged down in tedium.

      They need a heavy hand policing them to make sure they do the right thing. Garbage in, garbage out.

      The smarter the hand of the person driving them, the better the output. You see a problem, you correct it. Or make them correct it. The stronger the foundation they're starting from, the better the production.

      It's basically the opposite of what you're asserting here.

This is completely wrong. Codex 5.2 and Claude Sonnet 4.5 don't have any of these issues. They will regularly tell you that you're wrong if you bother to ask, and they will explain why and what a better solution is. They don't make anything up. The code they produce is noticeably more efficient in LoC than that of previous models. And yes, they really will do research: they will search the internet for docs and articles as needed and cite their references inline with their answers.

You talk as if you haven't used an LLM since 2024. It's now almost 2026 and things have changed a lot.

  • With apologies, and not GP, but this has been the same feedback I've personally seen on every single model release.

    Whenever I discuss the problems that my peers and I have using these things, it's always something along the lines of "but model X.Y solves all that!", so I obediently try again, waste a huge amount of time, and come back to the conclusion that these things aren't great at generation, but they are fantastic at summarization and classification.

    When I use them for those tasks, they have real value. For creation? Not so much.

    I've stopped getting excited about the "but model X.Y!!" thing. Maybe they are improving? I just personally haven't seen it.

    But according to the AI hypers, just like with every other tech hype that's died over the past 30 years, "I must just be doing it wrong".

    • A lot of people are consistently having their low expectations disproven when it comes to progress in AI tooling. If you read back in my comment history, six months ago I was posting about how AI is overhyped BS. But I kept using it, and eventually new releases of models and tools solved most of the problems I had with them. If it has not happened for you yet, then I expect it eventually will. Keep using the tools and models, follow their advancements, and I think you'll eventually get to the point where your needs are met.

I would say that while LLMs do sometimes improve productivity, I flatly cannot believe a claim (at least without direct demonstration or evidence) that one person is doing the work of 20 with them in December 2025.

I mean, from the off, people were claiming 10x, probably because it's a nice round number, but those claims quickly fell out of the mainstream as people realised it's just not that big a multiplier in practice in the real world.

I don't think we're seeing this in the market, anywhere. One engineer doing the job of 20 means whole departments at mid-sized companies compressing to one person. Think about that; it has implications for all the additional management staff on top of those 20 engineers too.

It'd either be a complete restructure and rethink of the way software orgs work, or we'd be seeing incredible, crazy deltas in the output of software companies this year, of the kind that would be impossible not to notice.

This is just plainly not happening. Look, if it happens, it happens: '26, '27, '28 or '38. It'll be a cool and interesting new world if it does. But it's just... not happened or happening in '25.

  • It's entirely dependent on the type of code being written. For verbose, straightforward code with clear-cut test scenarios, one agent running 24/7 can easily do the work of 20 FT engineers. This is a best-case scenario.

    Your productivity boost will depend entirely on a combination of how much you can remove yourself from the loop (basically, the cost of validation per turn) and how amenable the task/your code is to agents (which determines your P(success)).

    Low P(success) isn't a problem if there's no engineer-time cost to validation: the agent can just grind the problem out in the background. And obviously, if P(success) is high, the cost of validation isn't a big deal. The productivity killer is when P(success) is low and the cost of validation is high; these circumstances can push you into the red with agents very quickly.

    Thus the key to agents being a force multiplier is to focus on reducing validation costs, increasing P(success), and developing intuition for when to back off on pulling the slot machine in favor of more research (a toy cost model is sketched below). This is assuming you're speccing out what you're building so the agent doesn't make poor architectural/algorithmic choices that hamstring you down the line.
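
    Here is a toy expected-cost model of that tradeoff, in Python. It is only a sketch; every number and name below is an illustrative assumption, not data from this thread:

      # Expected engineer-minutes per completed task, assuming attempts are
      # independent, so expected attempts = 1 / p_success (the mean of a
      # geometric distribution). All figures are illustrative assumptions.
      def expected_cost_minutes(p_success: float, validation_minutes: float) -> float:
          """Engineer time spent validating attempts until one succeeds."""
          if not 0 < p_success <= 1:
              raise ValueError("p_success must be in (0, 1]")
          return (1 / p_success) * validation_minutes

      scenarios = [
          ("high P(success), cheap validation",  0.9,  5),   # agents shine
          ("low P(success), cheap validation",   0.2,  5),   # fine: grind in background
          ("high P(success), costly validation", 0.9, 60),   # tolerable
          ("low P(success), costly validation",  0.2, 60),   # the productivity killer
      ]
      for name, p, v in scenarios:
          print(f"{name:38} ~{expected_cost_minutes(p, v):4.0f} engineer-min/task")

    Under these assumed numbers, the low-P(success), costly-validation quadrant costs ~300 engineer-minutes per task: exactly the "in the red" case described above.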

    • > It's entirely dependent on the type of code being written. For verbose, straightforward code with clear cut test scenarios, one agent can easily 24/7 the work of 20 FT engineers. This is a best case scenario.

      So the "verbose, straightforward code with clear cut test scenarios" is already written by a human?

    • Respectfully, if I may offer constructive criticism, I’d hope this isn’t how you communicate to software developers, customers, prospects, or fellow entrepreneurs.

      To be direct, this reads like a fluff comment written by AI with an emphasis on probability and metrics. P(that) || that.

      I’ve written software used by everything from a local real estate company to the Mars Perseverance rover. AI is a phenomenally useful tool. But be wary of preposterous claims.

      1 reply →

  • I would say it varies from 0x to a modest 2x. It can help you write good code quickly, but I only spent about 20-30% of my time writing code anyway before AI. It definitely makes debugging and research tasks much easier as well. I would confidently say my job as a senior dev has gotten a lot easier and less stressful as a result of these tools.

    One other thing I have seen, however, is the 0x case: you give too much control to the LLM, it codes both you and itself into Pan’s Labyrinth, and you end up having to take a weed whacker to the whole project or start from scratch.

    • Ok, if you're a senior dev, have you 'caught' it yet?

      Ask it a question about something you know well, and it'll give you garbage code that it's obviously copied from an answer on SO from 10 years ago.

      When you ask it for research, it's still giving you garbage out of date information it copied from SO 10 years ago, you just don't know it's garbage.

      3 replies →

  • > I mean from the off, people were claiming 10x probably mostly because it's a nice round number,

    Purely anecdotal, but I've seen that level of productivity from the vibe tools we have in my workplace.

    The main issue is that 1 engineer needs to have the skills of those 20 engineers so they can see where the vibe coding has gone wrong. Without that it falls apart.

  • Could be that speed/efficiency was the wrong dimension to optimize for, and it's leading the industry down a bad path.

    An LLM helps most with surface area. It expands the breadth of possibilities a developer can operate on.

My experience is that you get out what you put in. If you have a well-defined foundation, AI can populate the stubs and get it 95% correct. Getting to that point can take a bit of thought, and AI can help with that, too, but if you lean on it too much, you'll get a mess.

And of course, getting to the point where you can write a good foundation has always been the bulk of the work. I don't see that changing anytime soon.

Ok, let's say the 20 devs claim is false [1]. What if it's 2? I'd still learn and use the tech. Wouldn't you?

[1] I actually think it might be true for certain kinds of jobs.

  • It's not 20 and it's not 2. It's not a person. It's a tool. It can make a person 100x more effective at certain specific things. It can make them 50% less effective at other things. I think, for most people and most things, it might be like a 25% performance boost, amortized over all (impactful) projects and time, but nobody can hope to quantify that with any degree of credibility yet.

  • Jevons paradox: more software will be produced, rather than fewer software engineers being employed.

I'd be willing to give you access to the experiment I mentioned in a separate reply (I have a GitHub repo), to show the output you can get for a complex app buildout.

Will admit it's not great (probably not even good), but it definitely has throughput despite my absolute lack of caring that much [0]. Once I get past a certain stage, I am thinking of doing an A/B test where I take an earlier commit and try again while paying more attention... (But I at least want to get to where there is a full suite of UOW cases before I do that, for comparison's sake.)

> Those twenty engineers must not have produced much.

I've been considered a 'very fast' engineer at most shops (e.g. at multiple shops, stories assigned to me would have a <1 multiplier for points [1])

20 is a bit bloated, unless we are talking WITCH tier. I definitely can get done in 2-3 hours what could take me a day. I say it that way because at best it's 1-2 hours but other times it's longer; some folks remember the 'best' rather than the median.

[0] - It started as 'prompt only', although after a certain point I did start being more aggressive with personal edits.

[1] - IDK why they did it that way instead of capacity, OTOH that saved me when it came to being assigned Manual Testing stories...

  • > Will admit It's not great (probably not even good) but it definitely has throughput

    Throughput without being good will just lead to more work down the line to correct the badness.

    It's like losing money on every sale but making up for it with volume.

  • > Will admit It's not great (probably not even good)

    You lost me here. Come back when you're proud of it.