Comment by mohsen1

1 day ago

I really really want this to be true. I want to be relevant. I don’t know what to do if all those predictions are true and there is no need (or very little need) for programmers anymore.

But something tells me “this time is different” is different this time for real.

Coding AIs design software better than me, review code better than me, find hard-to-find bugs better than me, plan long-running projects better than me, make decisions based on research, literature, and also the state of our projects better than me. I’m basically just the conductor of all those processes.

Oh, and don't ask about coding. If you use AI for tasks above, as a result you'll get very well defined coding task definitions which an AI would ace.

I’m still hired, but I feel like I’m doing the work of an entire org that used to need twenty engineers.

From where I’m standing, it’s scary.

More than any other effect they have, LLMs breed something called "learned helplessness". You just listed a few things it may stay better than you at, and a few things that it is not better than you at and never will be.

Planning long running projects and deciding are things only you can do well!! Humans manage costs. We look out for our future. We worry. We have excitement, and pride. It wants you to think none of these things matter of course, because it doesn't have them. It says plausible things at random, basically. It can't love, it can't care, it won't persist.

WHATEVER you do don't let it make you forget that it's a bag of words and you are something almost infinitely more capable, not in spite of human "flaws" like caring, but because of them :)

  • Yea I've been seeing very similar behavior from people. They think of themselves as static, unchanging, uncreative but view LLMs as some kind of unrelenting and inevitable innovative force...

    I think it's people's anxieties and fears about the uncertainty about the value of their own cognitive labor demoralizing them and making them doubt their own self-efficacy. Which I think is an understandable reaction in the face of trillion dollar companies frothing at the mouth to replace you with pale imitations.

    Best name I could think of calling this narrative / myth is people believing in "effortless AI": https://www.insidevoice.ai/p/effortless-ai

  • Plus I think I've almost never seen so little competition for what I think are the real prizes! Everyone's off making copies of copies of copies of the same crappy infrastructure we already have. They're busy building small inconsequential side projects so they can say they built something using an LLM.

    • > They're busy building small inconsequential side projects

      Unironically, sending a program to build those for me has saved me an almost endless amount of time. I'm a pretty distracted individual, and pretty anal about my workflow/environment, so lots of times I've spent hours going into rabbit-holes to make something better, when I could have just sucked it up and done it the manual way instead, even if it takes mental energy.

      Now, I can still do those things, but not spend hours, just a couple of minutes, and come back after 20-30 minutes to something that lets me avoid that stuff wholesale. Once you start stacking these things, it tends to save a lot of time and more importantly, mental energy.

      So the programs by themselves are basically "small inconsequential side projects" because they're not "production worthy and web scale SaaS ready to earn money", but they help me and others who are building those things in a big way.

This reads like shilling/advertisement. Coding AIs struggle with anything remotely complex, make up crap and present it as research, write tests that are just "return true", and won't ever question a decision you make.

Those twenty engineers must not have produced much.

  • I think part of what is happening here is that different developers on HN have very different jobs and skill levels. If you are just writing a large volume of code over and over again to do the same sort of things, then LLMs probably could take your job. A lot of people have joined the industry over time, and it seems like the intelligence bar moved lower and lower over time, particularly for people churning out large volumes of boilerplate code. If you are doing relatively novel stuff, at least in the sense that your abstractions are novel and the shape of the abstraction set is different from the standard things that exist in tutorials etc online, then the LLM will probably not work well with your style.

    So some people are panicking and they are probably right, and some other people are rolling their eyes and they are probably right too. I think the real risk is that dumping out loads of boilerplate becomes so cheap and reliable that people who can actually fluently design coherent abstractions are no longer as needed. I am skeptical this will happen though, as there doesn’t seem to be a way around the problem of the giant indigestible hairball (i.e., as you have more and more boilerplate it becomes harder to remain coherent).

    • Indeed, discussions on LLMs for coding sound like what you would expect if you asked a room full of people to snatch up a 20 kg dumbbell once and then tell you if it's heavy.

      > I think the real risk is that dumping out loads of boilerplate becomes so cheap and reliable that people who can actually fluently design coherent abstractions are no longer as needed.

      Cough front-end cough web cough development. Admittedly, original patterns can still be invented, but many (most?) of us don't need that level of creativity in our projects.

    • That’s a very good point I hadn’t heard explained that way before. Makes a lot of sense and explains a lot of the circular debates about AI that happen here daily.

    • > If you are just writing a large volume of code over and over again

      But why would you do that? Wouldn't you just have your own library of code eventually that you just sell and sell again with little tweaks? Same money for far less work.

    • > different developers on HN have very different jobs and skill levels.

      Definitely this. When I use AIs for web development they do an ok job most of the time. Definitely on par with a junior dev.

      For anything outside of that they're still pretty bad. Not useless by any stretch, but it's still a fantasy to think you could replace even a good junior dev with AI in most domains.

      I am slightly worried for my job... but only because AI will keep improving and there is a chance it will be as good as me one day. Today it's not a threat at all.

    • Absolutely this, and TFA touches on the point about natural language being insufficiently precise:

      AI can write you an entire CRUD app in minutes, and with some back-and-forth you can have an actually-good CRUD app in a few hours.

      But AI is not very good (anecdotally, based on my experience) at writing fintech-type code. It's also not very good at writing intricate security stuff like heap overflows. I've never tried, but would certainly never trust it to write cryptography correctly, based on my experience with the latter two topics.

      All of the above is "coding", but AI is only good at a subset of it.

    • >at least in the sense that your abstractions are novel and the shape of the abstraction set is different from the standard things that exist

      People shouldn't be doing this in the first place. Existing abstractions are sufficient for building any software you want.

  • No, it doesn’t read like shilling and advertisement. It’s tiring hearing people continually dismiss coding agents as if they have not massively improved and are not driving real value despite their limitations, and they are only just getting started. I’ve done things with Claude I never thought possible for myself to do, and I’ve done things where Claude made the whole effort take twice as long and 3x more of my time. It’s not like people are ignoring the limitations; it’s that people can see how powerful they already are and how much more headroom there is even with existing paradigms, not to mention the compute scaling happening in 26-27 and the idea pipeline from the massive hoarding of talent.

    • When prices go down or product velocity goes up we'll start believing in the new 20x developer. Until then, it doesn't align with most experiences and just reads like fiction.

      You'll notice no one ever seems to talk about the products they're making 20x faster or cheaper.

    • The paradigm shift hit the world like a wall. I know entire teams where the manager thinks AI is bullshit and the entire team is not allowed to use AI.

      I love coding. But reality is reality and these fools just aren’t keeping pace with how fast the world is changing.

    • > I’ve done things with Claude I never thought possible for myself to do,

      That's the point champ. They seem great to people when they apply them to some domain they are not competent in; that's because they cannot evaluate the issues. So you've never programmed but can now scaffold a React application and basic backend in a couple of hours? Good for you, but for the love of god have someone more experienced check it before you push into production. Once you apply them to any area where you have at least moderate competence, you will see all sorts of issues that you just cannot unsee. Security and performance are often issues, not to mention the quality of code....

  • While LLMs do sometimes improve productivity, I flatly cannot believe a claim (at least without direct demonstration or evidence) that one person is doing the work of 20 with them, at least not in December 2025.

    I mean from the off, people were claiming 10x probably mostly because it's a nice round number, but those claims quickly fell out of the mainstream as people realised it's just not that big a multiplier in practice in the real world.

    I don't think we're seeing this in the market, anywhere. Something like 1 engineer doing the job of 20, what you're talking about is basically whole departments at mid sized companies compressing to one person. Think about that, that has implications for all the additional management staff on top of the 20 engineers too.

    It'd either be a complete restructure and rethink of the way software orgs work, or we'd be seeing just incredible, crazy deltas in output of software companies this year of the type that couldn't be ignored, they'd be impossible to not notice.

    This is just plainly not happening. Look, if it happens, it happens, 26, 27, 28 or 38. It'll be a cool and interesting new world if it does. But it's just... not happened or happening in 25.

    • It's entirely dependent on the type of code being written. For verbose, straightforward code with clear cut test scenarios, one agent can easily 24/7 the work of 20 FT engineers. This is a best case scenario.

      Your productivity boost will depend entirely on a combination of how much you can remove yourself from the loop (basically, the cost of validation per turn) and how amenable the task/your code is to agents (which determines your P(success)).

      Low P(success) isn't a problem if there's no engineer time cost to validation, the agent can just grind the problem out in the background, and obviously if P(success) is high the cost of validation isn't a big deal. The productivity killer is when P(success) is low and the cost of validation is high, these circumstances can push you into the red with agents very quickly.

      Thus the key to agents being a force multiplier is to focus on reducing validation costs, increasing P(success) and developing intuition relating to when to back off on pulling the slot machine in favor of more research. This is assuming you're speccing out what you're building so the agent doesn't make poor architectural/algorithmic choices that hamstring you down the line.
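
      The trade-off above can be put into rough numbers. This is only a sketch under a simplifying assumption that is not in the comment: every agent attempt succeeds independently with the same P(success), so the expected number of attempts is 1 / P(success), and the engineer pays the validation cost once per attempt. The function names and the example figures are hypothetical.

```python
def expected_engineer_minutes(p_success: float, validation_minutes: float) -> float:
    """Expected engineer time spent until the first accepted agent result.

    Assumption (illustrative only): attempts are independent with a fixed
    success probability, so the number of attempts is geometric with mean
    1 / p_success, and each attempt costs validation_minutes of human time.
    """
    if not 0.0 < p_success <= 1.0:
        raise ValueError("p_success must be in (0, 1]")
    return validation_minutes / p_success


def agent_is_worth_it(p_success: float, validation_minutes: float,
                      manual_minutes: float) -> bool:
    """True when the expected validation time is below doing the task by hand."""
    return expected_engineer_minutes(p_success, validation_minutes) < manual_minutes


if __name__ == "__main__":
    # Hypothetical scenarios: (P(success), validation minutes per attempt, manual minutes)
    scenarios = {
        "low P, free validation": (0.1, 0.0, 120.0),
        "high P, costly validation": (0.9, 30.0, 120.0),
        "low P, costly validation": (0.2, 45.0, 120.0),
    }
    for name, (p, v, m) in scenarios.items():
        cost = expected_engineer_minutes(p, v)
        print(f"{name}: expected {cost:.1f} min vs {m:.0f} min manual, "
              f"worth it: {agent_is_worth_it(p, v, m)}")
```

      Under this toy model only the third scenario, low P(success) combined with a high validation cost, comes out worse than doing the work manually, which matches the "productivity killer" case described above. It ignores agent runtime, token cost, and the chance that repeated attempts are correlated.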

    • I would say it varies from 0x to a modest 2x. It can help you write good code quickly, but, I only spent about 20-30% of my time writing code anyway before AI. It definitely makes debugging and research tasks much easier as well. I would confidently say my job as a senior dev has gotten a lot easier and less stressful as a result of these tools.

      One other thing I have seen however is the 0x case, where you have given too much control to the LLM, it codes both you and itself into Pan’s Labyrinth, and you end up having to take a weed whacker to the whole project or start from scratch.

    • > I mean from the off, people were claiming 10x probably mostly because it's a nice round number,

      Purely anecdotal, but I've seen that level of productivity from the vibe tools we have in my workplace.

      The main issue is that 1 engineer needs to have the skills of those 20 engineers so they can see where the vibe coding has gone wrong. Without that it falls apart.

    • Could be speed/efficiency was the wrong dimension to optimize for and it's leading the industry down a bad path.

      An LLM helps most with surface area. It expands the breadth of possibilities a developer can operate on.

  • This is completely wrong. Codex 5.2 and Claude Sonnet 4.5 don't have any of these issues. They will regularly tell you that you're wrong if you bother to ask them and they will explain why and what a better solution is. They don't make up anything. The code they produce is noticeably more efficient in LoC than previous models. And yes they really will do research, they will search the Internet for docs and articles as needed and cite their references inline with their answers.

    You talk as if you haven't used an LLM since 2024. It's now almost 2026 and things have changed a lot.

    • With apologies, and not GP, but this has been the same feedback I've personally seen on every single model release.

      Whenever I discuss the problems that my peers and I have using these things, it's always something along the lines of "but model X.Y solves all that!", so I obediently try again, waste a huge amount of time, and come back to the conclusion that these things aren't great at generation, but they are fantastic at summarization and classification.

      When I use them for those tasks, they have real value. For creation? Not so much.

      I've stopped getting excited about the "but model X.Y!!" thing. Maybe they are improving? I just personally haven't seen it.

      But according to the AI hypers, just like with every other tech hype that's died over the past 30 years, "I must just be doing it wrong".

  • My experience is that you get out what you put in. If you have a well-defined foundation, AI can populate the stubs and get it 95% correct. Getting to that point can take a bit of thought, and AI can help with that, too, but if you lean on it too much, you'll get a mess.

    And of course, getting to the point where you can write a good foundation has always been the bulk of the work. I don't see that changing anytime soon.

  • Ok, let's say the 20 devs claim is false [1]. What if it's 2? I'd still learn and use the tech. Wouldn't you?

    [1] I actually think it might be true for certain kinds of jobs.

    • It's not 20 and it's not 2. It's not a person. It's a tool. It can make a person 100x more effective at certain specific things. It can make them 50% less effective at other things. I think, for most people and most things, it might be like a 25% performance boost, amortized over all (impactful) projects and time, but nobody can hope to quantify that with any degree of credibility yet.

    • Jevons paradox: more software will be produced, rather than fewer software engineers being employed.

  • I'd be willing to give you access to the experiment I mentioned in a separate reply (have a github repo), as far as the output that you can get for a complex app buildout.

    Will admit It's not great (probably not even good) but it definitely has throughput despite my absolute lack of caring that much [0]. Once I get past a certain stage I am thinking of doing an A-B test where I take an earlier commit and try again while paying more attention... (But I at least want to get where there is a full suite of UOW cases before I do that, for comparison's sake.)

    > Those twenty engineers must not have produced much.

    I've been considered a 'very fast' engineer at most shops (e.g. at multiple shops, stories assigned to me would have a <1 multiplier for points[1])

    20 is a bit bloated, unless we are talking about WITCH tier. I definitely can get done in 2-3 hours what could take me a day. I say it that way because at best it's 1-2 hours but other times it's longer, some folks remember the 'best' rather than median.

    [0] - It started as 'prompt only', although after a certain point I did start being more aggressive with personal edits.

    [1] - IDK why they did it that way instead of capacity, OTOH that saved me when it came to being assigned Manual Testing stories...

    • > Will admit It's not great (probably not even good) but it definitely has throughput

      Throughput without being good will just lead to more work down the line to correct the badness.

      It's like losing money on every sale but making up for it with volume.

    • > Will admit It's not great (probably not even good)

      You lost me here. Come back when you're proud of it.

> Coding AIs design software better than me, review code better than me, find hard-to-find bugs better than me, plan long-running projects better than me, make decisions based on research, literature, and also the state of our projects better than me.

That is just not true, assuming you have a modicum of competence (which I assume you do). AIs suck at all these tasks; they are not even as good as an inexperienced human.

  • For all we know, you both could be comparing a Nokia 3310 and a workstation PC based on the hardware, but you both just say "this computer is better than that computer".

    There are a ton of models out there, run in a ton of different ways, that can be used in different ways with different harnesses, and people use different workflows. There are just so many variables involved that I don't think it's either fair or accurate for anyone to claim "This is obviously better" or "This is obviously impossible".

    I've been in situations where I hit my head against some hard to find bug for days, then I put "AI" (but what? No one knows) to it and it solves it in 20 minutes. I've also asked "AI" to do trivial work that it still somehow fucked up, even if I could probably have asked a non-programmer friend to do it and they'd be able to.

    The variance is great, and the fact that system/developer/user prompts matter a lot for the responses you get makes it even harder to fairly compare things like this without having the actual chat logs in front of you.

    • > The variance is great

      this strikes me as a very important thing to reflect on. when the automobile was invented, was the apparent benefit so incredibly variable?

  • LLMs generate the most likely code given the problem they're presented and everything they've been trained on, they don't actually understand how (or even if) it works. I only ever get away with that when I'm writing a parser.

    • > they don't actually understand how

      but if it empirically works, does it matter if the "intelligence" doesn't "understand" it?

      Does a chess engine "understand" the moves it makes?

  • Depends on how he defined "better". If he uses the word "better" to mean "good enough to not fail immediately, and done in 1/10th of the time", then he's correct.

I was a chef in Michelin-starred restaurants for 11 years. One of my favorite positions was washing dishes. The goal was always to keep the machine running on its 5-minute cycle. It was about getting the dishes into racks, rinsing them, and having them ready and waiting for the previous cycle to end—so you could push them into the machine immediately—then getting them dried and put away after the cycle, making sure the quality was there and no spot was missed. If the machine stopped, the goal was to get another batch into it, putting everything else on hold. Keeping the machine running was the only way to prevent dishes from piling up, which would end with the towers falling over and breaking plates. This work requires moving lightning fast with dexterity.

AI coding agents are analogous to the machine. My job is to get the prompts written, and to do quality control and housekeeping after it runs a cycle. Nonetheless, like all automation, humans are still needed... for now.

  • If it requires an expert engineer/dishwasher to keep the flow running perfectly, the human is the bottleneck in the process. This sounds a lot more like the past before AI to me. What AI does is just give you enough dishes that they don’t need to be washed at all during dinner service. Just let them pile up dirty, or throw them away and get new dishes tomorrow; they are so cheap to replace that washing them doesn’t always make sense. But if for some reason you do want to reuse them, then it washes and dries them for you too. You just look over things at the end and make sure they pass your quality standards. If they left some muck on a plate or lipstick on a cup, just tell it not to let that happen again and it won’t. So even your QC work gets easier over time. The labor needed to deal with dirty dishes is drastically reduced in any case.

  • Humans are still needed, but they just got down-skilled.

    • > got down-skilled.

      who's to say that it's a down?

      Orchestrating and doing higher level strategic planning, such that the sub-tasks can be AI produced, is a skill that might be higher than programming.

  • > humans are still needed... for now

    "AI" doesn't have a clue what to do on its own. Humans will always be in the loop, because they have goals, while the AI is designed to placate and not create.

    The amount of "AI" garbage I have to sift through to find one single gem is about the same or more work than if I had just coded it myself. Add to that the frustration of dealing with a compulsive liar, and it's just a fucking awful experience for anyone that actually can code.

> I’m basically just the conductor of all those processes.

a car moves faster than you, can last longer than you, and can carry much more than you. But somehow, people don't seem to be scared of cars displacing them(yet)? Perhaps autodriving would in the near future, but there still needs to be someone making decisions on how best to utilize that car - surely, it isn't deciding to go to destination A without someone telling them.

> I feel like I’m doing the work of an entire org that used to need twenty engineers.

and this is great. A combine harvester does the work of what used to be an entire village for a week in a day. More output for less people/resources expended means more wealth produced.

  • > a car moves faster than you, can last longer than you, and can carry much more than you. But somehow, people don't seem to be scared of cars displacing them(yet)?

    People whose lives were based around using horses for transportation were very scared of cars replacing them though, and correctly so, because horses for transportation is something people do for leisure today, not necessity. I feel like that's a more apt analogy than comparing cars to any human.

    > More output for less people/resources expended means more wealth produced.

    This is true, but it probably also means that this "more wealth produced" will be more concentrated, because it's easier to convince one person using AI that you should have half of the wealth they produce, rather than convincing 100 people you should have half of what they produce. From where I'm standing, it seems to have the same effects (but not as widespread or impactful, yet) as industrialization, that induced that side-effect as well.

    • Analogies are not going to work. But it's just as likely that, in the worst case, we are stage coach drivers who have to use cars when we just really love the quiet slowness of horses.

  • And parent is scared of being made redundant by AI because they need their job to pay for their car, insurance, gas and repairs.

  • > a car moves faster than you, can last longer than you, and can carry much more than you. But somehow, people don't seem to be scared of cars displacing them(yet)?

    ???

    Cars replaced horses, not people.

    In this scenario you are the horse.

>I really really want this to be true. I want to be relevant

Think of yourself as a chef and LLMs as ready to eat meals or a recipe app. Can ready to eat meals OR recipe apps put a chef out of business?

I think I've been using AI wrong. I can't understand testimonies like this. Most times I try to use AI for a task, it is a shitshow, and I have to rewrite everything anyway.

  • I don’t know about right/wrong. You need to use the tools that make you productive. I personally find that in my work there are dozens of little scripts or helper functions that accelerate my work. However I usually don’t write them because I don’t have the time. AI can generate these little scripts very consistently. That accelerates my work. Perhaps just start simple.

    • Instead of generating, exporting or copy pasting just seems more reliable to me and also takes very little time.

      I think what matters most is just what you're working on. It's great for crud or working with public APIs with lots of examples.

      For everything else, AI has been a net loss for me.

  • Do you tell AI the patterns/tools/architecture you want? Telling agents to "build me XYZ, make it gud!" is likely to precede a mess, telling it to build a modular monolith using your library/tool list, your preferred folder structure, other patterns/algorithms you use, etc will end you up with something that might have some minor style issues or not be perfectly canonical, but will be approximately correct within a reasonable margin, or is within 1-2 turns of being so.

    You have to let go of the code looking exactly a certain way, but having code _work_ a certain way at a coarse level is doable and fairly easy.

    • Honestly, even this isn't really true anymore. With Opus 4.5 and 5.2 Codex in tools like Cursor, Claude Code, or Codex CLI, "just do the thing" is a viable strategy for a shockingly large category of tasks.

    • >You have to let go of the code looking exactly a certain way, but having code _work_ a certain way at a coarse level is doable and fairly easy.

      So all that bullshit about "code smells" was nonsense.

  • have you tried using $NEWEST_MODEL ?

    • It’s because depending on the person the newest model crossed the line into being useful for them personally. It’s not like a new version crosses the line for everyone. It happens gradually. Each version more and more people come into the fold.

      For me Claude code changed the game.

  • how much time/effort have you put in to educate yourself about how they work, what they excel at, what they suck at, what is your responsibility when you use them…? this effort is directly proportional to how well they will serve you

They do all those things you've mentioned more efficiently than most of us, but they fall woefully short as soon as novelty is required. Creativity is not in their repertoire. So if you're banging out the same type of thing over and over again, yes, they will make that work light and then scarce. But if you need to create something niche, something one-off, something new, they'll slip off the bleeding edge into the comfortable valley of the familiar at every step.

I choose to look at it as an opportunity to spend more time on the interesting problems, and work at a higher level. We used to worry about pointers and memory allocation. Now we will worry less and less about how the code is written and more about the result it built.

  • Take food for example. We don't eat food made by computers even though they're capable of making it from start to finish.

    Sure we eat carrots probably assisted by machines, but we are not eating dishes like protein bars all day every day.

    Our food is still better enjoyed when made by a chef.

    Software engineering will be the same. No one will want to use software made by a machine all day every day. There are differences in the execution and implementation.

    No one will want to read books entirely dreamed up by AI. Subtle parts of the books make us feel something only a human could have put right there right then.

    No one will want to see movies entirely made by AI.

    The list goes on.

    But you might say "software is different". Yes but no, in the abundance of choice, when there will be a ton of choice for a type of software due to the productivity increase, choice will become more prominent and the human driven software will win.

    Even today we pick the best terminal emulation software because we notice the difference between exquisitely crafted and bloated cruft.

    • You should look at other engineering disciplines. How many highway overpasses have unique “chef quality” designs? Very few. Most engineering is commodity replications of existing designs. The exact same thing applies to software engineering. Most of us engineers are replicating designs that came earlier. LLMs are good at generating the rote designs that make up the bulk of software by volume. Who benefits from an artisanal REST interface? The best practices were codified over a decade ago.

    • Is your argument that we only want things that are hand-crafted by humans?

      There are lots of things like perfectly machined nails, tools, etc. that are much better done by machines. Why couldn't software be one of those?

  • > So if you're banging out the same type of thing over and over again, yes, they will make that work light and then scarce.

    The same thing over and over again should be a SaaS, some internal tool, or a plugin. Computers are good at doing the same thing over and over again and that's what we've been using them for

    > But if you need to create something niche, something one-off, something new, they'll slip off the bleeding edge into the comfortable valley of the familiar at every step.

    Even if the high level description of a task may be similar to another, there's always something different in the implementation. A sports car and a sedan have roughly the same components, but they're not engineered the same.

    > We used to worry about pointers and memory allocation.

    Some still do. Not every case gives you a system that handles allocations and a garbage collector. And even in those, you will see memory leaks.

    > Now we will worry less and less about how the code is written and more about the result it built.

    Wasn't that Dreamweaver?

  • I think your image of LLMs is a bit outdated. Claude Code with well-configured agents will get entirely novel stuff done pretty well, and that’s only going to get better over time.

    I wouldn’t want to bet my career on that anyway.

It's definitely scary in a way.

However I'm still finding a trend even in my org; better non-AI developers tend to be better at using AI to develop.

AI still forgets requirements.

I'm currently running an experiment where I try to get a design and then execute on an enterprise 'SAAS-replacement' application [0].

AI can spit forth a completely convincing looking overall project plan [1] that has gaps if anyone, even the AI itself, tries to execute on the plan; this is where a proper, experienced developer can step in at the right steps to help out.

IDK if that's the right way to venture into the brave new world, but I am at least doing my best to be at the forefront of how my org is using the tech.

[0] - I figured it was a good exercise for testing limits of both my skills prompting and the AI's capability. I do not expect success.

  • AI does not forget requirements when you use a spec-driven AI tool like Kiro.

My experience with these tools is nowhere close to this.

If you're really able to do the work of a 20 man org on your own, start a business.

As of today NONE of the known AI codebots can solve correctly ANY of the 50+ programming exercises we use to interview fresh grads or summer interns. NONE! Not even level 1 problems that can be solved in fewer than 20 lines of code with a bit of middle school math.

  • After 25+ years in this field, having interviewed ~100 people for both my startup and other companies, I'm having a hard time believing this. You're either in an extremely niche field (such as to make your statement irrelevant to 99.9% of the industry), or it's hyperbole, or straight up bs.

    Interviewing is an art, and IME "gotcha" types of questions never work. You want to search for real-world capabilities, and like it or not the questions need to match those expectations. If you're hiring summer interns and the SotA models can't solve those questions, then you're doing something wrong. Sorry, but having used these tools for the past three years, this is extremely hard to believe.

    I of course understand if you can't, but sharing even one of those questions would be nice.

  • I promise you that I can show you how to reliably solve any of them using any of the latest OpenAI models. Email me if you want proof; josh.d.griffith at gmail

    • I'd watch that show, ideally with a few base rules though, e.g.

      - the problems to solve must NOT be part of the training set

      - the person using the tool (e.g. OpenAI, Claude, DevStral, DeepSeek, etc) must NOT be able to solve problems alone

      as I believe otherwise the 1st is "just" search and the 2nd is basically offloading the actual problem solving to the user.

That's kind of the point of the article, though.

Sure LLMs can churn out code, and they sort of work for developers who already understand code and design, but what happens when that junior dev with no hard experience builds their years of experience with LLMs?

Over time those who actually understand what the LLMs are doing and how to correct the output are replaced by developers who've never learned the hard lessons of writing code line by line. The ability to reason about code gets lost.

This points to the hard problem that the article highlights. The hard problem of software is actually knowing how to write it, which usually takes years, sometimes up to a decade of real experience.

Any idiot can churn out code that doesn't work. But working, effective software takes a lot of skill that LLMs will be stripping people of. Leaving a market there for people who have actually put the time in and understand software.

I am sorry to say you are not a good programmer.

I mean, AIs can drop something fast the same way you cannot beat a computer at adding or multiplying.

After that, you find mistakes, false positives, code that does not fully work, and the worst part is the last one: code that does not fully work and that, as a consequence, you do NOT yet understand.

That is where your time shrinks: now you need to review it.

Also, they do not design systems better. Maybe partial pieces. Give them something complex and they will hallucinate worse solutions than what you already know if you have, let us say, over 10 years of experience programming in a language (or maybe 5).

Now multiply this unreliability problem as the code you "AI-generate" grows.

Now you have a system that you do not know is reliable and that you do not understand well enough to modify. Congrats...

I use AI moderately for the tasks it is good at: generate some scripts, give me this small typical function and I review it.

Review my code: I will discard part of your mistakes and hallucinations as a person who knows the language well, and will find maybe a few valuable things.

Also, when having them review my code for problems, I saw that the LLMs really seem to need to hallucinate errors that do not exist to justify their help. This is just something LLMs seem not to be accurate at.

Also, when problems get a bit more atypical or go past a certain level of difficulty, they get much more unreliable.

All in all: you are going to need humans. I do not know how many, and I do not know how much the models will improve. I just know that they are not reliable, and this "generate fast but unreliable vs. now I do not know the codebase" trade-off is a fundamental obstacle that I think is very difficult, if not impossible, to work around.

This is not how I think about it. Me and the coding assistant together are better than me or the coding assistant separately.

For me it's not about me or the coding assistant, it's me and the coding assistant. But I'm also not a professional coder; I don't identify as a coder. I've been fiddling with programming my whole life but never had it as a title. I've worked more from the product side or the stakeholder side, but always got more involved, as I could speak with the dev team.

This also makes it natural for me to work side-by-side with the coding assistant, compared maybe to pure coders, who are used to keeping the coding side to themselves.

>> Coding AIs design software better than me

Absolutely flat out not true.

I'm extremely pro-faster-keyboard, i use the faster keyboards at almost every opportunity i can, i've been amazed by debugging skills (in fairness, i've also been very disappointed many times), i've been bowled over by my faster keyboard's ability to whip out HTML UIs in record time, i've been genuinely impressed by my faster keyboard's ability to flag flaws in PRs i'm reviewing.

All this to say, i see lots of value in faster keyboards, but add all the prompts, skills and hooks you like, explain in as much detail as you like about modularisation, and still "agents" cannot design software as well as a human.

Whatever the underlying mechanism of an LLM (to call it a next token predictor is dismissively underselling its capabilities) it does not have a mechanism to decompose a problem into independently solvable pieces. While that remains true, and i've seen zero precursor of a coming change here - the state of the art today is equiv to having the agent employ a todo list - while this remains true, LLMs cannot design better than humans.

There are many simple CRUD line of business apps where they design well enough (well more accurately stated, the problem is small/simple enough) that it doesn't matter about this lack of design skill in LLMs or agents. But don't confuse that for being able to design software in the more general use case.

  • Exactly, for the kind of thing that has been done on GitHub 10000x times over, LLMs are pretty awesome and they speed up your job significantly (it's arguable whether you would be better off using some already-built abstraction if that's the case).

    But try to do something novel and... they become nearly useless. Not like anything particularly difficult, just something that's so niche it's never been done before. It will most likely hallucinate some methods and call it a day.

    As a personal anecdote, I was doing some LTspice simulations and tried to get Claude Sonnet to write a plot expression to convert reactance to apparent capacitance in an AC sweep. It hallucinated pretty much the entire thing and got the equation wrong (it assumed the source was unit intensity, while LTspice models AC circuits with unit voltage. This surely is on the internet, but apparently has never been written alongside the need to convert an impedance to capacitance!).
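For reference on the LTspice anecdote above, the conversion itself is short. This is a minimal Python sketch of the underlying arithmetic, not an LTspice plot expression: it assumes you already have the complex node voltage and branch current at one sweep frequency, and it treats the measured reactance as if it came from an ideal capacitor, so C = -1 / (2·π·f·X). The function names are made up for illustration, and the sign convention of the simulator's branch current is an assumption you would need to check.

```python
import math


def impedance(voltage: complex, current: complex) -> complex:
    """Complex impedance Z = V / I at a single sweep frequency."""
    if current == 0:
        raise ValueError("current must be non-zero")
    return voltage / current


def apparent_capacitance(z: complex, freq_hz: float) -> float:
    """Capacitance of the ideal capacitor whose reactance equals Im(z).

    An ideal capacitor has X = -1 / (2*pi*f*C), hence C = -1 / (2*pi*f*X).
    A positive result means the load looks capacitive at this frequency;
    a negative result means it looks inductive.
    """
    reactance = z.imag
    if freq_hz <= 0 or reactance == 0:
        raise ValueError("need a positive frequency and a non-zero reactance")
    return -1.0 / (2.0 * math.pi * freq_hz * reactance)


if __name__ == "__main__":
    # Hypothetical check: 1 V across an ideal 1 uF capacitor at 1 kHz.
    f = 1_000.0
    c_true = 1e-6
    v = 1.0 + 0.0j
    i = v * (1j * 2.0 * math.pi * f * c_true)  # I = V * jwC
    print(apparent_capacitance(impedance(v, i), f))
```

With a 1 V AC source the impedance reduces to 1/I, which is the detail the comment says the model got wrong.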

> Coding AIs design software better than me, review code better than me, find hard-to-find bugs better than me, plan long-running projects better than me, make decisions based on research, literature, and also the state of our projects better than me.

They don't do any of that better than me; they do it poorer and faster, but well enough for most of the time.

I feel you, it's scary. But the possibilities we're presented with are incredible. I'm revisiting all these projects that I put aside because they were "too big" or "too much for a machine". It's quite exciting

Try having your engineers pick up some product work. Clients do NOT want to talk to bots.

There will be a need. Don't worry. Most people still haven't figured out how to properly read and interpret instructions. So they build things incorrectly - with or without AI

Seriously. The bar is that low. When people say "AI slop" I just chuckle because it's not "AI" it's everyone. That's the general state of the industry.

So all you have to do is stay engaged, ask questions, and understand the requirements. Know what it is you're building and you'll be fine.

>> From where I’m standing, it’s scary.

You are being fooled by randomness [1]

Not because the models are random, but because you are mistaking a massive combinatorial search over seen patterns for genuine reasoning. Taleb's point was about confusing luck for skill. Don't confuse interpolation for understanding.

You can read a Rust book after years of Java, then go build software for an industry that did not exist when you started. Ask any LLM to write a driver for hardware that shipped last month, or model a regulatory framework that just passed... It will confidently hallucinate. You will figure it out. That is the difference between pattern matching and understanding.

[1] https://en.wikipedia.org/wiki/Fooled_by_Randomness

  • I've worked with a lot of interns, fresh outs from college, overseas lowest bidders, and mediocre engineers who gave up years ago. All over the course of a ~20 year career.

    Not once in all that time has anyone PRed and merged my completely unrelated and unfinished branch into main. Except a few weeks ago. By someone who was using the LLM to make PRs.

    He didn't understand when I asked him about it and was baffled as to how it happened.

    Really annoying, but I got significantly less concerned about the future of human software engineering after that.

  • Have you used an LLM specifically trained for tool calling, in Claude Code, Cursor or Aider?

    They’re capable of looking up documentation, correcting their errors by compiling and running tests, and when coupled with a linter, hallucinations are a non issue.

    I don’t really think it’s possible to dismiss a model that’s been trained with reinforcement learning for both reasoning and tool usage as only doing pattern matching. They’re not at all the same beasts as the old style of LLMs based purely on next token prediction of massive scrapes of web data (with some fine tuning on Q&A pairs and RLHF to pick the best answers).

    • I'm using Claude Code to help me learn Godot game programming.

      One interesting thing is that Claude will not tell me if I'm following the wrong path. It will just make the requested change to the best of its ability.

      For example, in a tower defence game I'm making, I wanted to keep turret position state in an AStarGrid2D. It produced code to do this, but the code became harder and harder to follow as I went on. It was only after watching more tutorials that I figured out I was asking for the wrong thing. (TileMapLayer is a much better choice.)

      LLMs still suffer from Garbage in Garbage out.

    • Ask a model to

      "Write a chess engine where pawns move backward and kings can jump like knights"

      It will keep slipping back into real chess rules. It learned chess; it did not understand the concept of "rules".

      Or

      Ask it to reverse a made up word like

      "Reverse the string 'glorbix'"

      It will get it wrong on the first try. You would not fail.

      Or even better ask it to...

      "Use the dxastgraphx library to build a DAG scheduler."

      dxastgraphx is a non-existent library...

      Marvel at the results...tried in both Claude and ChatGPT....
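On the 'glorbix' example above: the ground truth is trivial to compute deterministically, which makes it easy to check whatever a model returns. A minimal Python sketch (the function name is just for illustration):

```python
def reverse_string(text: str) -> str:
    """Return the characters of text in reverse order."""
    return text[::-1]


if __name__ == "__main__":
    print(reverse_string("glorbix"))  # xibrolg
```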

  • Why would you expect an LLM or even a human to succeed in these cases? “Write a piece of code for a specification that you can’t possibly know about?” That’s why you have to do context engineering, just like you’d provide a reference to a new document to an engineer writing code.

  • This is exactly what happened to me: novel or uncommon = hallucinate or invent wrong.

    It is OK for getting snippets, for example, and then saying (I did this): "Please make this MVVM style." It is not perfect, but it saves time.

    For very broad or novel reasoning, as of today... forget it.

Perfect economic substitution in coding doesn't happen for a long time. Meanwhile, AI appears as an amplifier to the human and vice versa. That the work will change is scary, but the change also opens up possibilities, many of them now hard to imagine.

Stop freaking out. Seriously. You're afraid of something completely ridiculous.

It is certainly more eloquent than you regarding software architecture (which was a scam all along, but conversation for another time). It will find SOME bugs better than you, that's a given.

Review code better than you? Seriously? What are you using, and what do you consider code review? Assume I could identify that one change broke production and you reviewed the latest commit. I am pinging you and you better answer. OK, Claude broke production, now what? Can you begin to understand the difference between you and the generative technology? When you hop on the call, you will explain to me in great detail what you know about the system you built, and explain the decision making and changes over time. You'll tell me what worked and what didn't. You will tell me about the risks, behavior and expectations. About where the code runs, its dependencies, users, usage patterns, load, CPU usage and memory footprint. You could probably tell what's happening without looking at logs, just at metrics. With Claude I get: you're absolutely right! You asked about what it WAS, but I told you about what it WASN'T! MY BAD.

Knowledge requires a soul to experience and this is why you're paid.

  • We use CodeRabbit and it's better than practically any human I've worked with at a number of code review tasks, such as finding vulnerabilities, highlighting configuration issues, bad practices, etc. It's not the greatest at "does this make sense here" type questions, but I'd be the one answering those questions anyway.

    Yeah, maybe the people I've worked with suck at code reviews, but that's pretty normal.

    Not to say your answer is wrong. I think the gist is accurate. But I think tooling will get better at answering exactly the kind of questions you bring up.

    Also, someone has to be responsible. I don't think the industry can continue with this BS "AI broke it." Our jobs might devolve into something more akin to a SDET role and writing the "last mile" of novel code the AI can't produce accurately.

  • > Review code better than you? Seriously?

    Yes, seriously (not OP). Sometimes it's dumb as rocks, sometimes it's frighteningly astute.

    I'm not sure at which point of the technology sigmoid curve we find ourselves (2007 iPhone or 2017 iPhone?) but you're doing yourself a disservice to be so dismissive

    • Copilot reviews are enabled company wide and comments must be resolved manually. I wish I could be so dismissive lol I cannot, literally do not have the ability to be dismissive

> I’m basically just the conductor of all those processes.

Orchestrating harmony is no mean feat.

The AI is pretty scary if you think most of software engineering is about authoring individual methods and rubber ducking about colors of paint and brands of tools.

Once you learn that it's mostly about interacting with a customer (sometimes this is yourself), you will realize the AI is pretty awful at handling even the most basic tasks.

Following a product vision, selecting an appropriate architecture and eschewing 3rd party slop are examples of critical areas where these models are either fundamentally incapable or adversely aligned. I find I have to probe ChatGPT very hard to get it to offer a direct implementation of something like a SAML service provider. This isn't a particularly difficult thing to do in a language like C# with all of the built in XML libraries, but the LLM will constantly try to push you to use 3rd party and cloud shit throughout. If you don't have strong internal convictions (vision) about what you really want, it's going to take you for a ride.

One other thing to remember is that our economies are incredibly efficient. The statistical mean of all information in sight of the LLMs likely does not represent much of an arbitrage opportunity at scale. Everyone else has access to the same information. This also means that composing these systems in recursive or agentic styles means you aren't gaining anything. You cannot increase the information content of a system by simply creating another instance of the same system and having it argue with itself. There usually exists some simple prompt that makes a multi agent Rube Goldberg contraption look silly.

> I’m basically just the conductor of all those processes.

"Basically" and "just" are doing some heroic weight lifting here. Effectively conducting all of the things an LLM is good at still requires a lot of experience. Making the constraints live together in one happy place is the hard part. This is why some of us call it "engineering".

I have been using the most recent Claude, ChatGPT and Gemini models for coding for a bit more than a year, on a daily basis.

They are pretty good at writing code *after* I have thoroughly described what to do, step by step. If you miss a small detail they get loose and the end result is a complete mess that takes hours to clean up. This still requires years of coding experience and planning ahead in your head; you won't be able to skip that, or replace developers with LLMs. They are like autocomplete on steroids, that's pretty much it.

> Coding AIs design software better than me, review code better than me, find hard-to-find bugs better than me, plan long-running projects better than me, make decisions based on research, literature, and also the state of our projects better than me

ChatGPT, is that you?

His logic is off and his experience is irrelevant because it doesn’t encompass enough scale to have been exposed to an actual paradigm-shifting event. Civilizations and entire technologies have been overturned, so he can’t say it won’t happen this time.

What we do know is this. If AI keeps improving at the current rate it’s improving then it will eventually hit a point where we don’t need software engineers. That’s inevitable. The way for it to not happen is for this technology to hit an impenetrable wall.

This wave of AI came so fast that there are still stubborn people who think it’s a stochastic parrot. They missed the boat.

Yeah, it makes me wonder whether I should start learning to be a carpenter or something. Those who either support AI or think "it's all bullshit" cite a lack of evidence for humans truly being replaced in the engineering process, but that's just the thing; the unprecedented levels of uncertainty make it very difficult to invest one's self in the present, intellectually and emotionally. With the current state of things, I don't think it's silly to wonder "what's the point" if another 5 years of this trajectory is going to mean not getting hired as a software dev again unless you have a PhD and want to work for an AI company.

What doesn't help is that the current state of AI adoption is heavily top-down. What I mean is the buy-in is coming from the leadership class and the shareholder class, both of whom have the incentive to remove the necessary evil of human beings from their processes. Ironically, these classes are perhaps the least qualified to decide whether generative AI can replace swathes of their workforce without serious unforeseen consequences. To make matters worse, those consequences might be as distal as too many NEETs in the system such that no one can afford to buy their crap anymore; good luck getting anyone focused on making it to the next financial quarter to give a shit about that. And that's really all that matters at the end of the day; what leadership believes, whether or not they are in touch with reality.

Where the hell was all this fear when the push for open source everything got fully underway? When entire websites were being spawned and scaffolded with just a couple lines of code? Do we not remember all those impressive tech demos of developers doing massive complex things with "just one line of code"? How did we not just write software for every kind of software problem that could exist by now?

How has free code, developed by humans, become more available than ever and yet somehow we have had to employ more and more developers? Why didn't we trend toward fewer developers?

It just doesn't make sense. AI is nothing but a snippet generator, a static analyzer, a linter, a compiler, an LSP, a google search, a copy paste from stackoverflow, all technologies we've had for a long time, all things developers used to have to go without at some point in history.

I don't have the answers.