I think this article is pretty spot on — it articulates something I’ve come to appreciate about LLM-assisted coding over the past few months.
I started out very sceptical. When Claude Code landed, I got completely seduced — borderline addicted, slot machine-style — by what initially felt like a superpower. Then I actually read the code. It was shockingly bad. I swung back hard to my earlier scepticism, probably even more entrenched than before.
Then something shifted. I started experimenting. I stopped giving it orders and began using it more like a virtual rubber duck. That made a huge difference.
It’s still absolute rubbish if you just let it run wild, which is why I think “vibe coding” is basically just “vibe debt” — because it just doesn’t do what most (possibly uninformed) people think it does.
But if you treat it as a collaborator — more like an idiot savant with a massive brain but no instinct or nous — or better yet, as a mech suit [0] that needs firm control — then something interesting happens.
I’m now at a point where working with Claude Code is not just productive, it actually produces pretty good code, with the right guidance. I’ve got tests, lots of them. I’ve also developed a way of getting Claude to document intent as we go, which helps me, any future human reader, and, crucially, the model itself when revisiting old code.
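To give a flavour of what I mean by documenting intent (a convention of my own, not anything standard), the kind of note I ask Claude to leave looks roughly like this; the function and numbers are made up for illustration:

```python
def apply_discount(total_cents: int, percent: int) -> int:
    """Apply a percentage discount to an order total.

    Intent: money is handled in integer cents and rounded down, so the
    customer is never charged a fraction of a cent and totals stay exact.
    Invariant: 0 <= percent <= 100; callers validate before calling.
    """
    return total_cents * (100 - percent) // 100


def test_apply_discount_rounds_down():
    # 10% off 999 cents is 899.1 cents; we charge 899.
    assert apply_discount(999, 10) == 899
```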
What fascinates me is how negative these comments are — how many people seem closed off to the possibility that this could be a net positive for software engineers rather than some kind of doomsday.
Did Photoshop kill graphic artists? Did film kill theatre? Not really. Things changed, sure. Was it “better”? There’s no counterfactual, so who knows? But change was inevitable.
What’s clear is this tech is here now, and complaining about it feels a bit like mourning the loss of punch cards when terminals showed up.
One of the things I think is going on here is a sort of stone soup effect. [1]
Core to Ptacek's point is that everything has changed in the last 6 months. As you say, and I presume he'd agree, the use of off-the-shelf LLMs for code was kinda garbage back then. And I expect the skepticism he's knocking here ("stochastic parrots") was in fact accurate at the time.
But it did get a lot of people (and money) to rush in and start trying to make something useful. Like the stone soup story, a lot of other technology has been added to the pot, and now we're moving in the direction of something solid, a proper meal. But given the excitement and investment, it'll be at least a few years before things stabilize. Only at that point can we be sure about how much the stone really added to the soup.
Another counterfactual that we'll never know is what kinds of tooling we would have gotten if people had dumped a few billion dollars into code tool improvement without LLMs, but with, say, a lot of more conventional ML tooling. Would the tools we get be much better? Much worse? About the same but different in strengths and weaknesses? Impossible to say.
So I'm still skeptical of the hype. After all, the hype is basically the same as 6 months ago, even though now the boosters can admit the products of 6 months ago sucked. But I can believe we're in the middle of a revolution of developer tooling. Even so, I'm content to wait. We don't know the long term effects on a code base. We don't know what these tools will look like in 6 months. I'm happy to check in again then, where I fully expect to be again told: "If you were trying and failing to use an LLM for code 6 months ago †, you’re not doing what most serious LLM-assisted coders are doing." At least until then, I'm renewing my membership in the Boring Technology Club: https://boringtechnology.club/
> Core to Ptacek's point is that everything has changed in the last 6 months.
This was actually the only point in the essay with which I disagree, and it weakens the overall argument. Even 2 years ago, before agents or reasoning models, these LLMs were extremely powerful. The catch was, you needed to figure out what worked for you.
I wrote this comment elsewhere: https://news.ycombinator.com/item?id=44164846 -- Upshot: It took me months to figure out what worked for me, but AI enabled me to produce innovative (probably cutting edge) work in domains I had little prior background in. Yes, the hype should trigger your suspicions, but if respectable people with no stake in selling AI like @tptacek or @kentonv in the other AI thread are saying similar things, you should probably take a closer look.
Almost by definition, one should be skeptical about hype. So we’re all trying to sort out what is being sold to us.
Different people have different weird tendencies in different directions. Some people irrationally assume that things aren’t going to change much. Others see a trend and irrationally assume that it will continue on a trend line.
Synthesis is hard.
Understanding causality is even harder.
Savvy people know that we’re just operating with a bag of models and trying to choose the right combination for the right situation.
This misunderstanding is one reason why doomers, accelerationists, and “normies” talk past each other or (worse) look down on each other. (I’m not trying to claim epistemic equivalence here; some perspectives are based on better information, some are better calibrated than others! I’m just not laying out my personal claims at this point. Instead, I’m focusing on how we talk to each other.)
Another big source of misunderstanding is about differing loci of control. People in positions of influence are naturally inclined to think about what they can do, who they know, and where they want to be. People farther removed feel relatively powerless and tend to hold onto their notions of stability, such as the status quo or their deepest values.
Historically, programmers have been quite willing to learn new technologies, but now we’re seeing widespread examples where people’s plasticity has limits. Many developers cannot (or are unwilling to) wrap their minds around the changing world. So instead of confronting that reality, they find ways to deny it, consciously or subconsciously. Our perception itself is shaped by our beliefs, and some people won’t even perceive the threat because it is too strange or disconcerting. Such is human nature: we all do it. Sometimes we’re lucky enough to admit it.
"nother counterfactual that we'll never know is what kinds of tooling we would have gotten if people had dumped a few billion dollars into code tool improvement without LLMs, but with, say, a lot of more conventional ML tooling. Would the tools we get be much better? Much worse? About the same but different in strengths and weaknesses? Impossible to say."
You'll not only never know this, it's IMHO not very useful to think about at all, except as an intellectual exercise.
I wish i could impress this upon more people.
A friend similarly used to lament/complain that Kotlin sucked in part because we could have probably accomplished its major features in Java, and maybe without tons of work or migration cost.
This is maybe even true!
As an intellectual exercise, both are interesting to think about. But outside of that, people get caught up in this as if it matters, and it doesn't.
Basically nothing is driven by pure technical merit alone, not just in CS, but in any field. So my point to him was that the lesson to take away from this is not "we could have been more effective or done it cheaper or whatever" but "my definition of effectiveness doesn't match how reality decides effectiveness, so i should adjust my definition".
As much as people want the definition to be a meritocracy, it just isn't and honestly, seems unlikely to ever be.
So while it's 100% true that billions of dollars dumped into other tools or approaches or whatever may have generated good, better, maybe even amazing results, those dollars weren't spent that way, and more importantly, never would have been. Unknown but maybe infinite ROI is often much more likely to see investment than better-known but maybe only 2x ROI.
and like i said, this is not just true in CS, but in lots of fields.
That is arguably quite bad, but also seems unlikely to change.
The better I am at solving a problem, the less I use AI assistants. I use them if I try a new language or framework.
Busy code I need to generate is difficult to do with AI too, because then you need to formalize the necessary context for the assistant, which is exhausting and still leaves the result uncertain. So perhaps it is just simpler to write it yourself quickly.
I understand the comments being negative, because there is so much AI hype without many practical applications yet, or at least good practical applications. Some of that hype is justified, some of it is not. I enjoyed the image/video/audio synthesis hype more, tbh.
Test cases are quite helpful and comments are decent too. But often prompting is more complex than programming something. And you can never be sure if any answer is usable.
> But often prompting is more complex than programming something.
I'd challenge this one; is it more complex, or is all the thinking and decision making concentrated into a single sentence or paragraph? For me, programming something is taking a big, high-level problem and breaking it down into smaller and smaller sections until it's a line of code; the lines of code are relatively low effort / cost little brain power. But in my experience, the problem itself and its nuances are only fully defined once all the code is written. If you have to prompt an AI to write it, you need to define the problem beforehand.
It's more design and more thinking upfront, which is something the development community has moved away from in the past ~20 years with the rise of agile development and open source. Techniques like TDD have shifted more of the problem definition forwards as you have to think about your desired outcomes before writing code, but I'm pretty sure (I have no figures) it's only a minority of developers that have the self-discipline to practice test-driven development consistently.
(disclaimer: I don't use AI much, and my employer isn't yet looking into or paying for agentic coding, so it's chat style or inline code suggestions)
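To make the test-first point concrete, here's a minimal sketch of what "defining the problem beforehand" can look like: the test is written before any prompting and doubles as the spec handed to the agent. slugify() is a hypothetical example, not something from the article.

```python
import re

# The spec, written first: this is what gets handed to the agent (or a human).
def test_slugify():
    assert slugify("Hello, World!") == "hello-world"
    assert slugify("  many   spaces ") == "many-spaces"

# A minimal implementation to iterate toward until the test passes.
def slugify(text: str) -> str:
    words = re.findall(r"[a-z0-9]+", text.lower())
    return "-".join(words)
```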
I agree with your points, but I'm also reminded of one of my bigger learnings as a manager - the stuff I'm best at is the hardest, but most important, to delegate.
Sure, it was easier to do it myself. But putting in the time to train, give context, develop guardrails, learn how to monitor etc. ultimately taught me the skills needed to delegate effectively and multiply the team's output massively as we added people.
It's early days but I'm getting the same feeling with LLMs. It's as exhausting as training an overconfident but talented intern, but if you can work through it and somehow get it to produce something as good as you would do yourself, it's a massive multiplier.
> But often prompting is more complex than programming something.
It may be more complex, but it is in my opinion better long term. We need to get good at communicating with AIs to get the results that we want. Forgive me for assuming that you probably haven't used these assistants long enough to get good at using them. I've been a web developer for 20 years already and AI tools are multiplying my output even on problems I'm very good at. And they are getting better very quickly.
I totally agree. The ”hard to control mech suit” is an excellent analogy.
When it works it’s brilliant.
There is a threshold point on the learning curve where you realize you are in a pile of spaghetti code and think that using an LLM assistant actually saves no time.
But then you learn to avoid the bad parts - so they don’t take your time anymore - and the good parts start paying back the time spent learning, in heaps.
The issue is we’re too early in the process to even have a solid education program for using LLMs. I use them all the time and continue to struggle finding an approach that works well. It’s easy to use them for documentation look up. Or filling in boilerplate. Sometimes they nail a transformation/translation task, other times they’re more trouble than they’re worth.
We need to understand what kind of guard rails to put these models on for optimal results.
Also, the agents are actually pretty good at cleaning up spaghetti if you do it one module at a time and use unit tests. And some of the models are smart enough to suggest good organization schemes!
For what it's worth: I'm not dismissive of the idea that these things could be ruinous for the interests of the profession. I don't automatically assume that making applications drastically easier to produce is just going to make way for more opportunities.
I just don't think the interests of the profession control the outcome. The travel agents had interests too!
For a long time there has been back chatter about how to turn programming into a more professional field, more like actual engineering: where companies take security seriously, people are held accountable when something goes wrong, and practitioners actually start to earn their high salaries.
Getting AI to hallucinate its way into secure and better quality code seems like the antithesis of this. Why don't we have AI and robots working for humanity on the boring menial tasks - mowing lawns, filing taxes, washing dishes, driving cars - instead of attempting to take on our more critical and creative outputs - image generation, movie generation, book writing and even website building?
As soon as the world realized they don't need a website and can just have a FB/Twitter page, a huge percentage of freelance web development gigs just vanished. We have to get real about what's about to happen. The app economy filled the gap, and the only optimistic case is that the AI app industry is what's going to fill the gap going forward. I just don't know about that. There's a certain end-game vibe I'm getting, because we're talking about self-building and self-healing software. More so, a person can ask the AI to role play anything, even an app.
> Employment of travel agents is projected to grow 3 percent from 2023 to 2033, about as fast as the average for all occupations.
The last year there is data for claims 68,800 people employed as travel agents in the US. It's not a boom industry by any means, but it doesn't appear they experienced the apocalypse that Hacker News believes they did, either.
I don't know how to easily find historical data, unfortunately. BLS publishes the excel sheets, but pulling out the specific category would have to be done manually as far as I can tell. There's this, I guess: https://www.travelagewest.com/Industry-Insight/Business-Feat...
It appears at least that what happened is, though it may be easier than ever to plan your own travel, there are so many more people traveling these days than in the past that the demand for travel agents hasn't crashed.
Let's be real. Software engineers are skeptical right now not because they believe robots are better than them. Quite the opposite. The suits will replace software engineers despite the technology's mediocrity.
It was just 2 weeks ago that the utter incompetence of these robots was on full public display [1]. But none of that will matter to greedy corporate executives, who will prioritize short-term cost savings. They will hop from company to company, personally reaping the benefits while undermining essential systems that users and society rely on with robot slop. That's part of the reason why the C-suites are overhyping the technology. After all, no rich executive has faced consequences for behaving this way.
It's not just software engineering jobs that will take a hit. Society as a whole will suffer from the greedy recklessness.
The reason I remain in the "skeptical" camp is because I am experiencing the same thing you are - I keep oscillating between being impressed, then disappointed.
Ultimately the thing that impresses me is that LLMs have replaced google search. The thing that disappoints me is that their code is often convincing but wrong.
Coming from a hard-engineering background, anything that is unreliable is categorized as bad. If you come from the move-fast-break-things world of tech, then your tolerance for mistakes is probably a lot higher.
I have been using Windsurf for a few months and ChatGPT for a couple of years. I don't feel Windsurf is a massive game changer personally. It is good if you are very tired or working in a new area (also good for exploring UI ideas, as the feedback loop is tight), but still not a real game changer over ChatGPT. Waiting around for it to do its thing ("we've encountered an error - no credits used") is boring and flow destroying. If you know exactly what you are doing, the productivity is probably 0.5x vs just typing the code in yourself. Sorry, I'm not going to bang around in Windsurf all day just to help with the training so that "v2" can be better. They should be paying me for this, realistically.
Of course, in aggregate AI makes me capable in a far broader set of problem domains. It would be tough to live without it at this stage, but needs to be used for what it is actually good at, not what we hope it will be good at.
> What fascinates me is how negative these comments are — how many people seem closed off to the possibility that this could be a net positive for software engineers rather than some kind of doomsday.
I tried the latest Claude on a very complex wrapper around the AWS Price APIs, which are not easy to work with. Two thousand lines down in one file, I found Claude faking some API returns by creating hard-coded values. A pattern I have seen professional developers get caught doing while under pressure to deliver.
This will be a boon to skilled human developers, who will be hired at $900 an hour to fix bugs of a subtlety never seen before.
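For the curious, the anti-pattern looks roughly like this in boto3 terms (the filter values are illustrative, not a verified pricing query): a hard-coded branch that resembles an API result and silently lies in production, versus an actual call to the Pricing API.

```python
import json
import boto3

def get_ec2_hourly_price(instance_type: str, location: str = "US East (N. Virginia)") -> float:
    # What the model generated, roughly (plausible-looking, and wrong for
    # every instance type except the one it memorized):
    #
    #   if instance_type == "m5.large":
    #       return 0.096  # hard-coded "API result" buried in a 2,000-line file
    #
    # What it should do: actually ask the Pricing API.
    pricing = boto3.client("pricing", region_name="us-east-1")
    resp = pricing.get_products(
        ServiceCode="AmazonEC2",
        Filters=[
            {"Type": "TERM_MATCH", "Field": "instanceType", "Value": instance_type},
            {"Type": "TERM_MATCH", "Field": "location", "Value": location},
            {"Type": "TERM_MATCH", "Field": "operatingSystem", "Value": "Linux"},
        ],
        MaxResults=1,
    )
    product = json.loads(resp["PriceList"][0])  # entries come back as JSON strings
    on_demand = next(iter(product["terms"]["OnDemand"].values()))
    dimension = next(iter(on_demand["priceDimensions"].values()))
    return float(dimension["pricePerUnit"]["USD"])
```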
More or less this. Maybe a job opportunity, but many decision makers won't see the real problem until they get hit by that AWS bill. Ironic if the business can't hire you to fix it because it has already gone out of business.
The arguments seem to come down to tooling. The article suggests that ChatGPT isn't a good way to interact with LLMs but I'm not so sure. If the greatest utility is "rubber ducking" and editing the code yourself is necessary then tools like Cursor go too far in a sense. In my own experience, Windsurf is good for true vibe coding where I just want to explore an idea and throw away the code. It is still annoying though as it takes so long to do things - ruining any kind of flow state you may have. I am conversing with ChatGPT directly much more often.
I haven't tried Claude Code yet, however. Maybe that approach is more on point.
Totally agree with "vibe debt". Letting an LLM off-leash without checks is a fast track to spaghetti. But with tests, clear prompts, and some light editing, I’ve shipped a lot of real stuff faster than I could have otherwise.
I generally agree with the attitude of the original post as well. But I get stuck on one point. It definitely doesn't cost 20 dollars a month. Cursor.ai might, and I don't know how good it is, but Claude Code costs hundreds of dollars a month. Still cheaper than a junior dev, though.
> Did Photoshop kill graphic artists? Did film kill theatre?
To a first approximation, the answer to both of these is "yes".
There is still a lot of graphic design work out there (though generative AI will be sucking the marrow out of it soon), but far less than there used to be before the desktop publishing revolution. And the kind of work changed. If "graphic design" to you meant sitting at a drafting table with pencil and paper, those jobs largely evaporated. If that was a kind of work that was rewarding and meaningful to you, that option was removed for you.
Theatre even more so. Yes, there are still some theatres. But the number of people who get to work in theatrical acting, set design, costuming, etc. is a tiny tiny fraction of what it used to be. And those people are barely scraping together a living, and usually working side jobs just to pay their bills.
> it feels a bit like mourning the loss of punch cards when terminals showed up.
I think people deserve the right to mourn the loss of experiences that are meaningful and enjoyable to them, even if those experiences turn out to no longer be maximally economically efficient according to the Great Capitalistic Moral Code.
Does it mean that we should preserve antiquated jobs and suffer the societal effects of inefficiency without bound? Probably not.
But we should remember that the ultimate goal of the economic system is to enable people to live with meaning and dignity. Efficiency is a means to that end.
> But the number of people who get to work in theatrical acting, set design, costuming
I think this ends up being recency bias and terminology hairsplitting, in the end. The number of people working in theatre mask design went to nearly zero quite a while back but we still call the stuff in the centuries after that 'theatre' and 'acting'.
In actual fact, photoshop did kill graphic arts. There was an entire industry filled with people who had highly-developed skillsets that suddenly became obsolete. Painters for example. Before photoshop, I had to go out of house to get artwork done; now I just do it myself.
This, as the article makes clear, is a concern I am alert and receptive to. Ban production of anything visual from an LLM; I'll vote for it. Just make sure they can still generate Mermaid charts and Graphviz diagrams, so they still apply to developers.
Well, this is only partially true. My optimistic take is that it will redefine the field. There is still a future for resourceful, attentive, and prepared graphic artists.
AI didn't kill creativity or intuition. It rather lacks those things completely. Artists can make use of AI, but they can't make themselves obsolete just yet.
I'm an engineer through and through. I can ask an LLM to generate images just fine, but for a given target audience, for a certain purpose? I would have no clue. None whatsoever. Ask me to generate an image to use in advertisement for Nuka Cola, targeting tired parents? I genuinely have no idea of where to even start. I have absolutely no understanding of the advertisement domain, and I don't know what tired parents find visually pleasing, or what they would "vibe" with.
My feeble attempts would be absolute trash compared to a professional artist who uses AI to express their vision. The artist would be able to prompt so much more effectively and correct the things that they know from experience will not work.
It's the exact same as with coding with an AI - it will be trash unless you understand the hows and the whys.
I agree with the potential of AI. I use it daily for coding and other tasks. However, there are two fundamental issues that make this different from the Photoshop comparison.
The models are trained primarily on copyrighted material and code written by the very professionals who now must "upskill" to remain relevant. This raises complex questions about compensation and ownership that didn't exist with traditional tools. Even if current laws permit it, the ethical implications are different from Photoshop-like tools.
Previous innovations created new mediums and opportunities. Photoshop didn't replace artists, because it enabled new art forms. Film reduced theater jobs but created an entirely new industry where skills could mostly transfer. Manufacturing automation made products like cars accessible to everyone.
AI is fundamentally different. It's designed to produce identical output to human workers, just more cheaply and/or faster. Instead of creating new possibilities, it's primarily focused on substitution. Say AI could eliminate 20% of coding jobs and reduce wages by 30%:
* Unlike previous innovations, this won't make software more accessible
* Software already scales essentially for free (build once, used by many)
* Most consumer software is already free (ad-supported)
The primary outcome appears to be increased profit margins rather than societal advancement. While previous technological revolutions created new industries and democratized access, AI seems focused on optimizing existing processes without providing comparable societal benefits.
This isn't an argument against progress, but we should be clear-eyed about how this transition differs from historical parallels, and why it might not repeat the same historical outcomes. I'm not claiming this will definitely be the case, but you can see some pretty significant differences, and reasons to be skeptical that the same creation of new jobs, or the same improvement to human lifestyle and capabilities, will emerge as with, say, film or Photoshop.
AI can also be used to achieve things we could not otherwise do; that's the good use of AI: things like cancer detection, self-driving cars, and so on. I'm speaking specifically of the use of AI to automate and reduce the cost and turnaround time of white collar work like software development.
For me this is the "issue" I have with AI. Unlike, say, the internet, mobile, and other tech revolutions, where I could see new use cases or optimisations of existing ones spring up all the time (new apps, new ways of interacting, more efficiency than physical systems, etc.), AI seems to be focused more on efficiency and substitution of labour than on pushing the frontier of "quality of life". Maybe this will change, but the buzz is around job replacement atm.
It's why it is impacting so many people, but also making very small changes to everyday "quality of life" kinds of metrics (e.g. ability to eat, communicate, live somewhere, etc). It arguably is more about enabling greater inequality and gatekeeping of wealth to capital, in a future world where intelligence and merit matter less. For most people it's hard to see where the long-term positives are for them in this story; most everyday folks don't believe the utopia story is in any way probable.
> The primary outcome appears to be increased profit margins rather than societal advancement. While previous technological revolutions created new industries and democratized access, AI seems focused on optimizing existing processes without providing comparable societal benefits.
This is the thing that worries me the most about AI.
The author's ramblings dovetail with this a bit in their "but the craft" section. They vaguely attack the idea of code-golfing and focusing on coding for the craft as essentially incompatible with the corporate model of programming work. And perhaps they're right. If they are, though, this AI wave/hype being mostly about process-streamlining and such seems to be a distillation of that fact.
Maybe it's like automation that makes webdev accessible to anyone. You take a week long AI coaching course and talk to an AI and let it throw together a website in an hour, then you self host it.
The key is that manual coding for a normal task takes one or two weeks, whereas if you configure all your prompts/agents correctly you could do it in a couple of hours. As you highlighted, it brings many new issues (code quality, lack of tests, tech debt) and you need to carefully create prompts and review the code to tackle those. But in the end, you can save significant time.
I disagree. I think this notion comes from the idea that creating software is about coding. Automating/improving coding => you have software at the end.
This might be how one looks at it in the beginning, with no experience or no idea about coding. With time one realizes it's more about creating the correct mental model of the problem at hand, rather than the activity of coding itself.
Once this is realized, AI can't "save" you days of work, as coding is the least time-consuming part of creating software.
Having plenty of initial discussion and distilling that into requirements documents aimed for modularized components which can all be easily tackled separately is key.
I’d add that Excel didn’t kill the engineering field. It made them more effective and maybe companies will need less of them. But it also means more startups and smaller shops can make use of an engineer. The change is hard and an equilibrium will be reached.
Desktop publication software killed many jobs. I worked for a publication where I had colleagues that used to typeset, place images, and use a camera to build pages by hand. That required a team of people. Once Quark Xpress and the like hit the scene, one person could do it all, faster.
In terms of illustration, the tools moved from pen and paper to Adobe Illustrator and Aldus / Macromedia Freehand. Which I'd argue was more of a sideways move. You still needed an illustrator's skillset to use these tools.
The difference between what I just described and LLM image generation is that the tooling changed to streamline an existing skillset. LLMs replace all of it. Just type something and here's your picture. No art / design skill necessary. Obviously, there's no guarantee that the LLM-generated image will be any good. So I'm not sure the Photoshop analogy works here.
> Then something shifted. I started experimenting. I stopped giving it orders and began using it more like a virtual rubber duck. That made a huge difference.
This is how I use it mostly. I also use it for boilerplate, like "What would a database model look like that handles the following?" You never want it to do everything, though. There are tools that can and will, and they're impressive, but then when you have a true production issue, your inability to respond quickly will be a barrier.
That’s all great news that if you know how to use an LLM, it works wonders for you. But LLMs are changing so fast, can it really be sustainable for me to “learn” it only for it to change and go backwards the next month? (I am thinking about how terrible Google became.)
I’m learning live how to use these things better, and I haven’t seen practical guides like:
- Split things into small files, today’s model harnesses struggle with massive files
- Write lots of tests. When the language model messes up the code (it will), it can use the tests to climb out. Tests are the best way to communicate behavior.
- Write guides and documentation for complex tasks in complex codebases. Use a language model for the first pass if you’re too lazy. Useful for both humans and LLMs
It’s really: make your codebase welcoming for junior engineers
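As a concrete example of the first bullet, a throwaway helper along these lines can flag the files an agent harness is likely to choke on; the 400-line threshold is an arbitrary guess on my part, not a number from any vendor.

```python
from pathlib import Path

LIMIT = 400  # arbitrary; tune to whatever your harness handles comfortably

def oversized_files(root: str = ".", suffix: str = ".py"):
    """Yield (path, line_count) for source files over the limit."""
    for path in Path(root).rglob(f"*{suffix}"):
        lines = sum(1 for _ in path.open(encoding="utf-8", errors="ignore"))
        if lines > LIMIT:
            yield path, lines

if __name__ == "__main__":
    for path, lines in sorted(oversized_files(), key=lambda t: -t[1]):
        print(f"{lines:6d}  {path}")
```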
Or not. I watched Copilot's agent mode get stuck in a loop for most of an hour (to be fair, I was letting it continue to see how it handles this failure case) trying to make a test pass.
Yes! It needs and seems to want the human to be a deep collaborator. If you take that approach, it is actually a second senior developer you can work with. You need to push it, and explain the complexities in detail to get fuller rewards. And get it to document everything important it learns from each session's context. It wants to collaborate to make you a 10X coder, not to do your work for you while you laze. That is the biggest breakthrough I have found. They basically react like human brains, with the same kind of motives. Their output can vary dramatically based on the input you provide.
i love your views and the way you express them, spot on. i feel similar in some ways. i hated ai, loved ai, hated it again and love it again. i still feel the code is unusable for my main problems, but i realize better that its my arrogance that causes it. i cant formulate solutions eloquently enough and blame the AI for bad code.
AI has helped me pick up my pencil and paper again and realize my flawed knowledge, skills, and even flawed approach to AI.
Now i instructed it to never give me code :). not because the code is bad, but my attempts to extract code from it are more based in laziness than efficiency. they are easy to confuse afterall ;(....
I have tons of fun learning with AI, exploring. going on adventures into new topics. Then when i want to really do something, i try to use it for the things i know i am bad at due to laziness, not lack of knowledge. the thing i fell for first...
it helps me explore a space, then i think or am inspired for some creation, and it helps me structure and plan. when i ask it from laziness to give me the code, it helps me overcome my laziness by explaining what i need to do to be able to see why asking for the code was the wrong approach in the first place.
now, that might be different for you. but i have learned i am not some god tier hacker from the sprawl, so i realized i need to learn and get better. perhaps you are at the level where you can ask it for code and it just works. hats off in that case ;k (i do hope you tested well!)
Photoshop etc. are still just tools. They can't beat us at what has always set us apart: thinking. LLMs are the closest, and while they're not close, they're directionally correct. They're general purpose, not like chess engines. And they improve. It's hard to predict a year out, never mind ten.
I agree, this article is basically what I've been thinking as I play with these things over time. They've gotten a ton better but the hot takes are still from 6-12 months ago.
One thing I wish he would have talked about though is maintenance. My only real qualm with my LLM agent buddy is the tendency to just keep adding code if the first pass didn't work. Eventually, it works, sometimes with my manual help. But the resulting code is harder to read and reason about, which makes maintenance and adding features or behavior changes harder. Until you're ready to just hand off the code to the LLM and not do your own changes to it, it's definitely something to keep in mind at minimum.
I am pretty sure this comment is also AI generated. Just a guess, but that many em-dashes is suspicious. And the overall structure of the argument feels uncannily convincing.
If this is true, can you share the initial draft that you asked the AI to rewrite? Am I not right that the initial draft is more concise and better conveys your actual thoughts, even though it's not as convincing?
I think also the key is - don't call it AI, because it's not. It's LLM assist query parsing and code generation. Semantically, if you call it AI, the public expects a cognitive equivalent to a human which this is not, and from what @tptacek describes, is not meant to be - the reasoning and other code bits to create agents and such seem to be developed specifically for code generation and programming assist and other tasks thereof. Viewed in that lens, the article is correct - it is by all means a major step forward.
The entire comment feels way too long, too structured, and too convincing, in a way that can only be written by an AI. I just hope that once the em-dashes are "fixed", we will still be able to detect such text.
I fear for a future when human text is sparse, even here at HN. It is depressing to see such a comment take the top spot.
I use LLMs daily. From helping me write technical reports (not 100%, mostly making things sound better after I have a first draft) to mapping APIs (documentation, etc).
I can only imagine what this technology will be like in 10 years. But I do know that it's not going anywhere and it's best to get familiar with it now.
I treat AI as my digital partner in pair programming. I've learned how to give it specific and well-defined tasks, and it gets them done. The narrower the scope and the more specific the task, the more success you'll have.
There’s a sweet spot in there; it’s not “as narrow as possible”. The most productive thing is to assign the largest possible tasks that are just short of the limit where the agents become stupid. This is hard to hit, and a moving target!
LLMs are self-limiting rather than self-reinforcing, and that's the big reason why they're not the thing, good or bad, that some people think they are.
"Garbage in, garbage out" is still the rule for LLMs. If you don't spend billions training them, or if you let them feed on their own tail too much, they produce nonsense. E.g. some LLMs currently produce better general search results than Google. This is mainly a product of many billions being spent on expert trainers for those LLMs, while Google neglects (or actively enshittifies) their search algorithms shamefully. It's humans, not LLMs, producing these results. How good will LLMs be at search once the money has moved somewhere else and neglect sets in?
LLMs aren't going to take everyone's jobs and trigger a singularity precisely because they fall apart if they try to feed on their own output. They need human input at every stage. They are going to take some people's jobs and create new ones for others, although it will probably be more of the former than the latter, or billionaires wouldn't be betting on them.
This is my experience in general. People seem to be impressed by the LLM output until they actually comprehend it.
The fastest way to have someone break out of this illusion is tell them to chat with the LLM about their own expertise. They will quickly start to notice errors in the output.
You know who also does that? Humans. I read shitty, broken, amazing, useful code every day, but you don't see me complaining online that people who earn a 100-200k salary don't produce ideal output right away. And believe me, I spend way more time fixing their shit than LLMs'.
If I can reduce this even by 10% for 20 dollars it’s a bargain.
Really? I feel like the article pointedly skirted my biggest complaint.
> ## but the code is shitty, like that of a junior developer
> Does an intern cost $20/month? Because that’s what Cursor.ai costs.
> Part of being a senior developer is making less-able coders productive, be they fleshly or algebraic. Using agents well is both a skill and an engineering project all its own, of prompts, indices, and (especially) tooling. LLMs only produce shitty code if you let them.
I hate pair-programming with junior devs. I hate it. I want to take the keyboard away from them and do it all myself, but I can't, or they'll never learn.
Why would I want a tool that replicates that experience without the benefit of actually helping anyone?
This article does not touch on the thing which worries me the most with respect to LLMs: the dependence.
Unless you can run the LLM locally, on a computer you own, you are now completely dependent on a remote centralized system to do your work. Whoever controls that system can arbitrarily raise the prices, subtly manipulate the outputs, store and do anything they want with the inputs, or even suddenly cease to operate. And since, according to this article, only the latest and greatest LLM is acceptable (and I've seen that exact same argument six months ago), running locally is not viable (I've seen, in a recent discussion, someone mention a home server with something like 384G of RAM just to run one LLM locally).
To those of us who like Free Software because of the freedom it gives us, this is a severe regression.
Yes, and it's even worse: if you think LLMs may possibly make the world a worse place, you should not use any LLMs you aren't self-hosting, because your usage information is being used by the creators to make LLMs better.
I think that’s a bit of a leap; if you think LLMs make the world a worse place, there are many actions that you might take or not take to try to address that.
It's also why local models, even if less powerful, are so important. The gap between "state of the art" and "good enough for a lot of workflows" is narrowing fast
Yeah I am very excited for local models to get good enough to be properly useful. I’m a bit of an AI skeptic I’ll admit, but I’m much more of a SV venture-backed company skeptic. The idea of being heavily reliant on such a company, plus needing to be online, plus needing to pay money just to get some coding done is pretty unpalatable to me.
You can get 90%+ of the way there with a tiny “coder” LLM running on the Ollama backend with an extension like RooCode and a ton of MCP tools.
In fact, MCP is so ground breaking that I consider it to be the actual meat and potatoes of coding AIs. Large models are too monolithic, and knowledge is forever changing. Better just to use a small 14b model (or even 8b in some cases!) with some MCP search tools, a good knowledge graph for memory, and a decent front end for everything. Let it teach itself based on the current context.
And all of that can run on an off the shelf $1k gaming computer from Costco. It’ll be super slow compared to a cloud system (like HDD vs SSD levels of slowness), but it will run in the first place and you’ll get *something* out of it.
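If it helps, the core of that local setup is just a small model served by Ollama on its default port, called over HTTP; a rough sketch below (the model name is a placeholder for whatever coder model you've pulled, and the MCP tooling and knowledge graph are out of scope here):

```python
import requests

def ask_local_model(prompt: str, model: str = "qwen2.5-coder:14b") -> str:
    """Send a single prompt to a locally running Ollama server and return the text."""
    resp = requests.post(
        "http://localhost:11434/api/generate",  # Ollama's default endpoint
        json={"model": model, "prompt": prompt, "stream": False},
        timeout=300,                            # local models can be slow
    )
    resp.raise_for_status()
    return resp.json()["response"]

if __name__ == "__main__":
    print(ask_local_model("Write a Python function that parses an ISO 8601 date."))
```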
This is why I run a set of rackmount servers at home, that have the media and apps that I want to consume. If my ISP bites the dust tomorrow, I've literally got years worth of music, books, tv, movies, etc. Hell, I even have a bunch of models on ollama, and an offline copy of wikipedia running (minus media, obv) via kiwix.
It's not off-grid, but that's the eventual dream/ goal.
* Not even counting cellular data carriers, I have a choice of at least five ISPs in my area. And if things get really bad, I can go down to my local library to politely encamp myself and use their WiFi.
* I've personally no need for a cloud provider, but I've spent a lot of time working on cloud-agnostic stuff. All the major cloud providers (and many of the minors) provide compute, storage (whether block, object, or relational), and network ingress and egress. As long as you don't deliberately tie yourself to the vendor-specific stuff, you're free to choose among all available providers.
I don't feel like being dependent on LLM coding tools is much of an issue, you can very easily switch between different vendors. And I hope that open weight models will be "good enough" until we get a monopoly.
In any case, even if you are afraid of getting too dependent on AI tools, I think everyone needs to stay up to date on what is happening. Things are changing very quickly right now, so no matter what argument you may have against LLMs, it may just not be valid any more in a few months
> I think everyone needs to stay up to date on what is happening. Things are changing very quickly right now, so no matter what argument you may have against LLMs, it may just not be valid any more in a few months
This actually to me implies the opposite of what you’re saying here. Why bother relearning the state of the art every few months, versus waiting for things to stabilize on a set of easy-to-use tools?
We will have the equivalent of Claude Sonnet 4 in a local LLM that can run well on a modern Mac w/ 36+ GB of ram in a year or two. Maybe faster. The local/open models are developing very fast in terms of quantization and how well they can run on consumer hardware.
Folks that are running local LLMs every day now will probably say you can basically emulate at least Sonnet 3.7 for coding if you have a real AI workstation. Which may be true, but the time and effort and cost involved are substantial.
Memory scaling has all but stopped. Current RAM cells are made up of just 40,000 or so electrons (that's when the charge is first stored; it degrades from there until refreshed). Going smaller is almost impossible due to physics, noise, and the problem of needing to amplify that tiny charge to something usable.
For the past few years, we've been "getting smaller" by getting deeper. The diameter of the cell shrinks, but the depth of the cell goes up. As you can imagine, that doesn't scale very well. Cutting the cylinder diameter in half quadruples the depth of the cylinder for the same volume.
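The back-of-the-envelope geometry, for anyone who wants to check:

```latex
% Cell capacitance needs a roughly constant volume V = \pi r^2 h.
% Halving the diameter (and thus the radius) while keeping V fixed:
V = \pi r^2 h
\quad\Rightarrow\quad
h' = \frac{V}{\pi (r/2)^2} = \frac{4V}{\pi r^2} = 4h
```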
If you try to put the cells closer together, you start to get quantum tunneling where electrons would disappear from one cell and appear in another cell altering charges in unexpected ways.
The times of massive memory shrinks are over. That means we have to reduce production costs and have more chips per computer or find a new kind of memory that is mass producible.
That will work until a lot of infrastructure and 3rd party software has been created around a particular player.
See the Microsoft ecosystem as an example. Nothing they do could not be replicated, but the network effects they achieved are strong. Too much glue, and 3rd party systems, and also training, and what users are used to, and what workers you could hire are used to, now all point to the MS ecosystem.
In this early mass-AI-use phase you still can easily switch vendors, sure. Just like in the 1980s you could still choose some other OS or office suite (like Star Office - the basis for OpenOffice, Lotus, WordStar, WordPerfect) without paying that kind of ecosystem cost, because it did not exist yet.
Today too much infrastructure and software relies on the systems from one particular company to change easily, even if the competition were able to provide a better piece of software in one area.
Good thing it's funded by generous investors or groups who are okay with losing money on every sale (they'll make it up in volume), and never stop funding, and never raise prices, insert ads or enshittify.
That's going full speed ahead though. Every major cloud provider has an AI offering, and there are now multiple AI-centric cloud providers. There is a lot of money and speculation. Now Nvidia has their own cloud offering that "democratize access to world-class AI infrastructure. Sovereign AI initiatives require a new standard for transparency and performance".
I did code before Google, and I was fine. Yes, it's really convenient, and LLM would be even more convenient if I could trust it just a little bit more, but it's quite possible to do some effective software development without Google.
There are many alternatives though. It is not like Google has a search monopoly or office product monopoly, or e-mail provider monopoly. It is quite possible to cut out a lot of Google from one's life, and not even complicated to do that.
>To those of us who like Free Software because of the freedom it gives us, this is a severe regression.
It's fair to be worried about depending on LLMs. But I find the dependence on things like AWS or Azure more problematic, if we are talking about centralized and proprietary systems.
Well, I'd think of it like being car-dependent. Sure, plenty of suburbanites know how to walk, they still have feet, but they live somewhere that's designed to only be practically traversable by car. While you've lived that lifestyle, you may have gained weight and lost muscle mass, or developed an intolerance for discomfort to a point where it poses real problems. If you never got a car, or let yourself adapt to life without one, you have to work backwards from that constraint. Likewise with the built environment around us; the cities many people under the age of 40 consider to be "good" are the ones that didn't demolish themselves in the name of highways and automobiles, in which a car only rarely presents what we'd think of as useful technology.
There are all kinds of trades that the car person and the non-car person makes for better or worse depending on the circumstance. The non-car person may miss out on a hobby, or not know why road trips are neat, but they don't have the massive physical and financial liabilities that come with them. The car person meanwhile—in addition to the aforementioned issues—might forget how to grocery shop in smaller quantities, or engage with people out in the world because they just go from point A to B in their private vessel, but they may theoretically engage in more distant varied activities that the non-car person would have to plan for further in advance.
Taking the analogy a step further, each party gradually sets different standards for themselves that push the two archetypes into diametrically opposed positions. The non-car owner's life doesn't just not depend on cars, but is often actively made worse by their presence. For the car person, the presence of people, especially those who don't use a car, gradually becomes over-stimulating; cyclists feel like an imposition, people walking around could attack at any moment, even other cars become the enemy. I once knew someone who'd spent his whole life commuting by car, and when he took a new job downtown, had to confront the reality that not only had he never taken the train, he'd become afraid of taking it.
In this sense, the rise of LLMs does remind me of the rise of frontend frameworks, bootcamps that started with React or React Native, high-level languages, and even things like having great internet; the only people who ask what happens in a less ideal case are the ones who've either dealt with those constraints first-hand, or have tried to simulate them. If you've never been to the countryside, or a forest, or a hotel, you might never consider how your product responds in a poor-connectivity environment, and these are the people who wind up getting lost on basic hiking trails, having assumed that their online map would produce relevant information and always be there.
Edit: To clarify, in the analogy, it's clear that cars are not intrinsically bad tools and are worthwhile inventions, but had excitement for them been tempered during their rise in commodification and popularity, the feedback loops that ended up all but forcing people to use them in certain regions could have been broken more easily.
You can run LLMs locally pretty easily, especially if you have a Mac (the unified memory architecture of Macs is really good at this). It's a niche thing but caring about Free Software is niche.
You think an LLM provider has a bigger moat than an IDE (say, pre-VS Code, for a better parallel)? MSDN and JetBrains licenses are far more expensive than Cursor or Windsurf.
Sure, but that is not the point of the article. LLMs are useful. The fact that you are dependent on someone else is a different problem, like being dependent on Microsoft for your office suite.
IMO Github doesn't matter for FOSS because you have a lot of local clones, it won't disappear forever if Github goes down or deletes the repo there. Self-hosted alts are not 100% up either. And I actually find collaboration functions / easy PR contribution on Github highly beneficial. At the same time I hate the friction of all those private Gitlabs, Giteas or, God forbid, gitweb.
Self-hosting has always had a lot of drawbacks compared with commercial solutions. I bet my self-hosted file server has worse reliability than Google Drive, and my self-hosted git server handles fewer concurrent users than GitHub.
That's something you must accept when self-hosting.
So when you self-host an LLM, you must either accept a drop in output quality or spend a small fortune on hardware.
Wake up, you’re already dependent on everything, unless you stick exclusively to Python std and no outside batteries.
Maven central is gone and you have no proxy setup or your local cache is busted? Poof, you’re fucking gone, all your Springs, Daggers, Quarkuses and every third party crap that makes up your program is gone. Same applies to bazillion JS, Rust libraries.
If PyPI goes out and I cannot use NumPy, I can still roll out my own implementation of a linear algebra library, because I've got the required knowledge, and I've got it because I had to learn it instead of relying on LLMs.
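A minimal sketch of what "roll out my own" means in practice: plain-Python matrix multiplication, no NumPy, no PyPI. Slow, but it keeps you moving.

```python
def matmul(a, b):
    """Multiply two matrices given as lists of lists."""
    rows, inner, cols = len(a), len(b), len(b[0])
    assert all(len(row) == inner for row in a), "inner dimensions must match"
    return [
        [sum(a[i][k] * b[k][j] for k in range(inner)) for j in range(cols)]
        for i in range(rows)
    ]

# Example: [[1,2],[3,4]] @ [[5,6],[7,8]] == [[19,22],[43,50]]
print(matmul([[1, 2], [3, 4]], [[5, 6], [7, 8]]))
```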
Panamax works great for mirroring all of crates.io in 300-400GB, which is big but easily small enough for enthusiasts. I've got it on an external USB drive myself, and it's saved my bacon a few times.
We're not yet at that same point for performance of local LLM models afaict, though I do enjoy messing around with them.
One thing that I find truly amazing is just the simple fact that you can now be fuzzy with the input you give a computer and get something meaningful in return. Like, as someone who grew up learning to code in the 90s, it always seemed like science fiction that we'd get to a point where you could give a computer some vague human-level instructions and have it more or less do what you want.
> On two occasions I have been asked, 'Pray, Mr. Babbage, if you put into the machine wrong figures, will the right answers come out?' I am not able rightly to apprehend the kind of confusion of ideas that could provoke such a question.
This has been an obviously absurd question for two centuries now. Turns out the people asking that question were just visionaries ahead of their time.
It is kind of impressive how I'll ask for some code in the dumbest, vaguest, sometimes even wrong way, but so long as I have the proper context built up, I can get something pretty close to what I actually wanted. Though I still have problems where I can ask as precisely as possible and get things not even close to what I'm looking for.
> This has been an obviously absurd question for two centuries now. Turns out the people asking that question were just visionaries ahead of their time.
This is not the point of that Babbage quote, and no, LLMs have not solved it, because it cannot be solved: "garbage in, garbage out" is a fundamental observation about the limits of logic itself, having more to do with the laws of thermodynamics than with programming. The output of a logical process cannot be more accurate than the inputs to that process; you cannot conjure information out of the ether. The LLM isn't the logical process in this analogy, it's one of the inputs.
We wanted to check the clock at the wrong time but read the correct time. Since a broken clock is right twice a day, we broke the clock, which solves our problem some of the time!
It's very impressive that I can type misheard song lyrics into Google, and yet still have the right song pop up.
But, having taken a chance to look at the raw queries people type into apps, I'm afraid neither machine nor human is going to make sense of a lot of it.
Well, you can enter 4-5 relatively vague keywords into Google and the first or second Stack Overflow link will probably provide plenty of relevant code. Given that, it's much less impressive, since >95% of the problems and queries just keep repeating.
Sure, you can now be fuzzy with the input you give to computers, but in return the computer will ALSO be fuzzy with the answer it gives back. That's the drawback of modern AI.
Today I had a dentist appointment and the dentist suggested I switch toothpaste lines to see if something else works for my sensitivity better.
I am predisposed to canker sores, and if I use a toothpaste with SLS in it I'll get them. But a lot of the SLS-free toothpastes are new-age hippy stuff and are also fluoride free.
I went to chatgpt and asked it to suggest a toothpaste that was both SLS free and had fluoride. Pretty simple ask right?
It came back with two suggestions. Its top suggestion had SLS; its backup suggestion lacked fluoride.
Yes, it is mind-blowing, the world we live in. And executives want to turn our code bases over to these tools.
I find it sad how normalized it's become. Yes, the technology is imperfect in very meaningful ways. Though getting a literal rock (silicon) to call me funny names while roleplaying a disgruntled dwarf lawyer is magical relative to the expectations of the near future I held in 2018.
I remember the first time I played with GPT and thought “oh, this is fully different from the chatbots I played with growing up, this isn’t like anything else I’ve seen” (though I suppose it is implemented much like predictive text, but the difference in experience is that predictive text is usually wrong about what I’m about to say so it feels silly by comparison)
Been vibe coding for the past couple of months on a large project. My mind is truly blown. Every day it's just shocking. And it's so prolific. Half a million lines of code in a couple of months by one dev. Seriously.
Note that it's not going to solve everything. It's still not very precise in its output. Definitely lots of errors and bad design at the top end. But it's a LOT better than without vibe coding.
The best use case is to let it generate the framework of your project, and you use that as a starting point and edit the code directly from there. Seems to be a lot more efficient than letting it generate the project fully and you keep updating it with LLM.
You can be fuzzier than a soft fluff of cotton wool. I’ve had incredible success trying to find the name of an old TV show or specific episode using AIs. The hit rate is surprisingly good even when using the vaguest inputs.
“You know, that show in the 80s or 90s… maybe 2000s with the people that… did things and maybe didn’t do things.”
“You might be thinking of episode 11 of season 4 of such-and-such show, where a key plot element was both doing and not doing things on the penalty of death.”
See I try that sort of thing, like asking Gemini about a science fiction book I read in 5th grade that (IIRC) involved people living underground near/under a volcano, and food in pill form, and it immediately hallucinates a non-existent book by John Christopher named "The City Under the Volcano"
I was a big fan of Star Trek: The Next Generation as a kid and one of my favorite things in the whole world was thinking about the Enterprise's computer and Data, each one's strengths and limitations, and whether there was really any fundamental difference between the two besides the fact that Data had a body he could walk around in.
The Enterprise computer was (usually) portrayed as fairly close to what we have now with today's "AI": it could synthesize, analyze, and summarize the entirety of Federation knowledge and perform actions on behalf of the user. This is what we are using LLMs for now. In general, the shipboard computer didn't hallucinate except during most of the numerous holodeck episodes. It could rewrite portions of its own code when the plot demanded it.
Data had, in theory, a personality. But that personality was basically, "acting like a pedantic robot." We are told he is able to grow intellectually and acquire skills, but with perfect memory and fine motor control, he can already basically "do" any human endeavor with a few milliseconds of research. Although he is pretty bad at things involving human emotion (art, comedy, love) and has to settle for sampling, distilling, and imitating thousands to millions of examples of human creation. (Not unlike the "AI" art of today.)
Side notes about some of the dodgy writing:
A few early episodes of Star Trek: The Next Generation treated the Enterprise D computer as a semi-omniscient character, and it always bugged me, because it seemed to "know" things that it shouldn't and draw conclusions that it really shouldn't have been able to. "Hey computer, we're all about to die, solve the plot for us so we make it to next week's episode!" Thankfully someone got the memo and that only happened a few times. Although I always enjoyed episodes that centered around the ship or crew itself somehow instead of just another run-in with aliens.
The writers were always adamant that Data had no emotions (when not fitted with the emotion chip) but we heard him say things _all the time_ that were rooted in emotion, they were just not particularly strong emotions. And he claimed to not grasp humor, but quite often made faces reflecting the mood of the room or indicating he understood jokes made by other crew members.
ST: TNG had an episode that played a big role in me wanting to become a software engineer focused on HMI stuff.
It's the relatively crummy season 4 episode Identity Crisis, in which the Enterprise arrives at a planet to check up on an away team containing a college friend of Geordi's, only to find the place deserted. All they have to go on is a bodycam video from one of the away team members.
The centerpiece of the episode is an extended sequence of Geordi working in close collaboration with the Enterprise computer to analyze the footage and figure out what happened, which takes him from a touchscreen-and-keyboard workstation (where he interacts by voice, touch and typing) to the holodeck, where the interaction continues seamlessly. Eventually he and the computer figure out there's a seemingly invisible object casting a shadow in the reconstructed 3D scene and back-project a humanoid form and they figure out everyone's still around, just diseased and ... invisible.
I immediately loved that entire sequence as a child, it was so engrossingly geeky. I kept thinking about how the mixed-mode interaction would work, how to package and take all that state between different workstations and rooms, have it all go from 2D to 3D, etc. Great stuff.
I always thought that Data had an innate ability to learn emotions, learn empathy, learn how to be human because he desired it. And that the emotion chip was actually a crutch, and Data simply believed what he had been told: he could not have emotions because he was an android. But, as you say, he clearly feels close to Geordi and cares about him. He is afraid when Spot is missing. He paints and creates music and art that reflects his experience. Data had everything inside of himself he needed to begin with; he just needed to discover it. Data was an example to the rest of us. At least in TNG. In the movies he was a crazy person. But so was everyone else.
> The writers were always adamant that Data had no emotions... but quite often made faces reflecting the mood of the room or indicating he understood jokes made by other crew members.
This doesn't seem too different from how our current AI chatbots don't actually understand humor or have emotions, but can still explain a joke to you or generate text with a humorous tone if you ask them to based on samples, right?
> "Hey computer, we're all about to die, solve the plot for us so we make it to next week's episode!"
I'm curious, do you recall a specific episode or two that reflect what you feel boiled down to this?
It's a radical change in human/computer interface. Now, for many applications, it is much better to present the user with a simple chat window and allow them to type natural language into it, rather than ask them to learn a complex UI. I want to be able to say "Delete all the screenshots on my Desktop", instead of going into a terminal and typing "rm ~/Desktop/*.png".
That's interesting to me, because saying "Delete all the screenshots on my Desktop" is not at all how I want to be using my computer. When I'm getting breakfast, I don't instruct the banana to "peel yourself and leap into my mouth," then flop open my jaw like a guppy. I just grab it and eat it. I don't want to tell my computer to delete all the screenshots (except for this or that particular one). I want to pull one aside, sweep my mouse over the others, and tap "delete" to vanish them.
There's a "speaking and interpreting instructions" vibe to your answer which is at odds with my desire for an interface that feels like an extension of my body. For the most part, I don't want English to be an intermediary between my intent and the computer. I want to do, not tell.
I personally can't see this example working out. I'll always want to get some kind of confirmation of which files will be deleted, and at that point, just typing the command out is much easier than reading.
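A minimal sketch of what that confirmation-first version could look like (the script name, glob pattern, and prompt flow are illustrative assumptions, not anyone's actual tool): resolve the fuzzy request into an explicit file list first, then require a yes/no before anything is removed.

    # delete_screenshots.py - hypothetical sketch: turn "delete all the screenshots
    # on my Desktop" into an explicit, confirmable file list before removing anything.
    from pathlib import Path

    def delete_screenshots(desktop: Path = Path.home() / "Desktop") -> None:
        # Assumes macOS-style names like "Screenshot 2025-01-01 at 09.00.00.png";
        # adjust the glob for your platform.
        candidates = sorted(desktop.glob("Screenshot*.png"))
        if not candidates:
            print("Nothing matched.")
            return
        for path in candidates:
            print(path.name)
        answer = input(f"Delete these {len(candidates)} files? [y/N] ")
        if answer.strip().lower() == "y":
            for path in candidates:
                path.unlink()
            print("Deleted.")

    if __name__ == "__main__":
        delete_screenshots()

Whether the file list comes from a natural-language request or from typing the glob yourself, the review step is the same; the argument above is about which side of that step you'd rather spend your effort on.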
> I want to be able to say "Delete all the screenshots on my Desktop", instead of going into a terminal and typing "rm ~/Desktop/*.png".
Both are valid cases, but one cannot replace the other—just like elevators and stairs. The presence of an elevator doesn't eliminate the need for stairs.
This is why, even if LLMs top out right now, there will still be a radical shift in how we interact with and use software going forward. There are still at least 5 years of implementation work ahead even if nothing advances at all.
No one is ever going to want to touch a settings menu again.
For me this moment came when Google Calendar first let you enter fuzzy text to get calendar events added; this was around 2011, I think. In any case, for the end user this can be made to happen even when the computer cannot actually handle fuzzy inputs (which is, of course, how an LLM works).
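As a rough illustration of that pre-LLM kind of fuzziness (a sketch only; it assumes the third-party python-dateutil package and made-up example text): a rule-based parser just extracts the date/time tokens it recognizes and ignores everything else, which is enough to make free-text calendar entry feel "smart".

    # Hypothetical sketch using python-dateutil's fuzzy parsing: pull a datetime
    # out of free text and keep the leftover words as a rough event title.
    from dateutil import parser

    text = "Lunch with Sam Friday 1pm"  # made-up example entry
    when, leftover = parser.parse(text, fuzzy_with_tokens=True)
    print(when)      # the parsed datetime (weekday/time resolved against today)
    print(leftover)  # the tokens the parser ignored (roughly the event title words)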
The big change with LLMs seems to be that everyone now has an opinion on what programming/AI is and can do. I remember people behaving like that around stocks not that long ago…
> The big change with LLMs seems to be that everyone now has an opinion on what programming/AI is and can do
True, but I think this is just the zeitgeist. People today want to share their dumb opinions about any complex subject after watching a 30-second reel.
Though I haven’t embraced LLM codegen (except for non-functional filler/test data), the fuzziness is why I like to use them as talking documentation. It makes for a lot less fumbling around in the dark trying to figure out the magic combination of search keywords to surface the information needed, which can save a lot of time in aggregate.
Honestly, LLMs are a great canary for whether your documentation / language / whatever is any good.
I wish I had kept the example around, but I ran into an issue where the LLM wasn't giving a great answer. I looked at the documentation and, yeah, it made no sense. And all the forum threads about it were people throwing out random guesses about how it should actually work.
If you're a company that makes something even moderately popular and LLMs are producing really bad answers, one of two things is happening.
1. You're a consulting company that makes its money by selling confused users solutions to your crappy product.
2. Your documentation is confusing crap.
I've just got good at reading code, because that's the one constant you can rely on (unless you're using some licensed library). So whenever the reference is not enough, I just jump straight to the code (one of my latest examples was finding out that opendoas (a sudo replacement) hard-codes the persist option, for not asking for a password again, to 5 minutes).
In my opinion, most of the problems we see now with LLMs come from being fuzzy ... I'm used to getting very good code from Claude or Gemini (copy and paste without any changes that just works), but I have to be very specific; sometimes it takes longer to write the prompt than writing the code itself.
If I'm fuzzy, the output quality is usually low and I need several iterations before getting an acceptable result.
At some point in the future, there will be some kind of formalization of how to ask software-engineering questions of LLMs ... and we will get another programming language to rule them all :D
But when I'm doing my job as a software developer, I don't want to be fuzzy. I want to be exact at telling the computer what to do, and for that, the most efficient way is still a programming language, not English. The only place where LLMs are an improvement is voice assistants. But voice assistants themselves are rather niche.
>simple fact that you can now be fuzzy with the input you give a computer, and get something meaningful in return
I got into this profession precisely because I wanted to give precise instructions to a machine and get exactly what I want. Worth reading Dijkstra, who anticipated this, and the foolishness of it, half a century ago
"Instead of regarding the obligation to use formal symbols as a burden, we should regard the convenience of using them as a privilege: thanks to them, school children can learn to do what in earlier days only genius could achieve. (This was evidently not understood by the author that wrote —in 1977— in the preface of a technical report that "even the standard symbols used for logical connectives have been avoided for the sake of clarity". The occurrence of that sentence suggests that the author's misunderstanding is not confined to him alone.) When all is said and told, the "naturalness" with which we use our native tongues boils down to the ease with which we can use them for making statements the nonsense of which is not obvious.[...]
It may be illuminating to try to imagine what would have happened if, right from the start our native tongue would have been the only vehicle for the input into and the output from our information processing equipment. My considered guess is that history would, in a sense, have repeated itself, and that computer science would consist mainly of the indeed black art how to bootstrap from there to a sufficiently well-defined formal system. We would need all the intellect in the world to get the interface narrow enough to be usable"
Welcome to prompt engineering and vibe coding in 2025, where you have to argue with your computer to produce a formal language that we invented in the first place so as not to have to argue in imprecise language.
right: we don't use programming languages instead of natural language simply to make it hard. For the same reason, we use a restricted dialect of natural language when writing math proofs -- using constrained languages reduces ambiguity and provides guardrails for understanding. It gives us some hope of understanding the behavior of systems and having confidence in their outputs
There are levels of this though -- there are few instances where you actually need formal correctness. For most software, the stakes just aren't that high, all you need is predictable behavior in the "happy path", and to be within some forgiving neighborhood of "correct".
That said, those championing AI have done a very poor job at communicating the value of constrained languages, instead preferring to parrot this (decades and decades and decades old) dream of "specify systems in natural language"
It sounds like you think I don't find value in using machines in their precise way, but that's not a correct assumption. I love code! I love the algorithms and data structures of data science. I also love driving 5-speed transmissions and shooting on analog film – but it isn't always what's needed in a particular context or for a particular problem. There are lots of areas where a 'good enough solution done quickly' is way more valuable than a 100% correct and predictable solution.
That’s interesting. I got into computing because unlike school where wrong answers gave you indelible red ink and teachers had only finite time for questions, computers were infinitely patient and forgiving. I could experiment, be wrong, and fix things. Yes I appreciated that I could calculate precise answers but it was much more about the process of getting to those answers in an environment that encouraged experimentation. Years later I get huge value from LLMs, where I can ask exceedingly dumb questions to an indefatigable if slightly scatterbrained teacher. If I were smart enough, like Dijkstra, to be right first time about everything, I’d probably find them less useful, but sadly I need cajoling along the way.
"I got into this profession precisely because I wanted to give precise instructions to a machine and get exactly what I want."
So you didn't get into this profession to be a lead, then, eh?
Because essentially, that's what Thomas in the article is describing (even if he doesn't realize it). He is a mini-lead with a team of a few junior and lower-mid-level engineers - all represented by the LLMs and agents he's built.
Well said, these things are actually in a tradeoff with each other. I feel like a lot of people somehow imagine that you could have the best of both, which is incoherent short of mind-reading + already having clear ideas in the first place.
But thankfully we do have feedback/interactiveness to get around the downsides.
When you have a precise input, why give it to an LLM? When I have to do arithmetic, I use a calculator. I don't ask my coworker, who is generally pretty good at arithmetic, although I'd get the right answer 98% of the time. Instead, I use my coworker for questions that are less completely specified.
Also, if it's an important piece of arithmetic, and I'm in a position where I need to ask my coworker rather than do it myself, I'd expect my coworker (and my AI) to grab (spawn) a calculator, too.
It will, or it might? Because if every time you use an LLM it misinterprets your input as something easier to solve, you might want to brush up on the fundamentals of the tool.
(I see some people are quite upset with the idea of having to mean what you say, but that's something that serves you well when interacting with people, LLMs, and even when programming computers.)
>On two occasions, I have been asked [by members of Parliament], 'Pray, Mr. Babbage, if you put into the machine wrong figures, will the right answers come out?' I am not able to rightly apprehend the kind of confusion of ideas that could provoke such a question.
- Charles Babbage
If anything we now need to unlearn the rigidity - being too formal can make the AI overly focused on certain aspects, and is in general poor UX. You can always tell legacy man-made code because it is extremely inflexible and requires the user to know terminology and usage implicitly lest it break, hard.
For once, as developers we are actually using computers how normal people always wished they worked and were turned away frustratedly. We now need to blend our precise formal approach with these capabilities to make it all actually work the way it always should have.
I'm not a skeptic, but I keep LLMs on a short leash.
This is a thoughtful article. Thanks, tptacek.
My LLM use is:
1 - tedious stuff; web pages interacting with domain back end.
2 - domain discovery.
In a recent adventure, I used Claude 4 to tease out parameters in a large graph schema. This is a combination of tedium and domain discovery (it's not my graph and I'm not a domain expert). In the first day, Claude uncovered attributes and relations no other LLM or Google search uncovered. And it worked!! The next day, I allowed it to continue. After a bit, results didn't pass the sniff test.
I checked into details of Claude's thinking: it decided to start making up schema attributes and inventing fallback queries on error with more made up attributes. It was "conscious" of its decision to do so.
By the time I caught this, Claude had polluted quite a bit of code.
Sure, plenty of well-placed git commits helped in rolling back code... but it's not quite that simple... over the many git commits were sprinkled plenty of learnings I don't want to toss. It took another two days of carefully going through the code to pull out the good stuff and then roll things back. So now I'm at day five of this adventure, with cleaned-up code and notes on what we learned.
I suspect continual improvements on tooling will help. Until then, it's a short leash.
Couldn't agree more. The first time I used Claude Code was for something very much like this. We had a PDF rendering issue with Unicode characters in one of our libraries. We ultimately needed to implement a sort of bespoke font fallback system.
With the help of the agent, I was able to iterate through several potential approaches and find the gaps and limitations within the space of an afternoon. By the time we got to the end of that process the LLM wrote up a nice doc of notes on the experiments, and *I* knew what I wanted to do next. Knowing that, I was able to give a more detailed and specific prompt to Claude which then scaffolded out a solution. I spent probably another day tweaking, testing, and cleaning up.
Overall I think it's completely fair to say that Claude saved me a week of dev time on this particular task. The amount of reading and learning and iterating I'd have had to do to get the same result would have just taken 3-4 days of work. (not to mention the number of hours I might have wasted when I got stuck and scrolled HN for an hour or whatever).
So it still needed my discernment and guidance - but there's no question that I moved through the process much quicker than I would have unassisted.
That's worth the $8 in API credit ten times over and no amount of parroting the "stochastic parrot" phrase (see what I did there?) would change my mind.
> Domain discovery is so underrated. LLMs remove so much friction that makes everything so incredibly accessible.
And, unfortunately, they also remove the friction of not having access to the information in the first place. I've read a bunch of docs from people talking to Glean in order to explore a new topic; if it's a topic I'm actually very familiar with, then four out of five times it is somewhere between misleading and catastrophically wrong. Any internal terminology that doesn't match the common usage outside of our organization poisons the whole session, and it will make up things to join the meanings together, and the prompter is none the wiser.
I trust AI only as a gap filler in domains that I'm already an expert in or where there's little internal context, anything else is intellectual suicide.
One question is whether, even after all that backpedaling, you feel you could've achieved the same or a similar result in those five days. My findings have been that it's a net plus for productivity, but I'm a bit less sure whether I prefer the way work feels when a lot of it is just going back and cleaning up after the growth. (Of course, that sounds like a familiar statement for a lot of engineers before LLMs, too.)
Yeah, I'm impressed with its ability to do stuff, but not quite with its results. We have been working on more AI-assistance adoption, so I asked it to do some decently complex things with JSON/YAML schema definitions and validations (beyond the typical JSON schema, we wanted things like conditional validation, etc.)... It wrote a LOT of code and took a long time, and kept telling me it would work, and it didn't. I finally stepped in and eliminated roughly 75% of the code in about 10 minutes and got it working. It's great at tedious stuff, but outside of that, I'm skeptical.
I used it about 15 minutes ago, to help me diagnose a UI issue I was having. It gave me an answer that I would have figured out, in about 30 minutes, in about 30 seconds. My coding style (large files, with multiple classes, well-documented) works well for AI. I can literally dump the entire file into the prompt, and it can scan it in milliseconds.
I also use it to help me learn about new stuff, and the "proper" way to do things.
Basically, what I used to use StackOverflow for, but without the sneering, and much faster turnaround. I'm not afraid to ask "stupid" questions - that is critical.
Like SO, I have to take what it gives me, with a grain of salt. It's usually too verbose, and doesn't always match my style, so I end up doing a lot of refactoring. It can also give rather "naive" answers, that I can refine. The important thing, is that I usually get something that works, so I can walk it back, and figure out a better way.
I also won't add code to my project, that I don't understand, and the refactoring helps me, there.
I have found the best help comes from ChatGPT. I heard that Claude was supposed to be better, but I haven't seen that.
I don't use agents. I've not really ever found automated pipelines to be useful, in my case, and that's sort of what agents would do for me. I may change my mind on that, as I learn more.
What I like about chatbots vs SO is the ability to keep a running conversation instead of juggling 3+ tabs, and to tune the specificity toward my problem.
I've also noticed that if I look up my same question on SO I often find the source code the LLM copied. My fear is that if chatbots kill SO, where will the LLMs' copied code come from in the future?
I use Perplexity as my daily driver and it seems to be pretty good at piecing together the path forward from documentation as it has that built-in web search when you ask a question. Hopefully LLMs go more in that direction and less in the SO copy-paste direction, sidestepping the ouroboros issue.
>>I'm not afraid to ask "stupid" questions -That is critical.
AI won't judge and shame you in front of the whole world for asking stupid questions, or for not RTFM'ing well enough, like Stack Overflow users do. Nor will it tell you your questions are irrelevant.
I’ve always worked that way. In school (or in seminars), I ask questions that may have the whole room in stitches, but I always learn the lesson. The worst teacher I ever had, was a genius calculus professor, who would harangue you in front of the class, for asking a “stupid” question. That’s the only class I ever took an Incomplete.
That’s the one thing about SO that I always found infuriating. It seems their favorite shade is implying that you’re “lazy,” and shaming you for not already having the answer. If anyone has ever looked at my code, “lazy” is probably not a word that springs to mind.
In most cases, I could definitely get the answer, myself, but it would take a while, and getting pointers might save me hours. I just need a hint, so that I can work out an answer.
With SO, I usually just bit my tongue, and accepted the slap, as well as the answer.
An LLM can actually look at a large block of code, and determine some boneheaded typo I made. That’s exactly what it did, yesterday. I just dumped my entire file into it, and said “I am bereft of clue. Do you have any idea why the tab items aren’t enabling properly?”. It then said “Yes, it’s because you didn’t propagate the tag from the wrapper into the custom view, here.” It not only pointed out the source error, but also explained how it resulted in the observed symptoms.
In a few seconds, it not only analyzed, but understood an entire 500-line view controller source file, and saw my mistake, which was just failing to do one extra step in an initializer.
There’s absolutely no way that I could have asked that question on SO. It would have been closed down, immediately. Instead, I had the answer in ten seconds.
I do think that LLMs are likely to “train” us to not “think things through,” but they said the same thing about using calculators. Calculators just freed us up to think about more important stuff. I am not so good at arithmetic, these days, but I no longer need to be. It’s like Machine Code. I learned it, but don’t miss it.
Agree on the verbosity and occasional naivety. But the fact that it gives working starting points is what really moves the needle. It gets me unstuck faster, and I still get to do the creative, architectural stuff
I’ll ask it how to accomplish some task that I’ve not done, before, and it will give me a working solution. It won’t necessarily be a good solution, but it will work.
I can then figure out how it got there, and maybe determine a more effective/efficient manner.
The reaction to this article is interesting. I have found AI to be useful in software contexts that most people never exercise or expect based on their intuitions of what an LLM can do.
For me, a highly productive but boring use of LLMs for code is that they excel at providing midwit “best practice” solutions to common problems. They are better documentation than the documentation and can do a lot of leg work e.g. Linux syscall implementation details. My application domains tend to require more sophisticated solutions than an LLM can provide but they still save a lot of rote effort. A lot of software development exists almost entirely in the midwit zone.
Much more interesting, they are decent at reducing concepts in literature to code practice for which there are no code examples. Google and StackOverflow turn up nothing. For example, I’ve found them useful for generating specialized implementations of non-Euclidean computational geometry algorithms that don’t really exist in the wild that I’ve ever seen. This is a big win, it literally turns months of effort into hours of effort.
On the other hand, I do a lot of work with algorithms that don’t exist in literature, never mind public code, with extremely performance-engineered implementations. There is an important take away from this too: LLMs are hilariously bad at helping with this but so are human software developers if required to do the same thing with no context.
Knowledge for which there is little or no training data is currently a formidable moat, both for LLMs and humans.
I agree on the distillation step. They're very good at identifying what problem you're trying to solve, especially if you don't have much experience in the domain. I've found ChatGPT great for figuring out problems involving combinatorics. And then you can be canny and ask the LLM to write the next prompt to get some boilerplate code. I like to do this with ChatGPT 4.5 (heavily rate limited) targeting 4o.
Getting good code is still a pain with chat, but you can go a fair way by reducing the context as much as possible and only asking for core functions (no point wasting hundreds of tokens on an arg parser). I'd like to play with Cursor or Zed a bit more.
Debugging sometimes works. 4.5 often figures out errors in one, 4o often struggles and spits out suggestions like it's a customer support agent (it also falls into repetition fairly often if you tell it "no, so what else could it be?"). You really have to work to convince it up front that you're not an idiot.
What? This is exactly what I would expect an LLM to be good at. Stack Overflow gives common solutions to common problems. LLMs give niche solutions to problems that aren't common enough for extensive discussion, but are still already solved if you find the right repository or study the documentation thoroughly.
Neither helps with the hardest and most time consuming problems faced by software engineers: solving problems that have never been (publicly) solved before, making hard decisions based on intuition about what will be necessary in the future, understanding the interactions between broad webs of dependency in large codebases, and working with proprietary domain knowledge.
It's fascinating how over the past year we have had almost daily posts like this one, yet from the outside everything looks exactly the same, isn't that very weird?
Why haven't we seen an explosion of new start-ups, products or features? Why do we still see hundreds of bug tickets on every issue tracking page? Have you noticed anything different on any changelog?
I invite tptacek, or any other chatbot enthusiast around, to publish project metrics and show some actual numbers.
"Why haven't we seen an explosion of new start-ups, products or features?"
You're posting this question on a forum hosted by YC. Here's a story from March 2024: "YC’s latest W24 batch includes 240 companies. A significant portion of the companies have some AI component, with 63% tagged as “Artificial Intelligence” — a notable increase from 51% in the preceding S23 batch and 29% before that.". https://jamesin.substack.com/p/analysis-of-ycs-latest-w24-ba...
I've not seen the same analysis for more recent batches.
Most likely there’s a slight productivity increase.
The enthusiasts have a cognitive dissonance because they are pretty sure this is huge and we’re living in the future, so they go through various denial strategies when the execs ask them where the money is.
In this case it’s blame. These darned skeptics are ruining it for everyone.
This is an important question. The skepticism tracks with my personal experience - I feel 10-20% more productive but certainly not 5x when measured over a long period of time (say, the last 6 months or more)
I’m nonetheless willing to be patient and see how it plays out. If I’m skeptical about some grandiose claims I must also be equally skeptical and accepting about the possibility of large scale effects happening but not being apparent to me yet.
There were many similar transformations in recent decades. I remember first Windows with true Graphics User Interface was big WOW: productivity boost, you can have all those windows and programs running at the same time! Compare it with DOS where you normally had just one active user-facing process.
> Why haven't we seen an explosion of new start-ups, products or features? Why do we still see hundreds of bug tickets on every issue tracking page? Have you noticed anything different on any changelog?
In my personal experience (LLM and code suggestion only) it's because I use LLMs to code unimportant stuff. Actually thinking what I want to do with the business code is exhausting and I'd rather play a little with a fun project. Also, the unit tests that LLMs can now write (and which were too expensive to write myself) were never important to begin with.
Simply put, if we’re living during such a major technological revolution, why does using software suck in such disastrous ways that were unthinkable even ten years ago?
Your argument relies on the idea of an "actual product". What is happening—and I’m seeing it firsthand both in my company’s codebase and in my personal projects—is that AI is contributing more and more to product development. If this trend continues, we may reach a point where 90% of a product is written by AI.
At that stage, the real value will lie in the remaining 10%—the part that requires human judgment, creativity, or architectural thinking. The rest will be seen as routine: simple instructions, redundant CRUD operations, boilerplate, and glue code.
If we focus only on the end result, humans will inevitably write less code overall. And writing less code means fewer programming jobs.
You said a bunch without saying much. It also doesn't track. If the majority of AI work is supposed to be done by agents capable of doing the entire process, including making PRs, then why isn't there an explosion of such PRs across a large number of open source projects?
Even more so, why am I not seeing these PRs on AI-related open source projects? If I need to target it even more directly, why am I not seeing hints of this being applied on code agent repositories?
Call me naive, but you'd think that these specifically want to demonstrate how well their product works. Making an effort to distinguish PRs that are largely the work of their own agents. Yet, I am not seeing that.
I have no doubt that people find use in some aspects of these tools. Though I personally subscribe more to the interactive rubber-ducky usage of them. But 90%, from where I am standing, seems like a very, very long way off.
I don't know - I agree we haven't seen changes to our built environment, but as for an "explosion of new start-ups, products", we sort of are seeing that?
I see new AI assisted products everyday, and a lot of them have real usage. Beyond the code-assistants/gen companies which are very real examples, here's an anecdote.
I was thinking of writing a new story, and found http://sudowrite.com/ via an ad, an AI assistant for helping you write; it's already used by a ton of journalists and serious writers, and I am trying it out.
Then I wanted to plan a trip - tried Google but saw nothing useful, then asked ChatGPT and now have a clear plan.
> I was thinking of writing a new story, and found http://sudowrite.com/ via an ad, an AI assistant for helping you write; it's already used by a ton of journalists and serious writers, and I am trying it out.
I am not seeing anything indicating it is actually used by a ton of journalists and serious writers. I highly doubt it is; the FAQ is also paper-thin as far as substance goes. I highly doubt they are training/hosting their own models, yet I see only vague third-party references in their privacy policy. Their pricing is less than transparent, given that they don't really explain how their "credits" translate to actual usage. They blatantly advertise this to students, which is problematic in itself.
This ignores all the other issues around so heavily depending on LLMs for your writing. This is an interesting quirk for starters: https://www.theguardian.com/technology/2024/apr/16/techscape... . But there are many more issues about relying so heavily on LLM tools for writing.
So this example, to me, is actually exemplifying the issue of overselling capabilities while handwaving away any potential issues that is so prevalent in the AI space.
So as an example of what this could look like that would be convincing to me. I started out pretty firmly believing that Rust was a fad.
Then Mozilla and Google did things with it that I did not think were possible for them to do. Not "they wrote a bunch of code with it", stuff like "they eliminated an entire class of bugs from a section of their codebase."
Then I watched a bunch of essentially hobby developers write kernel drivers for a brand new architecture, and watched them turn brand new MacBooks into one of the best-in-class ways to run Linux. I do not believe they could have done that with their resources at that speed using C or C++.
And at that point, you kind of begrudgingly say, "okay, I don't know if I like this, but fine, heck you, whatever. I guess it might genuinely redefine some parts of software development, you win."
So this is not impossible. You can convince devs like me that your tools are real and they work.
And frankly, there are a billion problems in modern computing that are high impact - stuff like Gnome accessibility, competitive browser engines, FOSS UX, collaboration tools. Entire demographics who have serious problems that could be solved by software if there was enough expertise and time and there were resources to solve them. Often, the issue at play is that there is no intersection between people who are very well acquainted with those communities and understand their needs, and people who have experience writing software.
In theory, LLMs help solve this. In theory. If you're a good programmer, and suddenly you have a tool that makes you 4x as productive as a developer: you could have a very serious impact on a lot of communities right now. I have not seen it happen. Not in the enterprise world, but also not in the FOSS world, not in communities with lower technical resources, not in the public sector. And again, I can be convinced by this, I have dismissed tools that I later switched opinions on because I saw the impact and I couldn't ignore the impact: Rust, NodeJS, Flatpak, etc, etc.
The problem is people have been telling me that Coding Assistants (and now Coding Agents) are one of those tools for multiple years now, and I'm still waiting to see the impact. I'm not waiting to see how many companies pick them up, I'm not waiting to see the job market. I'm waiting to see if this means that real stuff starts getting written at a higher quality significantly faster, and I don't see it.
I see a lot of individual devs showing me hobby projects, and a lot of AI startups, and... frankly, not much else.
One of our AI-enabled internal projects is moving ~135x faster than before. Of course you can't perfectly compare. New framework, better insights, updated libraries, etc.
If you end up finishing it in 6 months, are you going to revise that estimate, or celebrate the fact that you don't need to wait until 2092 to use the project?
My main concern is not even mentioned in this article and there are hardly any comments here addressing it: Privacy / allowing 3rd parties to read and potentially train on your proprietary source code.
I've used LLMs to crank out code for tedious things (like generating C-APIs and calling into poorly documented libraries) but I'm not letting them touch my code until I can run it 100% locally offline. Would love to use the agentic stuff but from what I've heard it's still too slow to run on a high end workstation with a single 4080.
Or have things got better lately, and crucially is there good VisualStudio integration for running local agents / LLMs?
The problem is, it's like ads. Even if it's a paid subscription, the incentive to hoover up your code, metadata about your code or everything they can get away with is just too strong. I think it will just happen at some point, unless you run them 100% locally or hosted somewhere but self managed.
Yes and even if they were squeaky clean and didn't do anything bad with your code, there's also the possibility that they've been hacked and code is getting siphoned off somewhere.
If you're big enough, you pay for on-prem frontier level tools. If not, you can do all right with a good GPU and local model, but it's not the same. And yeah there are plenty of plugins like Continue that support local or on-prem hosted models.
No I definitely don't use GitHub. Everything is entirely in-house.
But even if I did, there's a much more solid foundation of trust there, whereas these AI companies have been very shady with their 'better to ask for forgiveness, than permission' attitudes of late.
So would you be interested in a simple serverless service that lets you run LLMs and is focused on privacy and security first, instead of trying to run it on an expensive workstation with a single 4080?
I love LLMs, and I really like programming with Cursor, but I never managed to get the "agents with tons of stuff in their context" mode to work for me. I use Cursor like a glorified code completer, 4-5 lines at a time, because otherwise the LLM just makes too many mistakes that compound.
If you let it run in the "write my code for me" mode, and ask it to fix some mistake it made, it will always add more code, never remove any. In my experience, in the end the code just ends up so brittle that the LLM will soon get stuck at a point that it never manages to overcome some mistake, no matter how many times it tries.
Not a full solution, but one thing I've learned not to do is tell Cursor "you got that wrong, fix it like this". Instead, I go back to the previous prompt and click "Restore Checkpoint", edit the prompt and possibly the Cursor rules to steer it in the right direction.
When the model has the wrong solution in its context, it will use it when generating new code, and my feeling is that it doesn't handle the idea of "negative example" very well. Instead, delete the bad code and give it positive examples of the right approach.
Make sure you include lots of files of context (including any type definitions!). After a big change, before approving any code, ask it: "Is this the simplest and cleanest approach?" Usually it will make more revisions and actually produce clean code then. You can also say that in the original prompt, or specify the structure of the change you want it to make.
No. And I don't think they are doing anything magical. Performance drops sharply after 50k tokens. Your LLM does best when you have a short 2-5K context.
imo cursor's small context window on the $20 plan is what kills their agent mode. Try claude code or anything that lets you use larger context windows (I think cursor has a tier you can pay for now?). claude code regularly deletes and corrects code and LLMs are very much capable of that today.
I would love to know as well. I also have problems with the LLM coding agents when it comes to lesser-known languages like Julia. Has anybody solved this yet?
In the last two weeks I've started using Cursor in the 'you do everything, I'm just your boss' mode, to see how far I can push it. Right at this minute, I'm working on something that's become pretty big, but I'm constantly on the verge of just going back to writing code like normal LOL.
That said, I'm super impressed by how quickly I've been able to get this far with something pretty niche and complex in places.
Here's what I've learned. There are a million AI bros on YouTube who have the ultimate solution, but they all boil down to a few basic things.
Make rules: make them before you get started and continue updating them as you go.
Lots of tiny modules: Push things into small bite sized bits of code with lots of interface documentation. This feels a bit unnatural when the code is prototype quality.
Documentation is key: the YouTubers will often create detailed planning and specification documents in advance. I've done this and it's hit and miss; what I've found works is to explain what you want to build to an LLM and have it create extremely concise documentation, then a rough checklist for implementation, and then to evolve these documents in Cursor as I go.
This leads to a kind of plan --> document, implement --> document, run tests --> document workflow on each reasonable chunk of the design being the most smooth.
Don't let Cursor ever see inside big datasets or huge folder trees. In fact, keep Cursor in a few folders writing source code and nothing else. To do this, early on build tools that can go and get information or make safe edits for Cursor from those datasets without it attempting direct access (a rough sketch of this idea follows at the end of this comment).
The current project has tools for working with the primary datasets, a job manager, a process manager, a log manager, and a workflow manager, and these all have functions for querying. Cursor is instructed to use these. It naturally doesn't want to, but if you tell it, 7 times out of 10 it will :)
No mess: Cursor will like to make lots of random tests and processes along the way. Instruct it to use a /debug/tests/ folder and wipe it clean often. Force it to make 'production' code by having it be registered with the workflow manager and made allowable by the process manager. This lets it play around for a while and get something up and running with its weird little scripts, and then implement it for real elsewhere using the proper framework. The workflow manager needs to have documentation on how the script is used, and the process manager needs the source to be in a particular place with a particular set of standard interfaces.
you might say this is a lot of plumbing, but what isn't these days and it's not like I'm maintaining it - right ;)
Cursor is always about to go insane any minute, or can't remember what's going on. So most of it is about keeping things simple enough for it to focus on the bit it's working on and nothing else.
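To make the "tool wrapper" idea above concrete, here is a rough sketch (file names, fields, and the query shape are made-up assumptions for illustration, not the actual project): the agent is told to call a small, bounded query function instead of ever opening the big datasets directly.

    # query_tools.py - hypothetical wrapper the agent is instructed to use instead
    # of reading large CSVs under data/ directly.
    import csv
    from pathlib import Path

    DATA_DIR = Path("data")  # kept outside the folders the agent is allowed to edit

    def query_dataset(name: str, column: str, value: str, limit: int = 20) -> list[dict]:
        """Return at most `limit` rows from data/<name>.csv where `column` equals `value`."""
        rows: list[dict] = []
        with open(DATA_DIR / f"{name}.csv", newline="") as handle:
            for row in csv.DictReader(handle):
                if row.get(column) == value:
                    rows.append(row)
                    if len(rows) >= limit:
                        break
        return rows

    if __name__ == "__main__":
        # e.g. the agent asks for a bounded slice instead of slurping the whole file
        print(query_dataset("customers", "region", "EU"))

The point is less the code than the contract: the agent gets a documented, size-limited interface, so it has no reason to wander into the raw data.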
Regularly for work we have massive code bases written in our company style and built using our business processes. This kind of thing is absolutely not the place for Cursor. I can imagine a company set up from scratch to use Cursor, but our current products and practices would just make it impossibly time-consuming.
> Does an intern cost $20/month? Because that’s what Cursor.ai costs.
> Part of being a senior developer is making less-able coders productive, be they fleshly or algebraic. Using agents well is both a skill and an engineering project all its own, of prompts, indices, and (especially) tooling. LLMs only produce shitty code if you let them.
A junior developer often has negative value to a team, because they're sapping the time of more senior developers who have to help train them, review code, fix mistakes, etc. It can take a long while to break even.
The raw cost of Cursor's subscription is surely dwarfed by your own efforts, given that description. The actual calculus here should be the cost to corral Cursor against the value of the code it generated.
So what is the end game here, we don't have any juniors? That's what we're celebrating? This whole thing makes no damn sense: we're just becoming more efficient at producing bad code at scale, which will be maintained by a workforce that is growing less proficient at that with the increasing adoption of AI. Skills will decline, and eventually we'll be living with more buggy software, maintained by underpaid, unappreciated and overwhelmed grey beards, and it will be too late to turn back.
I had a call with a recruiter the other day, who told me that they had simply stopped hiring juniors at all, when not too long ago they would be camping out at colleges on graduation day to snipe fresh graduates.
This is going to be a problem real soon. There needs to be a realistic career path for software developers, or the entire field is at risk.
I don't think that's necessarily true. Some companies and seniors certainly invest into training and mentoring their juniors and this can take a long time and be expensive.
But there are a lot of places that offer 3-6 month internships for people who are barely out of high school. Others heavily rely on overseas contractors.
So in a lot of scenarios it does seem possible to use less capable labor in a beneficial way for software engineering.
Hundreds of comments. Some say LLMs are the future. Others say they don't work today and they won't work tomorrow.
Videogame speedrunning has this problem solved. Livestream your 10x-engineer LLM usage, with a git commit annotated with its prompt per change. Then everyone will see the result.
This doesn't seem like an area of debate. No complicated diagrams required. Just run the experiment and show the result.
I literally do this about twice a week on Twitch. I’m working on some hobby projects, and one constraint I’ve put on myself is to use LLMs for pretty much everything, regardless of whether it saves me time or not. The reason is twofold: I want to learn how to use them efficiently, and I want to constantly test the waters to see where the limits of their capabilities are. On my stream, you can see LLMs fail spectacularly one time and do hours of work in minutes another time.
I’m not alone in this - there are tons of other examples of people showing how they use LLMs online; you just need to search for them.
The article provides zero measurement, zero examples, zero numbers.
It's pure conjecture with no data or experiment to back it up. Unfortunately conjecture rises to the top on hackernews. A well built study on LLM effectiveness would fall off the front page quickly.
People always say "you just need to learn to prompt better" without providing any context as to what "better" looks like. (And, presumes that my prompt isn't good enough, which maybe it is maybe it isn't.)
The easy way out of that is "well every scenario is different" - great, show me a bunch of scenarios on a speed run video across many problems, so I can learn by watching.
It's because you get to the No True Scotsman thing pretty fast.
If I use LLMs to code, say a Telegram bot that summarise the family calendars and current weather to a channel - someone will come in saying "but LLMs are shit because they can't handle this very esoteric hardware assembler I use EVERY DAY!!1"
The Primeagen did this. He locked himself in for 7 weeks and tried to build a game by only using Cursor. The videos are more entertainment-style, but this is the closest thing I know of.
Unlikely. Programming is highly collaborative and efficiency is hard to measure. That creates incentives for programmers in competition to typically prioritize advertising their skills by demonstration over maintaining their secret edge.
Be it at work or on the internet, if you help others by sharing your techniques you'll make them want to work with you and impress them with how smart you are.
If you are keeping it all secret to maintain your edge, people will think of you as unhelpful and they won't know how smart you are, because it's very difficult to judge how difficult the things you accomplished were.
The reason people don't stream themselves vibe coding is that it's even less interesting to watch than regular coding.
I must disagree. Sharing your edge is the wisest possible thing you can do on a societal level. For a slightly silly example: would it be better to have, say, everyone doing guesswork knots for how to tie their shoes, versus a single reliable 'rabbit ears' technique? Then you can see the benefits of having edges widely shared as a norm. That is the foundation of how society can learn.
I see all the negative responses, but this seems true to me. I am old enough to remember the dot com days and could see the transformative effect of the Internet from miles away when I was a teenager. Yet many, many people refused to acknowledge that someday soon we would do things like credit card transactions online, or that people might buy shoes without trying them on first, etc.
You could say it is a lack of imagination or not connecting the dots, but I think there is a more human reason. A lot of people don't want the disruption and are happy with the status quo. I'm a software engineer so I know how problematic AI may be for my job, but I think anyone who looks at our current state and the recent improvements should be able to see the writing on the wall here.
I for one am more curious than afraid of AI, because I have always felt that writing code was the worst part of being a programmer. I am much happier building product or solving interesting problems than tracking down elusive bugs or refactoring old codebases.
So, programmers once had an edge in having their source closed, then fell for the open source evangelism and started sharing their code, which enabled the training of AI models, and now the next iteration of what was called programmers before and is now known as vibe coders has this notion of having an edge in having their chatbot prompts closed again?
Let's all just muse some and imagine what the next cycle of this wheel will look like.
I have one very specific retort to the 'you are still responsible' point. High school kids write lots of notes. The notes frequently never get read, but the performance is worse without them: the act of writing them embeds them into your head. I allegedly know how to use a debugger, but I haven't in years: except for a number of cases I could count on my fingers, nearly every bug report I have gotten I know exactly, down to the line of code, where it comes from, because I wrote it or something next to it (or can immediately ask someone who probably did). You don't get that with AI. The codebase is always new. Everything must be investigated carefully. When stuff slips through code review, even if it is a mistake you might have made, you would remember that you made it. When humans do not do the work, humans do not accrue the experience. (This may still be a good tradeoff, I haven't run any numbers. But it's not such an obvious tradeoff as TFA implies.)
I have to completely agree with this and nobody says this enough.
This tradeoff of unfamiliarity with the codebase is a very well understood problem for decades. Maintaining a project is 99% of the time spent on a successful project.
In my opinion though, having AI write the initial code is just putting most people in a worse situation with almost no upside long term.
I agree - I'm bullish on AI for coding generally, but I am curious how they'd get around this problem. Even if they can code at a superhuman level, you just get rarer superhuman bugs. Or is another AI going to debug it? Unless this loop is basically fail-proof, does the human's job just become debugging the hardest things to debug (or at least a blind spot of the AI)?
> When stuff slips through code review, even if it is a mistake you might have made, you would remember that you made it.
I don’t know. Ever had the experience of looking at 5+ year old code and thinking “what idiot wrote this crap” and then checking “git blame” and realising “oh, I’m the idiot… why the hell did I do this? struggling to remember” - given enough time, humans start to forget why they did things a certain way… and sometimes the answer is simply “I didn’t know any better at the time, I do now”
> You don't get that with AI. The codebase is always new.
It depends on how you use AI… e.g. I will often ask an AI to write me code to do X because it gets me over the “hump” of getting started… but now this code is in front of me on the screen, I think “I don’t like how this code is written, I’m going to refactor it…” and by the time I’m done it is more my code than the AI’s
Understanding code takes more effort than writing it, somehow. That's always been a huge problem in the industry, because code you wrote five years ago was written by someone else, but AI coding takes that from "all code in your org except the code you wrote in the past couple years" to "all code was written by someone else".
How well does your team work when you can't even answer a simple question about your system because nobody wrote, tested, or played with the code in question?
How do you answer "Is it possible for our system to support split payments?" when not a single member of your team has even worked on the billing code?
No, code reviews do not familiarize an average dev to the level of understanding the code in question.
Oddly, I don't tend to get that experience very much. More often, it's "That's not how I'd naively write that code, there must be some catch to it. If only I had the foresight to write a comment about it..." Alas, I'm still not very good at writing enough comments.
Similar to almost self driving cars where you are still responsible. You're asking someone to do nothing at all other than being highly alert for long periods of time. That's just not how people work. There is no way someone can be ready to take over in an instant without actively engaging in the driving.
I read a study[1] (caveat, not peer reviewed yet I don't think?) that seems to imply that you are correct.
> When using GenAI tools, the effort invested in critical thinking shifts from information gathering to information verification; from problem-solving to AI response integration; and from task execution to task stewardship.
This is a good point I think, and these steps take time and should definitely be done. I'm not sure people take this into account when talking about having AI code for them.
So do the thing that a student copying their notes from the board does: look at the PR on one monitor, and write your own equivalent PR by typing the changes line-for-line into your IDE on the other. Pretend copy/paste doesn’t exist. Pretend it’s code you saw in a YouTube video of a PowerPoint presentation, or a BASIC listing from one of those 1980s computing magazines.
(And, if you like, do as TFA says and rephrase the code into your own house style as you’re transcribing it. It’ll be better for it, and you’ll be mentally parsing the code you’re copying at a deeper level.)
This is how a (video game) programming class in my high school was taught. You had to transcribe the code from a Digipen book... then fix any broken code. Not entirely sure if their many typos were intentional, but they very much helped us learn, because we had no choice but to correct their logic failures and typos to move on to the next section. I'm still surprised, 20 years later, how well that system worked to teach us and push us to branch out our understanding.
100%. I had Gemini write code for a blog in golang - it had some bugs and it took me some time to find them.
To me the sweet spot is that I write the code with the "help" of an LLM. That means I double-check everything it generates and prompt it to write code block by block - frequently acting as an editor.
Either you want human intervention for correctness and extension or you don't. Having LLMs write large swaths of code is like relying completely on Tesla's Autopilot - you are probably more stressed than if you just drove yourself.
The careful vetting of code and thoroughly testing it is super important, I would never even think of putting any generated code into any use without doing that.
Also your last comparison made me chuckle, good one :)
Writing code is easier than long term maintenance. Any programmer is able to write so much code that he will not be able to maintain it. Unless there are good AI tools helping with maintenance there is no point to use generative tools for production code. From my experience AI tools are great for prototyping or optimizing procrastination.
This level of knowledge is nearly impossible to maintain as the codebase grows though, beyond one or two people at a typical company. And tools need to exist for the new hire as well as the long-standing employee.
Welcome to project architecting, where the job isn't about putting more lines of code into this world, but more systems in place to track them. A well layered and structured codebase can grow for a very long time before it becomes too hard to maintain. And generally, the business complexity bites before the algorithmic one, and there's no quick fix for that.
AI tools are elevating the developer to a higher level of abstraction, closer to an engineering manager or product manager. Those roles do not need to be familiar with code at that level of detail.
It might do it later... but odds are it'll require ingesting a post a human made about the exact problem, listing out what the correct output should have been.
The important thing you are missing is that the learning landscape has now changed.
You are now responsible for learning how to use LLMs well. If an untrained vibe coder is more productive for me, while knowing nothing about how the code actually works, I will hire the vibe coder instead of you.
Learning is important, but it's most important that you learn how to use the best tools available so you can be productive. LLMs are not going away and they will only get better, so today that means you are responsible for learning how to use them, and that is already more important for many roles than learning how to code yourself.
This is actually a good reason for exiting the industry before one's job goes away. Steering an AI to barf up the right-looking pool of vomit is not the flow-generating experience that drew many people to programming in the first place.
Title nitpick: the number of people who care about AI for coding assistance is a relative minority. For everyone else, there's 'AI', which has a huge branding problem. 'AI' is filling all search results with trash, and creating trash websites full of trash to fill up the rest of the search results. It's generating trash images to put at the top of every blog post. It's eating up all the server load with scraping. It's what's been fooling my dad every day on Facebook.
When people are sick of AI, this is what they are talking about. AI hype people ignore this perspective each and every time.
It doesn't matter how great your paper mill's paper is: if you're dumping PCBs in the river, people are going to quite rightly get pissed off.
AI has a huge branding problem on HN but I haven’t found that to be true outside of tech.
Many people are very eager to see how AI can improve their lives.
I’m building a product for accountants that uses AI, initially I avoided using AI as a part of marketing copy. Turns out, people are much more interested when I tell them it uses AI behind the scenes.
It actually varies by segment. The least technical people are enthusiastic about AI. The moderately technical (e.g tech hobbyists, artists and power users) are repulsed. The segment of tech professionals are themselves divided, but I think based on how much they like/hate their career as programmers. Either way, what’s very different about this AI bubble is that the early adopters are predominantly the least technical users. This is a notable inversion of the typical adoption curve.
It's been so much more rewarding playing with AI coding tools on my own than through the subtle and not-so-subtle nudges at work. The work AI tools are a walled garden, have a shitty interface, and feel made to extract from me rather than to help me. In my personal stuff - downloading models, playing with them, the tooling, the interactions - it has all been so much more rewarding, giving me stable, comfortable workflows I can rely on and that work with my brain.
The dialogue around it is so adversarial that it was hard to figure out how to proceed until I dedicated a lot of effort to diving into the field myself, alone, on my personal time, and learned what it's comfortable to use it on.
Where he dunks on how SSH access works in VSCode. I don't know. The code and architecture behind this feature may well be bananas, but it gets the work done. Sounds like a clear case of mediocrity. I wonder how he reconciles those two articles.
For me, both articles are more clickbait than anything. With that in mind, if I am nuts for being sceptical of LLMs, I think it is fair to call the author a clickbaiter.
I agree with the main take in this article: the combination of agents + LLMs with large context windows + a large budget of tokens to iterate on problems can probably already yield some impressive results.
I take serious issue with the "but you have no idea what the code is" rebuttal, since it - to me - skims over the single largest issue with applying LLMs anywhere where important decisions will be made based on their outputs.
To quote from the article:
> People complain about LLM-generated code being “probabilistic”. No it isn’t. It’s code. It’s not Yacc output. It’s knowable. The LLM might be stochastic. But the LLM doesn’t matter. What matters is whether you can make sense of the result, and whether your guardrails hold.
> Reading other people’s code is part of the job. If you can’t metabolize the boring, repetitive code an LLM generates: skills issue! How are you handling the chaos human developers turn out on a deadline?
The problem here is that LLMs are optimized to make their outputs convincing. The issue is exactly "whether you can make sense of the result", as the author said, or, in other words: whether you're immune to being conned by a model output that sounds correct but is not. Sure, "reading other people’s code is part of the job", but the failure modes of junior engineers are easily detectable. The failure modes of LLMs are not.
It's also funny how it requires a lot of iterations for the average task... and the user has to pay for the failures. No other product comes with this expectation: imagine a toaster that only toasts bread properly 20% of the time, and half-toasts it 50% of the time.
> The problem here is that LLMs are optimized to make their outputs convincing.
That may be true for chat aligned LLMs, but coding LLMs are trained w/ RL and rewards for correctness, nowadays. And there are efforts to apply this to the entire stack (i.e. better software glue, automatic guardrails, more extensive tool-use, access to LSP/debuggers/linters, etc).
I think this is the critical point in a lot of these debates that seem to be very popular right now. A lot of people try something and get the wrong impressions about what SotA is. It turns out that often that something is not the best way to do it (i.e. chatting in a web interface for coding), but people don't go the extra mile to actually try what would be best for them (i.e. coding IDEs, terminal agents, etc).
Which "coding LLMs" are you referring to that are trained purely on verifiably correct synthetic data? To my understanding o3, gemini 2.5 pro, claude 3.7 sonnet, etc. are all still aligned to human preferences using a reward function learned from human feedback. Any time a notion of success/correctness is deferred to a human, the model will have a chance to "game" the system by becoming more convincing as well as more correct.
Edit: thought I would include this below instead of in a separate comment:
Also, whether the models are trained purely on synthetic data or not, they suffer from these epistemological issues where they are unable to identify what they don't know. This means a very reasonable looking piece of code might be spit out for some out-of-distribution prompt where the model doesn't generalize well.
The argument that I've heard against LLMs for code is that they create bugs that, by design, are very difficult to spot.
The LLM has one job: to make code that looks plausible. That's it. No logic has gone into writing that bit of code. So the bugs often won't be like those a programmer makes. Instead, they can introduce a whole new class of bug that's way harder to debug.
This is exactly what I wrote about when I wrote "Copilot Induced Crash" [0]
Funny story: when I first posted that and had a couple of thousand readers, I got many comments of the type "you should just read the code carefully on review", but _nobody_ pointed out that the opening example (the so-called "right code") had the exact same problem as described in the article - proving exactly what you just said: it's hard to spot problems that are caused by plausibility machines.
AI-generated code will fuck up so many lives. The Post Office software in the UK did it without AI. I cannot imagine how, and how many, lives will be ruined once some consultancy vibe-codes some government system. I might come to appreciate the German bureaucracy and backwardness.
My philosophy is to let the LLM either write the logic or write the tests - but not both. If you write the tests and it writes the logic and it passes all of your tests, then the LLM did its job. If there are bugs, there were bugs in your tests.
> My philosophy is to let the LLM either write the logic or write the tests - but not both. If you write the tests and it writes the logic and it passes all of your tests, then the LLM did its job. If there are bugs, there were bugs in your tests.
Maybe use one LLM to write the code, a wildly different one to write the tests, and yet another wildly different one to generate an English description of each test while doing critical review.
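As a minimal sketch of the tests-stay-human variant (the module, function name, and behaviour here are all hypothetical), the human-owned half can be nothing more than a handful of pytest cases, and the LLM is only asked to make them pass:

    # test_slugify.py -- written by the human; the LLM only ever touches mylib/text.py
    import pytest

    from mylib.text import slugify  # hypothetical module the LLM implements


    def test_lowercases_and_hyphenates():
        assert slugify("Hello World") == "hello-world"


    def test_strips_punctuation():
        assert slugify("Rock & Roll!") == "rock-roll"


    def test_collapses_repeated_separators():
        assert slugify("a  --  b") == "a-b"


    def test_rejects_blank_input():
        with pytest.raises(ValueError):
            slugify("   ")

If a bug survives, it is by definition a missing test case, which keeps the human's attention on specifying behaviour rather than reviewing implementation details.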
Yes, exactly - my (admittedly very limited!) experience has consistently generated well-written, working code that just doesn’t quite do what I asked. Often the results will be close to what I expect, and the coding errors do not necessarily jump out on a first line-by-line pass, so if I didn’t have a high degree of skepticism of the generated code in the first place, I could easily just run with it.
For me it's mostly about the efficiency of the code they write. This is because I work in energy, where efficiency matters because our datasets are so ridiculously large and every interface to that data is so ridiculously bad. I'd argue that for 95% of the software out there it won't really matter if you use a list or a generator in Python to iterate over data (see the sketch below). It probably should, and maybe this will change with cloud costs continuously increasing, but we do also live in a world where 4chan ran on some Apache server running a 10k-line PHP file from 2015...
Anyway, this is where AIs have been really bad for us, as well as sometimes "overengineering" their bug prevention in extremely inefficient ways. The flip side of this is of course that a lot of human programmers would make the same mistakes.
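To make the list-versus-generator point concrete, a minimal sketch (the file name and record format are invented): the list version materialises every row in memory, the generator streams them one at a time, and at energy-dataset sizes that is the whole difference:

    def rows_as_list(path):
        # Materialises every parsed row up front: memory grows with file size.
        with open(path) as f:
            return [parse(line) for line in f]


    def rows_as_generator(path):
        # Yields rows lazily: roughly constant memory, same iteration interface.
        with open(path) as f:
            for line in f:
                yield parse(line)


    def parse(line):
        # Stand-in parser for the sketch.
        return line.rstrip("\n").split(",")


    # Both can be consumed identically, e.g.:
    # total = sum(float(row[2]) for row in rows_as_generator("readings.csv"))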
I’ve had the opposite experience. Just tell it to optimise for speed and iterate and give feedback. I’ve had JS code optimised specifically for v8 using bitwise operations. It’s brilliant.
>Instead, they can introduce a whole new class of bug that's way harder to debug
That sounds like a new opportunity for a startup that will collect hundreds of millions of dollars, brag about how their new AI prototype is so smart that it scares them, and deliver nothing.
It doesn't optimize "good programs". It optimizes humans' interpretation of good programs. More accurately, it optimizes what low-paid, overworked humans believe are good programs. Are you hiring your best and brightest to code-review the LLMs?
Even if you do, it still optimizes tricking them. It will also optimize writing good programs, but you act like that's a well defined and measurable thing.
"Good" is the context of LLMs means "plausible". Not "correct".
If you can't code then the distinction is lost on you, but in fact the "correct" part is why programmers get paid. If "plausible" were good enough then the profession of programmer wouldn't exist.
They are also trained with RL to write code that passes unit tests, and Claude does have a big problem with trying to cheat the tests or the request pretty quickly after running into issues, which makes manual edit approval more important. It usually still says what it is trying to do wrong, so you can often find out from its summary before having to scan the diff.
The argument seems to be that for an expert programmer, who is capable of reading and understanding AI agent code output and merging it into a codebase, AI agents are great.
Question: If everyone uses AI to code, how does someone become an expert capable of carefully reading and understanding code and acting as an editor to an AI?
The expert skills needed to be an editor -- reading code, understanding its implications, knowing what approaches are likely to cause problems, recognizing patterns that can be refactored, knowing where likely problems lie and how to test them, holding a complex codebase in memory and knowing where to find things -- currently come from long experience writing code.
But a novice who outsources their thinking to an LLM or an agent (or both) will never develop those skills on their own. So where will the experts come from?
I think of this because of my job as a professor; many of the homework assignments we use to develop thinking skills are now obsolete because LLMs can do them, permitting the students to pass without thinking. Perhaps there is another way to develop the skills, but I don't know what it is, and in the mean time I'm not sure how novices will learn to become experts.
> Question: If everyone uses AI to code, how does someone become an expert capable of carefully reading and understanding code and acting as an editor to an AI?
Well, if everyone uses a calculator, how do we learn math?
Basically, force students to do it by hand long enough that they understand the essentials. Introduce LLMs at a point similar to when you allow students to use a calculator.
> Well, if everyone uses a calculator, how do we learn math?
Calculators have made most people a lot worse in arithmetic. Many people, for instance, don't even grasp what a "30%" discount is. I mean other than "it's a discount" and "it's a bigger discount than 20% and lower than 40%". I have seen examples where people don't grasp that 30% is roughly one third. It's just a discount, they trust it.
GPS navigation has made most people a lot worse at reading maps or generally knowing where they are. I have multiple examples where I would say something like "well we need to go west, it's late in the day so the sun will show us west" and people would just not believe me. Or where someone would follow their GPS on their smartphone around a building to come back 10m behind where they started, without even realising that the GPS was making them walk the long way around the building.
Not sure the calculator is a good example to say "tools don't make people worse with the core knowledge".
If calculators were unreliable... Well, we'd be screwed if everyone blindly trusted them and never learned math.
They'd also be a whole lot less useful. Calculators are great because they always do exactly what you tell them. It's the same with compilers, almost: imagine if your C compiler did the right thing 99.9% of the time, but would make inexplicable errors 0.1% of the time, even on code that had previously worked correctly. And then CPython worked 99.9% of the time, except it was compiled by a C compiler working 99.9% of the time, ...
But bringing it back on-topic, in a world where software is AI-generated, and tests are AI-generated (because they're repetitive, and QA is low-status), and user complaints are all fielded by chat-bots (because that's cheaper than outsourcing), I don't see how anyone develops any expertise, or how things keep working.
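To put rough numbers on that layered-unreliability analogy (using the 99.9% figure from the comment above):

    # If each layer of the toolchain is independently right 99.9% of the time,
    # the chance of a fully correct run shrinks with every layer you stack.
    per_layer = 0.999

    for layers in (1, 2, 5, 10):
        print(f"{layers:2d} layers -> {per_layer ** layers:.4f} chance of a correct run")

    # Ten layers already means roughly one silently wrong run in a hundred.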
While I agree with your suggestion, the comparison does not hold: calculators do not tell you which numbers to input and compute. With an LLM you can just ask vaguely, and get an often passable result
> a novice who outsources their thinking to an LLM or an agent (or both) will never develop those skills on their own. So where will the experts come from?
Well, if you’re a novice, don’t do that. I learn things from LLMs all the time. I get them to solve a problem that I’m pretty sure can be solved using some API that I’m only vaguely aware of, and when they solve it, I read the code so I can understand it. Then, almost always, I pick it apart and refactor it.
Hell, just yesterday I was curious about how signals work under the hood, so I had an LLM give me a simple example (something like the sketch below), then we picked it apart. These things can be amazing tutors if you’re curious. I’m insatiably curious, so I’m learning a lot.
Junior engineers should not vibe code. They should use LLMs as pair programmers to learn. If they don’t, that’s on them. Is it a dicey situation? Yeah. But there’s no turning back the clock. This is the world we have. They still have a path if they want it and have curiosity.
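The signal example mentioned above was a toy on roughly this scale (a hedged reconstruction, not the actual session): install a SIGINT handler and watch what happens to the blocked call when you hit Ctrl-C.

    import signal
    import time


    def on_sigint(signum, frame):
        # Handlers run in the main thread, between bytecode instructions;
        # `frame` is wherever execution happened to be when the signal arrived.
        print(f"caught signal {signum} in {frame.f_code.co_name}")


    signal.signal(signal.SIGINT, on_sigint)

    print("press Ctrl-C to trigger the handler, Ctrl-\\ (SIGQUIT) to quit")
    while True:
        time.sleep(1)  # the handler fires while we're blocked here, then the loop continues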
I agree, and it sounds like you're getting great results, but they're all going to do it. Ask anyone who grades their homework.
Heck, it's even common among expert users. Here's a study that interviewed scientists who use LLMs to assist with tasks in their research: https://doi.org/10.1145/3706598.3713668
Only a few interviewees said they read the code through to verify it does what they intend. The most common strategy was to just run the code and see if it appears to do the right thing, then declare victory. Scientific codebases rarely have unit tests, so this was purely a visual inspection of output, not any kind of verification.
> Junior engineers should not vibe code. They should use LLMs as pair programmers to learn. If they don’t, that’s on them. Is it a dicey situation? Yeah. But there’s no turning back the clock. This is the world we have. They still have a path if they want it and have curiosity.
Except it's impossible to follow your curiosity when everything in the world is pushing against it (unless you are already financially independent and only programming for fun). Junior developers compete in one of the most brutal labor markets in the world, and their deliverables are more about getting things done on time than doing things better. What they "should" do goes out the window once you step out of privilege and look at the real choices.
This reminds me of Isaac Asimov's "Profession" short story. Most people receive their ability (and their matching assigned profession, thus the name) from a computer. They then are able to do the necessary tasks for their job, but they can't advance the art in any way. A few people aren't compatible with this technology, and they instead learn to do things themselves, which is fortunate because it's the only way to advance the arts.
Deliberate practice, which may take a form different from productive work.
I believe it's important for students to learn how to write data structures at some point. Red black trees, various heaps, etc. Students should write and understand these, even though almost nobody will ever implement one on the job.
Analogously electrical engineers learn how to use conservation laws and Ohm's law to compute various circuit properties. Professionals use simulation software for this most of the time, but learning the inner workings is important for students.
The same pattern is true of LLMs. Students should learn how to write code, but soon the code will write itself and professionals will be prompting models instead. In 5-10 years none of this will matter though because the models will do nearly everything.
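For instance, the kind of by-hand exercise meant here might be as small as a list-backed min-heap insert, which almost nobody writes on the job but every student should have written once:

    def heap_push(heap, item):
        """Insert `item` into a list-backed binary min-heap in place.

        The parent of index i lives at (i - 1) // 2; we append at the end
        and sift the new item up until the heap property holds again.
        """
        heap.append(item)
        i = len(heap) - 1
        while i > 0:
            parent = (i - 1) // 2
            if heap[parent] <= heap[i]:
                break
            heap[parent], heap[i] = heap[i], heap[parent]
            i = parent


    h = []
    for x in [5, 3, 8, 1, 4]:
        heap_push(h, x)
    print(h[0])  # 1 -- the minimum is always at the root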
I agree with all of this. But it's already very difficult to do even in a college setting -- to force students to get deliberate practice, without outsourcing their thinking to an LLM, you need various draconian measures.
And for many professions, true expertise only comes after years on the job, building on the foundation created by the college degree. If students graduate and immediately start using LLMs for everything, I don't know how they will progress from novice graduate to expert, unless they have the self-discipline to keep getting deliberate practice. (And that will be hard when everyone's telling them they're an idiot for not just using the LLM for everything)
You're talking about students, but the question was about seniors. You don't go to school to become a senior dev, you code in real-world settings, with real business pressures, for a decade or two to become a senior. The question is how are decent students supposed to grow into seniors who can independently evaluate AI-produced code if they are forced to use the magic box and accept its results before being able to understand them?
> Question: If everyone uses AI to code, how does someone become an expert capable of carefully reading and understanding code and acting as an editor to an AI?
LLMs are very much like pair programmers in my experience. For the junior engineer, they are excellent resources for learning, the way a senior engineer might be. Not only can they code what the junior can’t, they can explain questions the junior has about the code and why it’s doing what it’s doing.
For senior devs, it is a competent pair programmers, acting as an excellent resource for bouncing ideas off of, rubber ducking, writing boilerplate, and conducting code reviews.
For expert devs, it is a junior/senior dev you can offload all the trivial tasks to so you can focus on the 10% of the project that is difficult enough to require your expertise. Like a junior dev, you will need to verify what it puts together, but it’s still a huge amount of time saved.
For junior devs specifically, if they are not curious and have no interest in actually learning, they will just stop at the generated code and call it a day. That’s not an issue with the tool, it’s an issue with the dev. For competent individuals with a desire to learn and grow, LLMs represent one of the single best resources to do so. In that sense, I think that junior devs are at a greater advantage than ever before.
> That’s not an issue with the tool, it’s an issue with the dev.
Hard disagreeing here. It makes a difference whether you work on a task because you feel it brings you tangible progress, or because it's an artificial exercise that you could really do with one sentence to Claude if it weren't for the constraints of the learning environment. The latter feeling is actually demotivating for learning.
I don’t know about you, but I use LLMs as gateways to knowledge. I can set a deep research agent free on the internet with context about my current experience level, preferred learning format (books), what I’m trying to ramp up on, etc. A little while later, I have a collection of the definitive books for ramping up in a space. I then sit down and work through the book doing active recall and practice as I go. And I have the LLM there for Q&A while I work through concepts and “test the boundaries” of my mental models.
I’ve become faster at the novice -> experienced arc with LLMs, even in domains that I have absolutely no prior experience with.
But yeah, the people who just use LLMs for “magic oracle please tell me what do” are absolutely cooked. You can lead a horse to water, but you can’t make it drink.
Arguments are made consistently about how this can replace interns or juniors directly. Others say LLMs can help them learn to code.
Maybe, but not on your codebase or product, and not with a senior's knowledge of pitfalls.
I wonder if this will be programming's iPhone moment, where we start seeing a lack of the deep knowledge needed to troubleshoot. I can tell you that we’re already seeing a glut of security issues being explained by devs as “I asked copilot if it was secure and it said it was fine so I committed it”.
> I can tell you that we’re already seeing a glut of security issues being explained by devs as “I asked copilot if it was secure and it said it was fine so I committed it”.
And as with Google and Stack Overflow before, the Sr Devs will smack the wrists of the Jr's that commit untested and unverified code, or said Jr's will learn not to do those things when they're woken up at 2 AM for an outage.
The way the responses to this subthread show the classical "the problem doesn't exist - ok, it does exist but it's not a big deal - ok, it is a big deal but we should just adapt to it" progression makes me wonder if we found one of the few actually genuine objections to LLM coding.
Nail on head. Before, innovations in code were extensions of a human's capabilities. The LLM-driven generation could diminish the very essence of writing meaningful code, to the point where they will live in the opposite of a golden era. The dead internet theory may yet prevail.
I think a large fraction of my programming skills come from looking through open source code bases. E.g. I'd download some code and spend some time navigating through files looking for something specific, e.g. "how is X implemented?", "what do I need to change to add Y?".
I think it works a bit like pre-training: to find what you want quickly you need to have a model of coding process, i.e. why certain files were put into certain directories, etc.
I don't think this process is incompatible with LLM use...
If I were a professor, I would make my homework start the same -- here is a problem to solve.
But instead of asking for just working code, I would create a small wrapper for a popular AI. I would insist that the student use my wrapper to create the code. They must instruct the AI how to fix any non-working code until it works. Then they have to tell my wrapper to submit the code to my annotator. Then they have to annotate every line of code as to why it is there and what it is doing.
Why my wrapper? So that you can prevent them from asking it to generate the comments, and so that you know that they had to formulate the prompts themselves.
They will still be forced to understand the code.
Then double the number of problems, because with the AI they should be 2x as productive. :)
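A rough sketch of what such a wrapper could look like (everything here is hypothetical: the blocked-phrase rule, the log format, and `call_model` are stand-ins for whatever the course actually provides):

    import datetime
    import json


    def call_model(prompt: str) -> str:
        # Placeholder for whatever AI backend the course provides.
        raise NotImplementedError("wire up the course-approved model here")


    BLOCKED_PHRASES = ("add comments", "annotate", "explain each line")


    def student_prompt(student_id: str, prompt: str, log_path: str = "prompts.jsonl") -> str:
        """Forward a student's prompt to the model, logging it for grading.

        Prompts that try to outsource the annotation step are rejected, so the
        submitted line-by-line explanations must come from the student.
        """
        if any(phrase in prompt.lower() for phrase in BLOCKED_PHRASES):
            raise ValueError("Ask the AI for code only; the annotations are your job.")

        response = call_model(prompt)
        with open(log_path, "a") as log:
            log.write(json.dumps({
                "student": student_id,
                "time": datetime.datetime.now().isoformat(),
                "prompt": prompt,
                "response": response,
            }) + "\n")
        return response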
For introductory problems, the kind we use to get students to understand a concept for the first time, the AI would likely (nearly) nail it on the first try. They wouldn't have to fix any non-working code. And annotating the code likely doesn't serve the same pedagogical purpose as writing it yourself.
Students emerge from lectures with a bunch of vague, partly contradictory, partly incorrect ideas in their head. They generally aren't aware of this and think the lecture "made sense." Then they start the homework and find they must translate those vague ideas into extremely precise code so the computer can do it -- forcing them to realize they do not understand, and forcing them to make the vague understanding concrete.
If they ask an AI to write the code for them, they don't do that. Annotating has some value, but it does not give them the experience of seeing their vague understanding run headlong into reality.
I'd expect the result to be more like what happens when you show demonstrations to students in physics classes. The demonstration is supposed to illustrate some physics concept, but studies measuring whether that improves student understanding have found no effect: https://doi.org/10.1119/1.1707018
What works is asking students to make a prediction of the demonstration's results first, then show them. Then they realize whether their understanding is right or wrong, and can ask questions to correct it.
Post-hoc rationalizing an LLM's code is like post-hoc rationalizing a physics demo. It does not test the students' internal understanding in the same way as writing the code, or predicting the results of a demo.
> They will still be forced to understand the code.
But understanding is just one part of the learning process, isn't it? I assume everybody has had this feeling: the professor explains maths on the blackboard, and the student follows. The student "understands" all the steps: they make sense, and there's no urge to ask a question right now. Then the professor gives them a slightly different exercise, asks them to do the same, and the students are completely lost.
Learning is a loop: you need to accept it, get it in your memory (learn stuff by heart, be it just the vocabulary to express the concepts), understand it, then try to do it yourself. Realise that you missed many things in the process, and start at the beginning: learn new things by heart, understand more, try it again.
I find these tools incredibly useful. But I constantly edit their output and frequently ask for changes to other peoples' code during review, some of which is AI generated.
But all of that editing and reviewing is informed by decades of writing code without these tools, and I don't know how I would have gotten the reps in without all that experience.
So I find myself bullish on this for myself and the experienced people I work with, but worried about training the next generation.
Yes, I feel the same way. But I worry about my kids. My 15-year-old son wanted to go into software engineering and work for a game studio. I think I'll advocate civil engineering, but for someone who will still be working 50 years from now it's really hard to know what will be a good field right now.
They won't, save for a relative minority of those who enjoy doing things the hard way or those who see an emerging market they can capitalize on (slop scrubbers).
I wrote this post [1] last month to share my concerns about this exact problem. It's not that using AI is bad necessarily (I do every day), but it disincentivizes real learning and competency. And once using AI is normalized to the point where true learning (not just outcome seeking) becomes optional, all hell will break loose.
> Perhaps there is another way to develop the skills
Like sticking a fork in a light socket, the only way to truly learn is to try it and see what happens.
I don't know if I'm convinced by this. If we were talking about novels, you don't have to be a writer to check grammar and analyze plot structure in a passable way. It is possible to learn by reading instead of doing.
Sure, you could learn about grammar, plot structure, narrative style, etc. and become a reasonable novel critic. But imagine a novice who wants to learn to do this and has access to LLMs to answer any question about plots and style that they want. What should they do to become a good LLM-assisted author?
The answer to that question is very different from how to become an author before LLMs, and I'm not actually sure what the answer is. It's not "write lots of stories and get feedback", the conventional approach, but something new. And I doubt it's "have an LLM generate lots of stories for you", since you need more than that to develop the skill of understanding plot structures and making improvements.
So the point remains that there is a step of learning that we no longer know how to do.
I've had a lot of success using LLMs to deepen my understanding of topics. Give them an argument, and have them give the best points against it. Consider them, iterate. Argue against it and let it counter. It's a really good rubber duck
> The expert skills... currently come from long experience writing code
Do they? Is it the writing that's important? Or is it the thinking that goes along with it? What's stopping someone from going through LLM output, going back and forth on design decisions with the LLM, and ultimately making the final choice of how the tool should mold the codebase after seeing the options?
I mean, of course this requires some proactive effort on your part... but it always did.
The key point I think though is to not outsource your thinking. You can't blindly trust the output. It's a modern search engine
HIM: AI is going to take all entry level jobs soon.
ME: So the next level one up will become entry level?
HIM: Yes.
ME: Inductively, this can continue up to the CEO. What about the CEO?
HIM: Wait...
This is such a non-issue and so far down the list of questions. We've invented AI that can code, and you're asking about career progression? That's the top thing to talk about? We've given life to essentially an alien life form.
"What is this going to do to humans?" is probably the #1 question that should be on the mind of every engineer, every day. Being toolmakers for civilization is the entire point of our profession.
I'll take the opposite view of most people. Expertise is a bad thing. We should embrace technological changes that render expertise economically irrelevant with open arms.
Take a domain like US taxation. You can certainly become an expert in that, and many people do. Is it a good thing that US taxes are so complicated that we have a market demand for thousands of such experts? Most people would say no.
Don't get me wrong, I've been coding for more years of being alive than I haven't by this point, and I love the craft. I still think younger me would have far preferred a world where he could have just had GPT do it all for him, so he didn't need to spend his lunch hours poring over the finer points of e.g. Python iterators.
By the same logic we should allow anyone with an LLM to design ships, bridges, and airliners.
Clearly, it would be very unwise to buy a bridge designed by an LLM.
It's part of a more general problem - the engineering expectations for software development are much lower than for other professions. If your AAA game crashes, people get annoyed but no one dies. If your air traffic control system fails, you - and a large number of other people - are going to have a bad day.
The industry has a kind of glib unseriousness about engineering quality - not theoretical quality, based on rules of thumb like DRY or faddish practices, but measurable reliability metrics.
The concept of reliability metrics doesn't even figure in the LLM conversation.
> We should embrace technological changes that render expertise economically irrelevant with open arms.
To use your example, is using AI to file your taxes actually "rendering [tax] expertise economically irrelevant?" Or is it just papering over the over-complicated tax system?
From the perspective of someone with access to the AI tool, you've somewhat eased the burden. But you haven't actually solved the underlying problem (with the actual solution obviously being a simpler tax code). You have, on the other hand, added an extra dependency on top of an already over-complicated system.
The question then becomes whether or not it's possible (or will be possible) to effectively use these LLMs for coding without already being an expert. Right now, building anything remotely complicated with an LLM, without scouring over every line of code generated, is not possible.
I quite often prompt with code in a different language, or pseudo-code describing roughly what I am trying to achieve, or a Python function signature without the function body.
Or I will paste in a bunch of code I have already written with a comment somewhere that says "TODO: retrieve the information from the GitHub API" and have the model finish it for me.
> a Python function signature without the function body.
This, and for multiple functions that end up composing well together as per their signatures. Maybe there's one public function I want to document well, so I write the docstring myself, and its result comes from 3-4 other internal functions which I'd let the LLM implement.
The nice part is that even if the LLM fails, all that is not lost, as opposed to some weird spec I'd feed an LLM but that's too verbose for a human reader, or a series of prompts.
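A sketch of what that looks like in practice (the names and the GitHub detail are invented): the human writes the public docstring and the helper signatures, and the bodies are what the LLM is asked to fill in.

    from dataclasses import dataclass


    @dataclass
    class IssueSummary:
        title: str
        open_days: int
        labels: list[str]


    def summarize_stale_issues(repo: str, min_age_days: int = 30) -> list[IssueSummary]:
        """Return open issues in `repo` older than `min_age_days`, newest first.

        Issues labelled "wontfix" are excluded. Network errors should be
        retried once and then re-raised.
        """
        # TODO: retrieve the issues from the GitHub API
        raise NotImplementedError


    def _fetch_open_issues(repo: str) -> list[dict]:
        raise NotImplementedError


    def _issue_age_days(issue: dict) -> int:
        raise NotImplementedError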
No. Thanks for pointing this out. It took quite some time to understand and express this to my management. Also I use a very small set of libraries, with function names / parameters that are quite explicit. And I don't need to specify some style to bypass hidden prompts, and bad coding practices.
I am pretty sure we will see programming languages custom-made for AI popping up soon. I cannot predict what they will look like, but they may be a mix between an extremely well-documented language with lots of safeguards (like Kotlin or Java) and natural-language-like instructions.
> If you build something with an LLM that people will depend on, read the code. In fact, you’ll probably do more than that. You’ll spend 5-10 minutes knocking it back into your own style.
The problem is that a lot of code works in general, but fails in edge cases. I would hate to be the guy who's job is only to find out why verbose AI generated code fails in one particular condition.
I read LLM generated code like I review a PR. I skim for anything that stands out as a common pitfall, and dig into the details of area I expect issues.
For most things I'm not willing to accept faster code at the expense of being an expert in the code.
So I am still trying to find the right amount of reading, editing, and reworking that gets the job done faster, where "the job" includes me being an expert in the produced code, not just the production of code.
There are periods of skimming but I'm doing a lot more than skimming.
The major difference is that in the type of reading Joel Spolsky is talking about, you are coming in not knowing the code's intent. It was written by one or more other people at some point in the past, likely with many iterative changes over a period of time. Figuring out the intent in this case is 90%+ of the work. With LLM-generated code, you know the intent. You just told the assistant exactly what your intent was. It's much, much easier to read code that you already know the intent of.
You can invest more time in the writing to make code easier to read. It’s just that it typically would take much more time. It would be conceivable for LLMs to eventually reach a state where they can produce such easily readable and comprehensible code.
I can reconcile that. Joel is right. So is Thomas. If you're a professional programmer you should be able to do the harder thing.
Thomas was a bit more spicy about that:
> but you have no idea what the code is - Are you a vibe coding Youtuber? Can you not read code? If so: astute point. Otherwise: what the fuck is wrong with you?
You've said this a couple of times, but I don't understand what you're trying to say.
Yes, I can do frustrating things, I know how to review and vet code. I also know how to write boilerplate code. I also know how to research new tasks in areas where I have no familiarity that are poorly documented. I know how to do a lot of complicated, difficult things - all of which are part of being a professional developer.
The question is whether I want to use a tool that makes most of my job the difficult part.
The only way this makes sense is if you are somehow reaching a point where "doing the harder thing" is not the harder thing for you anymore - where reviewing code is easier for you than writing it. And maybe your argument is that you can get to that point with practice, or that LLM code is generally easier to review than other types of code, or that the problems it's tackling are so difficult that in those cases writing the code is harder than reading it.
But it's not that they're both true. "You should be able to do the harder thing" is not really an answer to "why are you selling me a tool that replaces an easy thing with a hard thing?"
There are many difficult things that I can do as a professional software developer. I can mentor junior developers. I can do detailed product design work with stakeholders and translate technical limitations into language that they understand. I can negotiate software deadlines with product owners. I can write interfaces for undocumented software and deal with undocumented bugs in 3rd-party code. I can step through minified code in production settings to debug problems. These are all difficult things that, as a professional developer, I am capable and willing to do, and often need to do. And yes, of course, I can review pull requests. I am not, however, generally in the business of adopting tools that force me to do that stuff more often than is necessary to get good results. I don't adopt tools that make my life harder, and I honestly think that's a big part of being a professional.
To be very blunt about it: "Buck up, you should be able to handle this" is not a sales pitch. I can also write with my non-dominant hand, but I'm not going to start taking notes that way. There's no intrinsic prize for making your life harder, the only way what you're saying makes sense is if it's not harder for you to read code than to write it.
> The AI has suggested a solution, but the added code is arguably useless or wrong. There is a huge decision space to consider, but the AI tool has picked one set of decisions, without any rationale for this decision.
> [...]
> Programming is about lots of decisions, large and small. Architecture decisions. Data validation decisions. Button color decisions.
> Some decisions are inconsequential and can be safely outsourced. There is indeed a ton of boilerplate involved in software development, and writing boilerplate-heavy code involves near zero decisions.
Proponents of AI coding often talk about boilerplate as if that's what we spend most of our time on, but boilerplate is a cinch. You copy/paste, change a few fields, and maybe run a macro on it. Or you abstract it away entirely. As for the "agent" thing, typing git fetch, git commit, git rebase takes up even less of my time than boilerplate.
Most of what we write is not highly creative, but it is load-bearing, and it's full of choices. Most of our time is spent making those choices, not typing out the words. The problem isn't hallucination, it's the plain bad code that I'm going to have to rewrite. Why not just write it right myself the first time? People say "it's like a junior developer," but do they have any idea how much time I've spent trying to coax junior developers into doing things the right way rather than just doing them myself? I don't want to waste time mentoring my tools.
No, what's happening here is that you're using a different definition of "boilerplate" than the adopters are using. To you, "boilerplate" is literally a chunk of code you copy and paste to repeatedly solve a problem (btw: I flip my shit when people do this on codebases I work on). To them, "boilerplate" represents a common set of rote solutions to isomorphic problems. The actual lines of code might be quite different, but the approach is the same. That's not necessarily something you can copy-paste.
Coming at this from a computer-science or PLT perspective, this idea of an "abstract, repeatable meta-boilerplate" is exactly the payoff we expect from language features like strong type systems. Part of the point of rigorous languages is to create these kinds of patterns. You had total expressiveness back in assembly language! Repeatable rigor is most of the point of modern languages.
> To them, "boilerplate" represents a common set of rote solutions to isomorphic problems.
That's what libraries and frameworks are here for. And that's why no experienced engineers consider those an issue. What's truly important is the business logic, then you find a set of libraries that solves the common use cases and you write the rest. Sometimes you're in some novel space that doesn't have libraries (new programming language), but you still have specs and reference implementation that helps you out.
The actual boilerplate is when you have to write code twice because the language ecosystem doesn't have good macros à la Lisp that would let you invent some meta-stuff for the problem at hand (think writing routers for express.js).
Copy pasting code that could be abstracted is not a usage of boilerplate I've ever encountered, usually it's just a reference to certain verbose languages where you have to write a bunch of repetitive low-entropy stuff to get anywhere, like getters and setters in java classes.
> The actual lines of code might be quite different, but the approach is the same. That's not necessarily something you can copy-paste.
Assuming something like "a REST endpoint which takes a few request parameters, makes a DB query, and returns the response" fits what you're describing, you can absolutely copy/paste a similar endpoint, change the parameters and the database query, and rename a couple variables—all of which takes a matter of moments.
Naturally code that is being copy-pasted wholesale with few changes is ripe to be abstracted away, but patterns are still going to show up no matter what.
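As a hedged sketch of the kind of endpoint being described (assuming Flask and a hypothetical `orders` table): once you have written one of these, the next one really is copy, paste, and rename.

    from flask import Flask, jsonify, request
    import sqlite3

    app = Flask(__name__)


    @app.route("/orders", methods=["GET"])
    def list_orders():
        # Take a couple of query parameters, run a parameterised query,
        # and return the rows as JSON; the shape barely changes between endpoints.
        status = request.args.get("status", "open")
        limit = int(request.args.get("limit", 50))

        with sqlite3.connect("shop.db") as db:
            db.row_factory = sqlite3.Row
            rows = db.execute(
                "SELECT id, status, total FROM orders WHERE status = ? LIMIT ?",
                (status, limit),
            ).fetchall()

        return jsonify([dict(row) for row in rows])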
But what do you make of the parent’s second paragraph? This is the way I feel as well - I would rather not spend my time asking AI to do something right that I could just do myself.
I bit the bullet last week and tried to force myself to use a solution built end to end by AI. By the time I’d finished asking it to make changes (about 25 in total), I would’ve had a much nicer time doing it myself.
The thing in question was admittedly partially specified. It was a yaml-based testing tool for running some scenarios involving load tests before and after injecting some faults in the application. I gave it the yaml schema up front, and it did a sensible job as a first pass. But then I was in the position of reading what it wrote, seeing some implicit requirements I’d not specified, and asking for those.
Had I written it myself from the start, those implicit requirements would’ve been more natural to think about in the progression of iterating on the tool. But in this workflow, I just couldn’t get in a flow state - the process felt very unnatural, not unlike how it would’ve been to ask a junior to do it and taking 25 rounds of code review. And that has always been a miserable task, difficult to force oneself to stay engaged with. By the end I was much happier making manual tweaks and wish I’d have written it myself from the start.
I'm firmly convinced at this point that there is just no arguing with the haters. At the same time, it feels like this transition is as inevitable as the transition to mobile phones. LLM's are everywhere, and there's no escaping it no matter how much you might want to.
There's always some people that will resist to the bitter end, but I expect them to be few and far between.
At least when you mentor an actual junior developer they often learn, and you can take satisfaction in aiding the growth of a human being. All the time and effort spent coaxing an LLM to "do better" either disappears in a puff of smoke the next time it goes schizoid and needs to have its context cleared or- at best- is recorded to help a for-profit company train their next generation of products.
Like everything else about the "GenAI" fad, it boils down to extractively exploiting goodwill and despoiling the commons in order to convert VC dollars into penny-shavings.
Boilerplate is a cinch when you already know what to do.
I work in finance, I have for almost 20 years now. There are things in finance you do once every 5 years, like setting up a data source like Bloomberg in a new programming language. Now you know from the last time you did it that it’s a pain, you need to use a very low level api, handling all the tiny messages yourself, building up the response as it comes from the source in unordered packets. It’s asynchronous, there is a message queue, and what I specialize in is maths.
Now I could spend hours reading documents, putting crap together, and finally come up with some half baked code that ignores most possible error points.
Or I could use ChatGPT and leverage the fact that hundreds of implementations of the same module exist out there. And make something that just works.
That is the first ever coding question I asked an LLM and it literally saved me days of trial and error for something where my added value is next to zero.
Similarly I use LLMs a lot for small tasks that are in fact fairly difficult, and that don’t add any value to the solution. Things like converting data structures in an efficient way using Python idioms, or JavaScript 2023 features, that there is no way I can keep up with.
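The sort of conversion I mean is small stuff like this (the field names are invented), where the idiomatic form is easy to forget if Python isn't your day-to-day language:

    from collections import defaultdict

    trades = [
        {"id": "T1", "ccy": "EUR", "notional": 5_000_000},
        {"id": "T2", "ccy": "USD", "notional": 2_500_000},
        {"id": "T3", "ccy": "EUR", "notional": 1_000_000},
    ]

    # Index a list of records by key in one pass.
    by_id = {t["id"]: t for t in trades}

    # Group and sum by a field without reaching for pandas.
    notional_by_ccy = defaultdict(int)
    for t in trades:
        notional_by_ccy[t["ccy"]] += t["notional"]

    # Split records into parallel tuples (and zip(*) again to go back).
    ids, ccys = zip(*[(t["id"], t["ccy"]) for t in trades])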
The thing that makes an agent special is making some kind of effort to gather the relevant context before generating. The quality of the edits from the "agent" panel in Cursor/Copilot/etc is quite a bit higher than the type-ahead suggestions or the inline prompt stuff.
Bizarrely though, it seems to be limited to grep for the moment, doesn't work with LSP yet.
Plus, it looks like it just hard-coded values. I see this happen a lot with AI code. Even when I try to get it not to, it still tends to do it.
Issues like that are simple but just create debt. Sure, it "works" now, but who writes code not knowing that we're going to change things next week or next month? It's the whole reason we use objects and functions in the first place!
Yeah, only in Rust is the maximum value of an unsigned 16-bit integer 65535.
These aren't Rust-specific syntax foibles. It's not a borrow-checker mistake or anything. These are basic CS fundamentals that it's thoughtlessly fumbling.
> Most of what we write is not highly creative, but it is load-bearing, and it's full of choices.
The idea that you can't specify the load bearing pillars of your structure to the AI, or that it couldn't figure them out by specifying the right requirements/constraints, will not age well.
> The idea that you can't specify the load bearing pillars of your structure to the AI
But English is a subjective and fuzzy language, and the AI typically can't intuit the more subtle points of what you need. In my experience a model's output always needs further prompting. If only there were a formal, rigorous language to express business logic in! Some sort of "programming language."
One of the biggest anti-LLM arguments for me at the moment is security. In case you don't know, if you open a file containing secrets with Copilot or Cursor active, it might be sent to a server and thus get leaked. The companies say that if that file is in a cursorignore file it won't be indexed, but it's still a critical security issue IMO. We all know what happened with the "smart home assistants" like Alexa.
Sure, there might be a way to change your workflow and never ever open a secret file with those editors, but my point is that a software that sends your data without your consent, and without giving you the tools to audit it, is a no go for many companies, including mine.
It's why I use Aider, because it only operates on explicit files that you give it. Works great with OpenAI but if you are really worried, it interfaces perfectly with Ollama for local LLMs. A 12b model on my Mac does well enough for coding that it's serviceable for me.
At my day job, while someone was trying out Windsurf, it simply picked up an environment variable that contained sensitive data and used it in the code. This is wild.
Sometimes I wonder if all the hype around being left behind and needing to try these things just to see how great they can be is being deliberately put out there to increase their training data.
Too many vibe coders contribute trash code if any. They need more code from so called experts that isn't open source yet.
It's pretty unlikely someone at Cursor cares about accessing your Spring Boot project on GitHub through your personal access token – because they already have all your code.
I don't think that's the threat model here. The concern is regarding potentially sensitive information being sent to a third-party system without being able to audit which information is actually sent or what is done with it.
So, for example, if your local `.env` is inadvertently sent to Cursor and it's persisted on their end (which you can't verify one way or the other), an attacker targeting Cursor's infrastructure could potentially compromise it.
The problem with LLMs for code is that they are still way too slow and expensive to be generally practical for non-trivial software projects. I'm not saying that they aren't useful, they are excellent at filling out narrow code units that don't require a lot of context and can be quickly or automatically verified to be correct. You will save a lot of time using them this way.
On the other hand, if you slip up and give it too much to chew on or just roll bad RNG, it will spin itself into a loop attempting many variations of crap, erasing and trying over, but never actually coming closer to a correct solution, eventually repeating obviously incorrect solutions over and over again that should have been precluded based on feedback from the previous failed solutions. If you're using a SOTA model, you can easily rack up $5 or more on a single task if you give it more than 30 minutes of leeway to work it out. Sure, you could use a cheaper model, but all that does is make the fundamental problem worse - i.e. you're spending money but not actually getting any closer to completed work.
Yes, the models are getting smarter and more efficient, but we're still at least a decade away from being able to run useful models at practical speeds locally. Aggressively quantized 70b models simply can't cut it, and even then, you need something like 10k tps to start building LLM tools that can overcome the LLM's lack of reasoning skills through brute force guess and check techniques.
Perhaps some of the AI skeptics are a bit too harsh, but they're certainly not crazy in the context of breathless hype.
The reason that I personally don't use LLMs was not addressed by the article:
I haven't found a way to use it that makes me develop faster.
The article talks about "tedious code." If you need to generate a large static value table or something, then OK, an LLM might give you a really fast result and cut through the tedium. Most of us were already writing short scripts to do that (something like the sketch below). I'm open to the possibility that an LLM can do it faster. But it's such a rare requirement that the productivity gains are truly negligible here even if it can. And in those cases, it's obvious what the repetitive task needs to be. I often find myself writing the code by hand to be quicker than coming up with a prompt to get it to write the code that I then need to review for correctness.
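For example, the kind of throwaway script meant here - this one emits a CRC-32 lookup table (the standard reflected polynomial 0xEDB88320) as a Python literal you can paste into a source file:

    # Throwaway generator for a static CRC-32 lookup table, printed as a
    # Python literal ready to paste into a source file.
    def crc32_entry(i: int) -> int:
        c = i
        for _ in range(8):
            c = (c >> 1) ^ 0xEDB88320 if c & 1 else c >> 1
        return c


    table = [crc32_entry(i) for i in range(256)]

    print("CRC32_TABLE = [")
    for row_start in range(0, 256, 8):
        row = ", ".join(f"0x{v:08X}" for v in table[row_start:row_start + 8])
        print(f"    {row},")
    print("]")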
The article then mentions scaffolding. Things like "bookkeeping" when it comes to creating and setting up a new repo (whatever he means by that). This is why I have, historically, been a big fan of frameworks and generators. Point being, this is already a solved problem and I haven't found a way to further improve the state of this world with LLMs. LLMs might be an alternate tool that work just as well. But they haven't made my existing daily workflow any faster. Setting up new repos is also something that is done so rarely that even if an LLM netted a 100% increase in efficiency, it wouldn't really impact much.
I am an AI "skeptic" but I'm not a naysayer. I do use LLMs regularly. I just don't use them for developing code because I have yet to find a problem that they solve for me. Don't get me wrong, there are problems that they can solve... I just haven't come across any solutions to previously-unsolved problems. Meaning I can swap an existing solution for an LLM-based one... and it is a valid solution... but I don't observe any increase in productivity from doing so. The existing solution was already working fine.
I am genuinely looking forward to the day when this changes. When I identify a single existing problem without an existing solution that LLMs solve for me when developing software. I just have yet to come across one.
>I haven't found a way to use it that makes me develop faster.
I have pycharm pro with the ai. I can right click a function and there's 12 things AI can just do for you.
"find problems" where the linter has no concerns and it finds problems that dont show up from linter are shockingly good. This is like "Hey, you need to make sure this function is only running once at a time or it can easily overwhelm the cpu."
"write documentation" and it just fills in the documentation. I havent a clue what it writes and I wont be doing it.
"generate unit tests" is interesting? I have literally never written a unit test before and the code it produces is lengthy. Unit tests are a waste of time; but others disagree and so AI can produce it.
"explain code" look I wrote the code years ago half drunk, half high. No comments. It explains it all in depth and perfectly. Comments sometimes go stale after bug fixes and then make things confusing. Comments arent important when you have this beautifulness.
Let's go back to that CPU DoS example. I can then prompt Claude 3.5 Sonnet to give me a solution to fix it. It gives me this beautiful cache option; I could just hit apply and it'd be done. But then I said it's too complicated, give me something simpler, and it gave me a solution that is simpler.
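For what it's worth, the "simpler" kind of fix being described here is roughly this (a minimal sketch; the names are hypothetical, and it just stops the expensive function from running more than once at a time):

```python
# Hedged sketch: a single-flight guard so an expensive function can't be
# triggered concurrently and peg the CPU. Function and lock names are made up.
import threading

_rebuild_lock = threading.Lock()

def rebuild_report_cache():
    # Skip the call entirely if another thread is already rebuilding,
    # rather than queueing up N concurrent rebuilds.
    if not _rebuild_lock.acquire(blocking=False):
        return
    try:
        ...  # the expensive work goes here
    finally:
        _rebuild_lock.release()
```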
> And in those cases, it's obvious what the repetitive task needs to be. I often find writing the code by hand to be quicker than coming up with a prompt to get it to write the code that I then need to review for correctness.
It's all about the boring busywork crap that doesn't need to be done by me, but now gets done.
> "generate unit tests" is interesting? I have literally never written a unit test before and the code it produces is lengthy. Unit tests are a waste of time; but others disagree and so AI can produce it.
This statement worries me for a number of reasons.
First, I work on a massive codebase, with a large engineering organization, and I have seen a number of AI-generated unit tests. I have not seen a single example of an LLM-generated unit test that didn't contain a number of test anti-patterns - to the extent that I would argue they are creating a self-fulfilling prophecy. You said you think that unit tests are a waste of time. I would argue that they CAN be even worse than that.
The benefit of unit tests is that, at their best, they give you a safety net for refactoring existing code. If you change the implementation of a system under test, and the tests pass, you know you didn't introduce a breaking change.
But a lot of tests couple themselves tightly to the implementation details. Every single LLM-generated unit test I have observed in the wild introduces this anti-pattern. If you have a system under test, and changing the implementation of that system without breaking its behaviour causes a test to fail... that's called the "Fragile Test" problem. Now your unit test is not only failing to encourage you to refactor code... it's actively DISCOURAGING you from refactoring code. In this case, the unit test is providing DISVALUE rather than value.
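To make the "Fragile Test" point concrete, here's a minimal sketch (all names, including ShoppingCart and _apply_discounts, are hypothetical) of a test pinned to implementation details next to one that only checks behaviour:

```python
# Hypothetical example: the shop.cart module and its internals are invented
# purely to illustrate the anti-pattern.
from unittest.mock import patch
from shop.cart import ShoppingCart  # hypothetical module

# Fragile: pinned to implementation details. Renaming or inlining
# _apply_discounts breaks this test even if total() still behaves correctly.
def test_total_fragile():
    cart = ShoppingCart()
    cart.add("apple", price=2.0, qty=3)
    with patch.object(ShoppingCart, "_apply_discounts", return_value=6.0) as spy:
        assert cart.total() == 6.0
        spy.assert_called_once()

# Behavioural: only observable inputs and outputs, so it survives refactoring.
def test_total_behaviour():
    cart = ShoppingCart()
    cart.add("apple", price=2.0, qty=3)
    assert cart.total() == 6.0
```

The first test actively discourages refactoring; the second one is the safety net unit tests are supposed to provide.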
So the fact that a) you think unit tests are a waste of time and b) you look at AI as a way to save you from a chore ... tells me that you have no business ever writing unit tests, with or without AI. Please stop. You are making the world worse by leveraging an LLM to do these things for you.
I have NEVER looked at writing a unit test by hand as a "chore" or as a "waste of time." I often write my tests before even writing my implementation code, because doing so helps me think through both the design and requirements of my code... and gives me a little mini-sandbox context where I can make sure that the brand new code that I am writing is doing what I want it to. It's a problem solving tool. Not something to be done after the fact as a chore.
"Write Documentation" is not writing code. And if you don't read the documentation you're generating, no one else will. So what's the point of having it in the first place if no one reads it? Food for thought.
"Find problems" I see as being akin to a spell check, or the red squigglies when you have a syntax error. I do see the value of having ML tech within the internal IDE implementation. We've been using ML for email SPAM filters for decades. There are interesting problems that can be solved there. But this is an area where I want my IDE to flag something for me. It's not me actively prompting an LLM to generate code for me.
What about the privacy aspect and other security risks, though? So far, all the praise I hear about productivity comes from people using cloud-hosted models.
Claude, Gemini, Copilot and ChatGPT are non-starters for privacy-minded folks.
So far, local experiments with agents have left me underwhelmed. I've tried everything on ollama that can run on my dedicated Ryzen 8700G with 96GB DDR5. I'm ready to blow ~10-15k USD on a better rig if I see value in it, but if I extrapolate from current results, I believe it'll be another CPU generation before I can expect a positive productivity return from properly and securely run local models, once you factor in the setup and the surrounding meta-work.
> ...have policies saying that they will not train on your input if you are a paying customer.
Those policies are worth the paper they're printed on.
I also note that if you're a USian, you've almost certainly been required to surrender your right to air grievances in court and submit to mandatory binding arbitration for any conflict resolution that one would have used the courts for.
This is probably the biggest danger. Everyone is assuming optimization work reduces cost faster than these companies burn through capital. I'm half inclined to assume optimization work will do it, but it's far less obvious than they want to portray.
A Mac Studio with 512GB RAM starts at around $10k, and quantized DeepSeek R1 671B needs around 400GB of RAM, so it could work for your needs. It produced some outstanding code on many tasks I tried (and some not-so-outstanding code as well).
Am I right in assuming that running Linux (or anything other than macOS) on the Mac Studio is experimental at best?
I'd be looking for something that can run offline and receive system updates from an internal mirror on the air-gapped network. Needing to tie an Apple ID to the machine and allow it internet access for OS updates is a hard sell. Am I wrong in thinking that keeping an air-gapped macOS installation up to date would require additional infrastructure and some kind of enterprise contract with Apple?
My 5 cents would be that LLMs have replaced all those random generators (CSS, regex, etc.), Emmet-like IDE code completion/generation tools, as well as having to google for arbitrary code snippets which you'd just copy and paste in.
In no way can AI be used for anything larger than generating singular functions or anything that would require writing to or modifying multiple files.
Technically you might be able to pull off having AI change multiple files for you in one go, but you'll quickly run into a sort of "Adobe Dreamweaver" type of issue where your codebase is dominated by generated code which only the AI that generated it is able to properly extend and modify.
I remember when Dreamweaver was a thing: you essentially had to choose between sticking with it forever for a project or not using it at all, because it would basically convert your source code into its own proprietary format, it became so horribly messy and unreadable.
Regardless, AI is absolutely incredible and speeds up development a great deal, even if you only use it to generate small snippets at a time.
AI is also an absolute godsend for formatting and converting stuff from anything to anything - you could e.g. dump your whole database structure to Gemini and ask it to generate an API against it; a big task, but since it is basically just a conversion task, it will work very well.
Here are two routine problems I have to solve at the moment. Can any of the current LLM systems do either?
1. Input is an 256x256 pixel elevation map stored as a greyscale .png file, and a minimum and maximum elevation. A pixel value of 0 corresponds to the minimum elevation, and a pixel value of 255 corresponds to the maximum elevation. Read in the .png file and the elevation limits. Then construct a 256x256 floating point array of heights. From that array, construct a triangle mesh with X and Y dimensions 0..255. Perform a mesh reduction operation on the triangle mesh to reduce the number of triangles. Mesh reduction must not generate holes in the mesh. From the reduced mesh, generate a glTF file where the UV parameters run from 0.0 to 1.0 along the X and Y axes.
2. Given four glTF files constructed as above, corresponding to four quadrants of a larger square, construct a single 511x511 mesh which combines all four input meshes to cover a larger area.
Because the input meshes are 0..255, not 0..256, there will be gaps where the four quadrants meet. Fill those gaps with reasonable triangles. Perform a mesh reduction as above. From the reduced mesh, generate a glTF file where the UV parameters run from 0.0 to 1.0 along the X and Y axes.
Rust code is preferred; Python code is acceptable.
Tackling 1: No current LLM will be able to do that. Create your Rust project, figure out the dependencies that could be useful on your own, then start by asking which types can be expressed. Then after you have revised the types, figure out (with or without LLM) which functions have to be implemented. Then ask LLMs to implement these functions with tests. With some modifications, you should be able to have a final working solution.
> Tackling 1: No current LLM will be able to do that.
Github Copilot seems to be able to do it. Mostly because it knows how to call Meshlab. The documentation for Meshlab is hard to search, and having an LLM find the correct function is itself helpful.
Also, it looks like quadric decimation has improved since the last time I looked at it. That algorithm, if pushed too hard, will generate meshes with holes. Now there are options ("preservetopology") which supposedly prevent that. I last looked at that a few years ago, and the only mesh decimator which handled that correctly was the one inside Unreal Engine Editor.
I tried Github Copilot, free mode, on #1. The Python code outsourced all the hard work to numpy and pymeshlab, which is fine. Copilot wrote code to generate triangles. It's a reasonable job. Copilot's knowledge of what to call is better than mine.
I still have to actually run it, but it looks OK.
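For what it's worth, the overall shape of that numpy + pymeshlab approach to #1 looks roughly like this. This is only a sketch of what such a solution might look like, not Copilot's actual output: filter and parameter names vary between pymeshlab releases, and UV generation plus the final glTF export step are left out.

```python
# Rough outline: heightmap PNG -> grid mesh -> quadric decimation.
# Filter/parameter names are assumptions about the current pymeshlab API;
# treat this as a shape of the solution, not a drop-in implementation.
import numpy as np
import pymeshlab
from PIL import Image

def heightmap_to_decimated_mesh(png_path, z_min, z_max, target_faces=20_000):
    grey = np.asarray(Image.open(png_path).convert("L"), dtype=np.float64)
    heights = z_min + (grey / 255.0) * (z_max - z_min)

    h, w = heights.shape
    ys, xs = np.mgrid[0:h, 0:w]
    verts = np.column_stack([xs.ravel(), ys.ravel(), heights.ravel()])

    # Two triangles per grid cell.
    idx = np.arange(h * w).reshape(h, w)
    a, b = idx[:-1, :-1].ravel(), idx[:-1, 1:].ravel()
    c, d = idx[1:, :-1].ravel(), idx[1:, 1:].ravel()
    faces = np.vstack([np.column_stack([a, b, c]), np.column_stack([b, d, c])])

    ms = pymeshlab.MeshSet()
    ms.add_mesh(pymeshlab.Mesh(vertex_matrix=verts, face_matrix=faces))
    ms.meshing_decimation_quadric_edge_collapse(
        targetfacenum=target_faces,
        preservetopology=True,   # the option mentioned above: avoid opening holes
        preserveboundary=True,
    )
    ms.save_current_mesh("terrain.ply")  # hand off to a glTF exporter from here
    return ms
```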
On #2, it set up the problem, but bailed on the gap filling part with
# (This is a complex step; see note at bottom.)
That's not surprising, since I didn't tell it how to solve the problem. Can any of the premium systems do #2?
Presumably, you know something about this problem. If you use an LLM to craft a solution, that's probably fine, since you'd know how well you did.
I know nothing about this problem. I could use an LLM to delude myself into thinking that I do. Should my boss be very happy about that? They probably shouldn't be.
Weird to claim, as a selling point, that the LLM does all the boring learning and boilerplate for you, but then also insist we still need to responsibly read all the output, and that if you can't understand it, it's a "skill issue".
Also the emphasis on greenfield projects? Starting is by FAR the easiest part. That's not impressive to me. When do we get to code greenfield for important systems? Reminds me of the equally absurd example of language choice. You think you get to choose? What?
Imagine all the code these agents are going to pump out that can never be reviewed in a reasonable time frame. The noise generated at the whim of bike-shedding vibe coders is going to drown all the senior reviewers soon enough. I'll call that Cowboy Coders on Steroids. Anyone with skills will be buried in reviews, won't have time for anything else, and I predict stricter code gen policies to compensate.
I don't have a huge dog in this fight apart from AI advocates being annoying... but I would say that for greenfield projects the interesting thing is that I can get a skeleton of a working iOS app for something simple in like an hour of some copy/pasting stuff from ChatGPT. Instead of spending a good amount of time trying to get through learning material to do it.
It's nice to build throwaway things _so fast_, especially in the sort of fuzzy stuff like frontend where it's fine for it to be completely wrong. And then I can just use my own sense of how computers work to fix up what I care about, delete a bunch of what I don't care about... It's pretty amazing.
For existing projects I have only witnessed garbage output. I know people have success. I haven't seen it.
I have witnessed PMs taking a bullet pointed list of requirements and then use ChatGPT to generate paragraphs of text for some reason. You had the list!
This is just obviously not true. I had a full-time job of reviewing code for roughly 15 years and it was never true, but it's also just intuitively not true that engineers spend 10 hours reviewing their peers' code for every 1 they spend writing it.
What you mean to claim here is that verification is 10x harder than authorship. That's true, but unhelpful to skeptics, because LLMs are extremely useful for verification.
I really don't think that's true at all. Anyone here can read and sign off on 1000 lines of code faster than they can write the same 1000 lines of code, pretty much no matter what the code is.
The idea is that people will spend 10x more time reading your code in all future time stacked together. Not that reading and understanding your code once takes 10x the effort of writing it, which is obviously untrue.
Here is the quote from Clean Code, where this idea seems to originate from:
> Indeed, the ratio of time spent reading versus writing code is well over 10 to 1.
I think the point is that the person orchestrating the agent(s) reviews the code. It doesn't make sense to have 5 Juniors using agents to submit PRs and a senior or two reviewing it all. You just have the senior(s) orchestrating agents and reviewing it themselves. Maybe one or two juniors because we still need to train new devs, but maybe the junior doesn't even use an LLM. Maybe the junior writes code manually so they can learn programming properly before they start being an "AI lead".
Everyone is still responsible for the code they produce. I review my own PRs before I expect others to, and I don't even use AI. I think what the article describes seems interesting though.
Yeah, I remember five years ago (before being sold this latest VC crock of bull) when reviewing shitty thoughtless code was the tedious and soul-sucking part of programming.
This happened with the introduction of smartphones too. Every slashdot post had a haughty and upvoted ‘why would i want such a thing!!!’.
It was obviously huge. You could see it taking off. Yet a lot of people proudly displayed ignorance and backed each other up on it to the point that discussion around the topic was often drowned out by the opposition to change. Now today it takes minutes of playing with ai coding agents to realise that it’s extremely useful and going to be similarly huge.
I still think smartphones are a huge negative to humanity. They improve a narrow case: having access to ephemeral knowledge. Nobody writes articles or does deep knowledge work with smartphones.
My position with the AI is almost the same. It is overall a net negative for cognitive abilities of people. Moreover I do think all AI companies need to pay fair licensing cost to all authors and train their models to accurately cite the sources. If they want more data for free, they need to propose copyright changes retroactively invalidating everything older than 50 years and also do the legwork for limiting software IP to 5 to 10 years.
Agreed. Smartphones are portable, mobile computers that suck at every single aspect of being, and working as, a computer, except for mobility. They are not powerful, they are not general purpose, they are not ergonomic, they do not respect user freedom or privacy. Use such a mobile device only when you can't avoid it (i.e., when you are on the road -- when mobility is the single most important priority), and at no other time.
"Nobody writes articles or does deep knowledge work with smartphones."
I don't think that's true.
I do most of my reading on a smart phone - including wading through academic papers, or reading full books in the kindle app and jotting down notes in the digital margins.
A sizable number of my short form blog entries are written on my phone, and my long form writing almost always starts out in Apple Notes on my phone before transferring to a laptop.
Predictive text and voice dictation have got good enough now that I suspect there have been entire books written on mobile devices.
> It is overall a net negative for cognitive abilities of people.
I agree. A bunch of us here might use it to scaffold applications we already understand, use it as a rubber duck to help understand and solve new problems, research more effectively, or otherwise magnify skills and knowledge we already have in a manner that's directed towards improving and growing.
That's cool. That's also not what most people will do with it. A bunch of us are total nerds, but most of the world really isn't like that. They want more entertainment, they want problems solved for them, they want ease. AI could allow a lot of people to use their brains less and lose function far more. For the minority among us who use it to do more and learn more, great. That group is a tiny minority from what I can tell.
Take for example that a huge use case for generative AI is just... More sophisticated meme images. I see so much of that, and I'm really not looking for it. It's such an insane waste of cycles. But it's what the average person wants.
Smartphones were financially viable from day 1, though. I think LLMs will be used a lot and in a lot of places but the current level of investment they're getting right now feels out of line to me. Kind of like what I expect them to get in 10 years from now, when they're mature.
> Smartphones were financially viable from day 1, though.
Also, they were nothing more than the combination of two things which already existed, and which were already successful and financially viable: cellular phones and PDAs. In fact, I (and plenty of others I presume) already used that combination before smartphones: a PDA and a cellular phone, connected through Bluetooth to share the network connection.
Some days I still don't understand why anyone would want a smartphone. I think being connected all the time has a significant negative impact on mental health (I say, as I type this from a smartphone).
> This happened with the introduction of smartphones too. Every slashdot post had a haughty and upvoted ‘why would i want such a thing!!!’.
I am in my 40s; have never owned a smartphone and still can't imagine wanting one; am constantly surrounded by others who have them and thus am completely aware of their functionality; AMA.
> This happened with the introduction of smartphones too. Every slashdot post had a haughty and upvoted ‘why would i want such a thing!!!’.
This is a funny example because the devastating effects of smartphone addiction on society are now blatantly obvious. There are in fact very good reasons for not 'wanting such a thing'.
Don't get me wrong, LLMs can be incredibly useful and I think they deserve some of the current hype. Claiming that LLMs are useless is indeed silly and can be rightfully ignored. But there are serious concerns about potential (or actual) negative effects on society, and these should be taken seriously.
To be frank, I do think smartphones have made my life worse. I'd happily forego them if it were not for 2FA and how too many businesses expect I can receive texts.
The one main claim the article makes: Senior developers should not ignore the productivity gains from LLMs.
Best use of evidence is deductive: Lots of code is tedious and uninteresting -> LLMs are fast at generating lots of tedious code -> LLMs help productivity.
Weakest part of the argument: The list of rebuttals doesn't have an obvious organization to it. What exactly is the main argument they're arguing against?
It's not stated outright but because the post is bookended by references to 'those smarter than me', I think this is an argument against the shaming of developers using (and loving) LLM tools.
Which I think is fair.
Overall, the post did not add anything to the general discussion. But the popularity of the author (and fly.io posts) may make it a beacon for some.
> An LLM can be instructed to just figure all that shit out. Often, it will drop you precisely at that golden moment where shit almost works, and development means tweaking code and immediately seeing things work better.
Well, except that in order to fix that 1% you'd need to read and understand whatever the LLM did and then look for that 1%. I get the chills just thinking about this, whether the original programmer was human or not. I'd rather just write everything myself to begin with.
This is a lie, unless you are just speaking of the mechanics of cursorily skimming it. Reading code to understand it enough to commit it in my name is harder and slower than writing it myself.
On 'just read the code' - all well and good. Of course this implies insisting on team members who can read the code, will read the code and are empowered to read the code.
Otherwise orgs will try to hire pure vibe coders who aren't interested in that, and will only allow time for literally instructing agents to generate code, because it sounds cheaper and execs don't understand the nuance so long as it looks like product is shipping - until it all blows up and the one remaining senior developer on hand is expected to fix a prod issue buried in millions of lines of vibe-reviewed code ASAP.
On 'but it's cheaper than a junior': cloud hosted LLM systems are currently massively subsidised to an absurd degree. The cost side of things is all smoke and mirrors geared towards accelerated market adoption at all costs. It's not a profitable enterprise at the model development level.
At some point that AI economy is going to expect to make that money back, and future (especially near-future) hardware advancements don't explain where all of that is going to come from.
Let’s not conflate LLMs with AI. LLMs are a kind of AI that can be a software engineer’s assistant, at best. But the degree of hype is undermining belief in AI among many professionals who the hypesters claim are going to be replaced. No, this iteration is not going to replace doctors or engineers. But the degree to which the hype train wants to do so is alarming.
Have you tried creating your own programming language? How about solving unsolved frontier problems in mathematics? Ever written a book that won a Pulitzer prize? How many languages do you know?
As someone who was born ambitious I find this technology tepid at best.
> LLMs can write a large fraction of all the tedious code you’ll ever need to write. And most code on most projects is tedious.
You still need to read this tedious code to verify that it actually does what you want it to do. Given this, I'd much rather prefer to write the tedious code myself than having to make sense of someone else's tedious code.
I've reproduced this type of sentiment over several of my hobby projects, and it usually hits hard toward the latter half of the project, and it's usually related to bugs. In the beginning it's all great and fast with LLMs, but then there's a kind of steep drop in motivation as one realizes one has to dig just as deep into the tedious stuff anyway, because there are always bugs.
I'm working with Python and it does not work very well; LLMs usually generate at least an order of magnitude more code than what I would write.
That code often uses outdated practices, makes poor design choices, and does not follow hints like "write code in a way that reduces cognitive load", even when we explain what that means, i.e. keep the number of classes and functions small.
It's the complete opposite of OP's main language Go, as he says:
> Go has just enough type safety, an extensive standard library, and a culture that prizes (often repetitive) idiom. LLMs kick ass generating it.
Python is an interpreted, dynamically typed language and the static type checkers are not there yet (most popular 3rd-party libraries have no type hints, for example).
Also it allows for many different programming styles that the LLMs struggle to choose from.
1. Every extra line of code is much more risky.
2. It's much harder to verify the LLM's code.
On the other hand I think rust will be in a good place in regards to LLMs in the next few years thanks to the robustness of the language and the quality of its diagnostic messages. Those 2 attributes should compound very well.
I use Cursor with Python and it works great. Things you should try (a small sketch follows the list):
* use type hints everywhere.
* use dataclasses everywhere.
* add the dataclass files to your prompt.
* if the produced code is too complex, ask the LLM "Is this the simplest and cleanest way to solve this?"
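Something like this is the shape I mean: explicit type hints plus dataclasses, so the model has concrete shapes to work against (the order/line-item domain is made up purely for illustration):

```python
# Hedged sketch of a typed, dataclass-heavy module to include in the prompt.
# Names and the domain are hypothetical.
from dataclasses import dataclass, field
from decimal import Decimal


@dataclass(frozen=True)
class LineItem:
    sku: str
    quantity: int
    unit_price: Decimal


@dataclass
class Order:
    order_id: str
    items: list[LineItem] = field(default_factory=list)

    def total(self) -> Decimal:
        # Explicit types make it harder for the model to invent fields.
        return sum((i.unit_price * i.quantity for i in self.items), Decimal("0"))
```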
Question: how familiar are you with the technologies being used? In my experience, where AI has been useful so far is things I don't have a good understanding of; when I do, it's a different ball game, mostly because coding it directly seems faster, since I know exactly the behaviour I am looking for and I am not having to deal with unintended consequences.
I see it as the Cheshire Cat thing from Alice in Wonderland: when you don't know where you're going, any road will take you there. So it's been great for exploratory work and prototyping.
Yeah, I'm very familiar with the tech, I've been interested in games dev and web dev for a few decades now. So you could be right, that the models aren't ready to "play on their own" yet.
I tried doing a Warcraft 1 clone, but that felt too complex for the model being used (OpenAI's GPT-4.1). That model was just the default setting in Copilot.
I dug a little deeper this morning, and it turns out I hadn't actually enabled my copilot 'pro' mode, which has granted access to some more current or dev focused models. So I'll take them for a spin to see what they're capable of.
My goal here is to roughly get a sense for when a task is too complex for an "agent" to handle.
I also want to try adding custom tools to suit certain project needs. For example, Unreal Engine has a python editor interface, so I'd like to have an agent drive the editor to build something. I have my doubts.
Once I have a feeling for what level of complexity can be handled, I'll see if I can manage the tools better using this understanding, by breaking large and complex projects into appropriate chunks of work / complexity.
I will. I'm just trying to evaluate the tools on a medium size task. I'm trying copilot's agent mode with the goal of creating a warcraft 1 clone. Or something resembling it.
I think the key premise here is that one can effectively and efficiently audit code that the LLM is producing.
I doubt that. First, human attention and speed is very limited. Second, when I see something, I am already predisposed to assume that it is right (or at the very least, my subsequent inquiries are extremely narrow and anchored around the solution I have seen presented to me.)
Yeah. Reviewing code thoroughly is extremely time consuming. When you review code from a human, you can look for choices they made fairly quickly - say they used framework X and language feature Y. Most importantly, you assume that they’ve checked certain things to work. This way, code review can be fast but it still isn’t thorough on its own. Most of it is trust and bureaucracy (big companies do this also to prevent malicious employees from smuggling in backdoors etc).
Code from LLMs that looks right, clean and even clever poses as competence, but is prone to hallucinations and business logic errors. In the short term, these changes will pass through due to their appearance but contain more issues than a human would have with the same code. In the medium term, we just lose that signal - the assumptions we can make about the author's state of mind and comprehension. It’s already incredibly hard to distinguish solid points from nonsense, when the nonsense is laundered by an LLM.
Yeah the problem is the fatigue of repeatedly reviewing code.
You do a few iterations until code runs, review carefully but notice a bug. So you do another iteration and 40% of code changes. Now you need to review again but you need to understand how the changes fit in.
Repeat this a few times and it becomes very tiring.
Ultimately you can't trust them not to do stupid shit. Your tests fail and you tell it to stop that? Sure, we can just catch those exceptions and the tests pass, etc. You get pissed off and tell it to FIX the CODE so the tests pass, and the cycle continues.
It's like working with a potentially gifted moron.
Exactly. And even if LLMs become more accurate and qualitatively better, it just raises the cost of validation. Errors and bad design decisions become harder to spot.
A lot of people who are wary of LLMs aren’t against the technology itself, but rather the narrative surrounding it. You can take advantage of the tool while turning a blind eye to the discourse.
This 16-minute, expletive-filled, edgy-old-man-trying-too-hard-to-be-cool article could easily be dismissed as yet another AI creed that somehow found its way to the top of the HN front page.
When I made a mid-career change to COBOL programming in 1981, my first manager (a "suit," as programmers referred to them then) pointed to a book on his desk; the title, as best I can recall, was Programming Without Programmers. "You got in too late," he said. I retired from programming in 2010. The hype has a long history. I hope I'm around long enough to see how this plays out.
"Kids today don’t just use agents; they use asynchronous agents. They wake up, free-associate 13 different things for their LLMs to work on, make coffee, fill out a TPS report, drive to the Mars Cheese Castle, and then check their notifications. They’ve got 13 PRs to review. Three get tossed and re-prompted. Five of them get the same feedback a junior dev gets. And five get merged."
I would jump off a bridge before I accepted that as my full-time job.
I've been programming for 20+ years and I've never wanted to move into management. I got into programming because I like programming, not because I like asking others to write code on my behalf and review what they come up with. I've been in a lead role, and I certainly do lots of code review and enjoy helping teammates grow. But the last fucking thing I want to do is delegate all the code writing to someone or something else.
I like writing code. Yes, sometimes writing code is tedious, or frustrating. Sometimes it's yak-shaving. Sometimes it's Googling. Very often, it's debugging. I'm happy to have AI help me with some of that drudgery, but if I ever get to the point that I feel like I spend my entire day in virtual meetings with AI agents, then I'm changing careers.
I get up in the morning to make things, not to watch others make things.
Maybe the kind of software engineering role I love is going to disappear, like stevedores and lamplighters. I will miss it dearly, but at least I guess I got a couple of good decades out of it. If this is what the job turns into, I'll have to find something else to do with my remaining years.
> But the last fucking thing I want to do is delegate all the code writing to someone or something else
I talked about this a few days ago with coworkers and I phrased it almost exactly this way. I want to be an IC. I want to do the work. Becoming a manager-of-one to funnel all my work through would deprive me of genuinely my greatest joy in life.
I'm envious you managed a few good decades. I'm almost eight years into what has been my dream profession and to see indications it's going to morph from writing code into telling a robot to write code is just so demoralizing. I could have done this for decades. I wanted to do this for decades.
That snippet you quoted sounds like what some aristocratic fop who has never worked a day in their life would creative-write while LARPing what they think the poors' ideal workday sounds like, in an AI sales pitch meeting.
It rings about as authentic as "everybody stood up and clapped"
You say that, but I hear AI folks talk about that user experience all the time.
And at some level, it makes sense. There are a lot of extroverts out there for whom the ideal job really is feeling like an orchestra conductor, delegating to and coordinating an army of others. There is a great feeling of empowerment in watching a group build something bigger than you could have made on your own. And, if you're not someone who really likes to get their hands dirty and do the thing yourself, why not aspire towards a style of working that always feels that high level and powerful?
And, obviously, people who are wired that way are exactly the kind of people to move into management and executive roles and end up being in charge of and evangelizing AI to the world. So it's no surprise you hear people talking about this style of working all the time.
I'm 47. I've had the opportunity to move into management many times. I've been in a lead role enough to know the positive sides of that job and I deeply enjoy the soft skills part of working with a team.
I just know who I am and know that on days where I don't feel like I really made something concrete myself, I feel like I didn't live a real day.
I have a friend and coworker who is currently struggling with this thing and similar "not making anything" issues from having worked a "maintenance/support" job with little day-to-day making of things. He took a two-month leave. I really don't think he's coming back.
It is a loss for people he works with and the things he touches. Not to mention a loss for me, because he was the one person I found at my current job with the "make stuff, be useful" attitude. Already people who see that writing on the wall are dropping out.
Sooner or later, me too, if that is in fact the way it goes. The only thing that really keeps me going is that I don't have anything else to turn to, and I do have some cloudy optimism about getting a lot of money and satisfaction in a couple years to help clean up or rework the garbage-deluge.
Your other comment about extroversion makes me think of Gifts Differing (not the best book in the universe, but some good bits especially around intro/extroversion). I just want to hit flow state and smash my head into a problem that costs me sleep and come out on top bloodied but happy and feeling like I've done something. And this is/was a really great career for that.
> This faculty of concentration is likely to characterize the introverts' careers. Whereas extraverts tend to broaden the sphere of their work, to present their products early (and often) to the world, to make themselves known to a wide circle, and to multiply relationships and activities, the introvert takes the opposite approach. Going more deeply into their work, introverts are reluctant to call it finished and publish it, and when they do, they tend to give only their conclusions, without the details of what they did. This impersonal brevity of communication narrows their audience and fame, but saves them from overwhelming external demands and allows them to return to another uninterrupted stretch of work. As Jung is reported to have said, the introverts' activity thereby gains in depth and their labor has lasting value.
For me, switching careers after spending more than 20 years in this as well... is very hard. I spent my whole career outside of high-pay places like SV, telling myself "I have all the time in the world, I don't need to grab the most money as soon as possible", so retiring is not an option.
So, switch to what? Any well-paid profession is going to be under pressure to be LLMized as much as possible.
> I got into programming because I like programming, not because I like asking others to write code on my behalf and review what they come up with
oh finally someone else who didn't enter programming because, as 7-10 year old child, they were into SOLVING PRACTICAL PROBLEMS FOR PEOPLE.
> But the last fucking thing I want to do is delegate all the code writing to someone or something else
Thank God there is at least one other person that understands that the ratio between creative and reactive work is crucial for wellbeing at the job.
For crying out loud.
> but if I ever get to the point that I feel like I spend my entire day in virtual meetings with AI agents, then I'm changing careers
so am I.
> but at least I guess I got a couple of good decades out of it
Thanks for this perspective. Yes, at least we've got our memories, and the code locations and commits we recall from memory, from a distance of 10 or more years.
>. If this is what the job turns into, I'll have to find something else to do with my remaining years
> as 7-10 year old child, they were into SOLVING PRACTICAL PROBLEMS FOR PEOPLE.
Some of my fondest childhood memories are sitting in my school's resource center in front of a TRS-80, laboriously typing in some mimeographed BASIC code while wondering, "Is this the most efficient way I can increase shareholder value for the corporation?"
I'm with you two. I can work on boring problems in boring domains and enjoy the design and implementation aspect because it's inherently creative. Take away those and now my only job is the guy in Office Space who takes the requirements from the customers to the developers, if I'm lucky enough to have one at that point.
I don't want to have to change careers, as this is one that I've been working towards to some degree since I was a child. Including some intense work in college and some brutal first couple jobs where I worked hard to pick up marketable skills. Obviously the market doesn't care what I want, but I find the author of this piece to be a bit too flippant in his "but they take-rr jerbs" section. Working hard to get a well paying job only to have to start (likely near the bottom) in another career for much less pay is not something to treat lightly.
> I get up in the morning to make things, not to watch others make things.
If you ever followed down the rabbit hole of children content on YouTube, the one that infants and toddlers end up after an hour of autoplay with some innocent cartoon at the beginning, you'd find among the piles of morbid psychotic trash a variety of videos of children watching someone else playing some games, a whole class of videos in itself.
I can't explain this with anything else but a coordinated effort to flash the norm of watching someone/something doing something as opposed to being engaged in immediate action yourself into the firmware of future people. This lines up with many other current trends pretty well. Talk about incrementing Baudrillard's level of simulation a notch up.
- all creatives are flat against it because it’s destroying their income streams and outright stealing their intellectual property
- some technical leaders are skeptical because early returns were very bad and they have not updated their investigations to the latest tools and models, which are already significantly ahead of even six months ago
- a tech concern is how do we mentor new developers if they don’t know how to code or develop logic. LLMs are great IF you already know what you’re doing
- talent is deeply concerned that they will be reduced and replaced, going from high paying careers to fast food salaries
We have a lot of work to balance productivity with the benefits to society. “Let them eat cake,” is not going to work this time either.
I tend to agree with the gist of this, namely that the change is here and "AI" presents some huge potential to save me from having to do the tedious stuff. But we do need a bit of a sanity check. I'm in the middle of trying to convince any of the AIs that I have access to to write me some simple Lua. Specifically, I'm trying to write a query that returns a certain subset of pages from the Silverbullet V2 note tool. This isn't particularly difficult, but it's become this hilarious journey to try to get the AI to figure this problem out. Silverbullet is a niche use case, v2 even more so, but a reasonably skilled developer could peruse the API documentation and come up with a pretty good starting point. AI tools? Absolutely insane wrong answers. I finally specifically asked one or two of them not to guess with their answers and they just straight up said "nah, we don't know how to do this."
Point being: there's some real power in these tools, but if the ground is not already well-trodden, they risk creating a huge time sink that could be better served just learning to do the thing yourself.
Not the ones I’ve tried. They make a go of it but their results have been generally inferior, often wildly so. Because they can’t make intuitive connections as easily, I assume they need both the rules and examples of implementations to produce workable output.
Seeing everything these days being about vibe coding, I feel a little old with my VIM setup and my LSP servers, which I already thought were a nice productivity increase.
The problem I have with the stuff relating to MCP is that the tech around it is developing so fast that it's hard for outsiders to catch up with what the best working setup is, for example.
What would you do, for example, if you want to selfhost this?
- which models (qwen ai coder?)
- which api (with ollama? Bolt? Aider? Etc)
- how to integrate PRs with a local gitlab/gogs/forgejo instance? Do you need another MCP agent for git that does that?
- which hardware dependencies to run it?
I am currently trying to figure out how to implement a practical workflow for this. So far I'm still using a synchronous MCP agent setup that basically runs on another machine on the network, because my laptop is too underpowered to work with.
But how would I get to the point of async MCP agents that can work on multiple things in my Go codebases in parallel? With the mentioned PR workflows so that I can modify/edit/rework before the merges?
The author makes a lot of claims and keeps saying that their opponents in the argument are not talking about the same thing. But what exactly is that "same thing", and is it reproducible locally for everyone?
I can recommend govim, it's using LSP behind the scenes. I'd also recommend tagbar because it uses the newer ctags fork which supports many more languages.
I have https://github.com/prabirshrestha/vim-lsp in my minimal config and pretty much the default configuration for it. I only use LSPs for completion, auto imports, rename, and goto definition (less useful when the project is small) and that has been enough for me.
The self-hosted and vim / nvim / emacs integrated LLM / agentic / MCP situation is being implemented in real time as we speak. There's no good canonical solution right now, especially for self-hosting. If you are willing to make API calls to OpenAI or Microsoft, your options expand dramatically.
I think most people are using off the shelf paid solutions like cursor, windsurf, warp terminal, Claude code.
I'd love to get a set up with a locally hosted full version of deepseek.
I use vim as well, and one editor-agnostic coding assistant that is pretty decent is Aider. It is a little buggy sometimes, but it's been a blast to experiment with. I run it in a separate split with Tmux. Its whole schtick is to use it as a pair programming device. So I'll edit code in one split and use Aider in another, alternating at key points. This has worked pretty well for personal projects. I typically don't do this at $dayjob, though.
Can someone comment on the cost of running agentic models? Not for a company but for an individual. I tried "vibe coding" a personal project I was struggling with and left even more frustrated because I kept running into token rate limits with Claude (used inside of Zed if it matters). Did I pick the wrong model, the wrong editor, or do I just need to not be so tight with my money?
I tried Zed's agent support with Copilot. I was able to get it to implement a working feature with tests, but the code quality was poor and it took longer to do it than if I had written it myself.
I am wondering if maybe the average programmer is way slower and worse than I thought.
I haven't used Zed specifically, but were you using your own API key for Claude? If so, you were probably running into the Anthropic API rate limits [1]. You can either deposit more funds to move up the tier list, or access Claude via something like OpenRouter, which will give you much higher limits.
Machine translation and speech recognition. The state of the art for these is a multi-modal language model. I'm hearing impaired veering on deaf, and I use this technology all day every day. I wanted to watch an old TV series from the 1980s. There are no subtitles available. So I fed the show into a language model (Whisper) and now I have passable subtitles that allow me to watch the show.
Am I the only one who remembers when that was the stuff of science fiction? It was not so long ago an open question if machines would ever be able to transcribe speech in a useful way. How quickly we become numb to the magic.
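For anyone curious, the workflow is roughly this, assuming the open-source openai-whisper package (the file names are placeholders, and the exact segment fields can vary between versions):

```python
# Hedged sketch: transcribe an episode with Whisper and write an .srt file.
# File names are placeholders; segment dict keys reflect openai-whisper's output.
import whisper

def to_timestamp(seconds: float) -> str:
    # SRT timestamps look like HH:MM:SS,mmm
    ms = int(round(seconds * 1000))
    h, rem = divmod(ms, 3_600_000)
    m, rem = divmod(rem, 60_000)
    s, ms = divmod(rem, 1000)
    return f"{h:02}:{m:02}:{s:02},{ms:03}"

model = whisper.load_model("medium")
result = model.transcribe("episode01.mp4")

with open("episode01.srt", "w", encoding="utf-8") as srt:
    for i, seg in enumerate(result["segments"], start=1):
        srt.write(f"{i}\n")
        srt.write(f"{to_timestamp(seg['start'])} --> {to_timestamp(seg['end'])}\n")
        srt.write(seg["text"].strip() + "\n\n")
```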
That's not quite true. The state of the art in both speech recognition and translation is still a dedicated model trained for that task alone, although the gap is getting smaller and smaller, and it also heavily depends on who invests how much training budget.
The current best ASR model has 600M params (tiny compared to LLMs, and way faster than any LLM: 3386.02 RTFx vs 62.12 RTFx, much cheaper) and was trained on 120,000h of speech. In comparison, the next best speech LLM (quite close in WER, but slightly worse) has 5.6B params and was trained on 5T tokens, 2.3M speech hours. It has always been like this: for a fraction of the cost, you get a pure ASR model which still beats every speech LLM.
The same is true for translation models, at least when you have enough training data, so for popular translation pairs.
However, LLMs are obviously more powerful in what they can do beyond just speech recognition or translation.
The problem with Google-Translate-type models is the interface is completely wrong. Translation is not sentence->translation, it's (sentence,context)->translation (or even (sentence,context)->(translation,commentary)). You absolutely have to be able to input contextual information, instructions about how certain terms are to be translated, etc. This is trivial with an LLM.
(This is not the best example as I gave it free rein to modify the text - I should post a followup that has an example closer to a typical use of speech recognition).
Without that extra cleanup, Whisper is simply not good enough.
> However, LLMs are obviously more powerful in what they can do beyond just speech recognition
Unfortunately, one of those powerful features is "make up new things that fit well but nobody actually said", and... well, there's no way to disable it. :p
> Am I the only one who remembers when that was the stuff of science fiction?
Would you go to a foreign country and sign a work contract based on the LLM translation ?
Would you answer a police procedure based on the speech recognition alone ?
That to me was the promise of the science fiction. Going to another planet and doing inter-species negotiations based on machine translation. We're definitely not there IMHO, and I wouldn't be surprised if we don't quite get there in our lifetime.
Otherwise, if we're lowering the bar, speech-to-text has been here for decades, albeit clunky and power-hungry. So improvements have been made, but watching old movies is way too low-stakes a situation IMHO.
This is very dismissive and binary, and that's what this whole article is about. AI skeptics expect either AGI, or perfection across all use cases, or otherwise it's useless. STT, translation and TTS have come a really long way in the last 2 years. My mother, who doesn't speak English, finds it very useful when she visits my sister in the US. I find it super useful while travelling in Asia. Definitely much more useful than what we had in Google Translate.
I tried speech recognition many times over the years (Dragon, etc). Initially they all were "Wow!", but they simply were not good enough to use. 95% accuracy is not good enough.
Now I use Whisper to record my voice, and have it get passed to an LLM for cleanup. The LLM contribution is what finally made this feasible.
It's not perfect. I still have to correct things. But only about a tenth of the time I used to. When I'm transcribing notes for myself, I'm at the point I don't even bother verifying the output. Small errors are OK for my own notes.
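The cleanup step being described is roughly this, assuming the OpenAI Python client (the model name, prompt wording and function name are placeholders, not the commenter's actual setup):

```python
# Hedged sketch: pass a raw Whisper transcript through an LLM for light cleanup.
# Model name and prompt are illustrative assumptions.
from openai import OpenAI

client = OpenAI()

def clean_transcript(raw_text: str) -> str:
    prompt = (
        "Clean up this dictated transcript: fix punctuation, casing and obvious "
        "mis-recognitions, but do not add or remove information.\n\n" + raw_text
    )
    resp = client.chat.completions.create(
        model="gpt-4o-mini",  # placeholder; any capable model works
        messages=[{"role": "user", "content": prompt}],
    )
    return resp.choices[0].message.content.strip()
```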
Have they solved the problem of Whisper making up plausible sounding junk (e.g. such that reading it you would have no idea it was completely hallucinated) when there is any silence or pause in the audio?
I completely agree that technology in the last couple years has genuinely been fulfilling the promise established in my childhood sci-fi.
The other day, alone in a city I'd never been to before, I snapped a photo of a bistro's daily specials hand-written on a blackboard in Chinese, copied the text right out of the photo, translated it into English, learned how to pronounce the menu item I wanted, and ordered some dinner.
Two years ago this story would have been: notice the special board, realize I don't quite understand all the characters well enough to choose or order, and turn wistfully to the menu to hopefully find something familiar instead. Or skip the bistro and grab a pre-packaged sandwich at a convenience store.
> I snapped a photo of a bistro's daily specials hand-written on a blackboard in Chinese, copied the text right out of the photo, translated it into English, learned how to pronounce the menu item I wanted, and ordered some dinner.
> Two years ago
This functionality was available in 2014, on either an iPhone or android. I ordered specials in Taipei way before Covid. Here's the blog post celebrating it:
This is all a post about AI, hype, and skepticism. In my childhood sci-fi, the idea of people working multiple jobs and still not being able to afford rent was written as shocking or seen as dystopian. All this incredible technology is a double-edged sword, but it doesn't solve the problems of the day, only the problems of business efficiency, which exacerbates the problems of the day.
>The other day, alone in a city I'd never been to before, I snapped a photo of a bistro's daily specials hand-written on a blackboard in Chinese, copied the text right out of the photo, translated it into English, learned how to pronounce the menu item I wanted, and ordered some dinner.
To be fair, dedicated apps like Pleco have supported things like this for 6+ years, but the spread of modern language models has made it more accessible.
Definitely not. I took this same basic idea of feeding videos into Whisper to get SRT subtitles and took it a step further to make automatic Anki flashcards for listening practice in foreign languages [1]. I literally feel like I'm living in the future every time one of those cards from whatever silly Finnish video I found on YouTube pops up in my queue.
These models have made it possible to robustly practice all 4 quadrants of language learning for most common languages using nothing but a computer, not just passive reading. Whisper is directly responsible for 2 of those quadrants, listening and speaking. LLMs are responsible for writing [2]. We absolutely live in the future.
Hi Andrew, I've been trying to get a similar audio language support app hacked together in a podcast player format (I started with Anytime Player) using some of the same principles in your project (transcript generation, chunking, level & obscurity aware timestamped hints and translations).
I really think support for native content is the ideal way to learn for someone like me, especially with listening.
Translation seems like the ideal application. It seems as though an LLM would truly have no issues integrating societal concepts, obscure references, pop culture, and more, and be able to compare it across culture to find a most-perfect translation. Even if it has to spit out three versions to perfectly communicate, it’s still leaps and bounds ahead of traditional translators already.
> it’s still leaps and bounds ahead of traditional translators already
Traditional machine translators, perhaps. Human translation is still miles ahead when you actually care about the quality of the output. But for getting a general overview of a foreign-language website, translating a menu in a restaurant, or communicating with a taxi driver? Sure, LLMs would be a great fit!
> It seems as though an LLM would truly have no issues integrating societal concepts, obscure references, pop culture, and more, and be able to compare it across culture to find a most-perfect translation.
Somehow LLMs can't do that for structured code with well defined semantics, but sure, they will be able to extract "obscure references" from speech/text
I started watching Leverage, the TV show, on Amazon, and the subtitles in the early series are clearly AI generated or just bad by default.
I use subtitles because I don’t want to micromanage the volume on my TV when adverts are forced on me and they are 100x louder than what I was watching.
Old TV series should have closed captions available (which are apparently different from subtitles), however the question of where to obtain them, aside from VHS copies, might be difficult to answer.
Many DVDs of old movies and TV shows may contain the closed captions, but they are not visible through HDMI. You have to connect your DVD player to your TV via the composite video analogue outputs.
Last time I used Whisper with a foreign language (Chinese) video, I’m pretty sure it just made some stuff up.
The captions looked like they would be correct in context, but to the best of my ability I could not verify them against snippets of manually checked audio.
I tried whisper with a movie from the 60's and it was a disaster.
Not sure if it was due to the poor quality of the sound, the fact people used to speak a bit differently 60 years ago or that 3 different languages were used (plot took place in France during WW2).
I feel you. In the late 00's/early 10's, downloading and getting American movies was fairly easy, but getting the subtitles was a challenge. It was even worse with movies from other regions. Even now I know people who record conversations to be replayed through Whisper so they can get 100% of the info from them.
Disclaimer: I'm not praising piracy, but outside of US borders it's a free-for-all.
Using AI to generate subtitles is inventive. Is it smart enough to insert the time codes such that the subtitle is well enough synchronised to the spoken line?
As someone who has started losing the higher frequencies and thus clarity, I have subtitles on all the time just so I don't miss dialogue. The only pain point is when the subtitles (of the same language) are not word-for-word with the spoken line. The discordance between what you are reading and hearing is really distracting.
This is my major peeve with my The West Wing DVDs, where the subtitles are often an abridgement of the spoken line.
> Is it smart enough to insert the time codes such that the subtitle is well enough synchronised to the spoken line?
Yes, Whisper has been able to do this since the first release. At work we use it to live-transcribe-and-translate all-hands meetings and it works very well.
I don't think you are also including having AI lie or "hallucinate" to us, which is an important point even if the article is only about having AI write code for an organization.
What is the relevance of this comment? The post is about LLMs in programming. Not about translation or NLP, two things transformers do quite well and that hardly anyone contests.
Everything about that is true but, and that's a big BUT, the code I write with an LLM I can only iterate on with an LLM.
My mind doesn't develop a mental model of that code, I don't know where the relevant parts are, I can't quickly navigate through it and I have to reach the LLM for every small change.
Which is why I like Copilot style editing more than agents as a working model but agents are just so much more powerful and smarter thanks to everything available to them.
This is in line with "Programming As Theory Building"[0] (Peter Naur, 1985) that puts forward the idea that when one is building a system one builds a "theory" (in a sense a mapping of what, where and why) of that system that is essential for continued work on it. Once I internalized this idea it helped me make sense of why rewrites are not "never a good idea" and simultaneously why they are so fraught with peril, as well as what kind of onboarding actually makes sense for work on a system. It's also why I can never get on board with the passive "review code an LLM spits out" type of development; it just does not build theory as reliably, effectively and efficiently.
It's a good, very digestible paper, in my opinion; well worth the read.
This is exactly right. This is what's happened with every vibe coded codebase I've made. As a result, I only selectively use LLMs now with sections of the codebase. Or I'll just relinquish control to the LLM. I think the latter will happen en masse, which is scary.
You can ask the LLM to write code the way you think about it. I usually write a little spec document as my first prompt, and in there I'll add some rules on how I want code structured, separated, etc.
If you use Cursor, cursorrules files can be a good way of doing this.
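As an illustration, such a spec prompt or cursorrules file might contain rules along these lines (these specific rules are invented, not anyone's actual setup):

    - Keep modules under ~300 lines; split UI, data access, and business logic into separate files.
    - Prefer small, pure functions; helpers should have no side effects.
    - Reuse the project's existing error-handling wrapper rather than ad-hoc try/except blocks.
    - Never add a new dependency without asking first.
    - Every new public function gets a short docstring and a unit test.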
This still doesn't work for me. I have to actively think about the problem, find a solution, and write the code myself. Outsourcing these efforts makes my learning ineffective.
I will keep using LLMs for one-week greenfield hobby projects that I don't plan to revisit. But no matter how good LLMs get, I will never use them in my day job; otherwise, I risk losing touch with the codebase.
> This was the craftsman’s ‘Golden Age’ and much time and trouble was taken over the design of tools. Craftsmen were being called upon to do more skilful and exacting work and the use of tools and the interest in development had become very widespread.
Above pulled from A Brief History of the Woodworking Plane [0]. A woodworking tool that has evolved over 2,000 years. Now there are electric planers, handheld electric planers and lots of heavy machinery that do the same thing in a very automated way. If a company is mass producing kitchen cabinets, they aren't hand planing edges on boards, a machine is doing all that work.
I feel like with AI we are on the cusp of moving beyond a "Golden age" and into an "industrial age" for coding, where it will become more important to have code that AI understands vs. something that is carefully crafted. Simple business pressure will demand it (whether we like it or not).
^ A comment I made just yesterday on a different thread.
For software developers, AI is like the machine that lets a cabinet maker properly mill and produce cabinet panels. Sure, you can use a hand plane to do that, but you're producing a very different product, and likely one that not many people will care about, possibly not even your employer once they see all the other wood shops pumping out cabinetry and taking their market share.
The difference is that programming is creating the design for the furniture and the process to assemble it using machines. Compilation and deployment are the factory in this case. So you're trying to automate away the design aspect when we already have the other aspects automated.
I think the hardest part is not spending the next 3 months of my life in a cave finishing all the hobby/side projects I didn't quite get across the line.
It really does feel like I've gone from being one senior engineer to a team that has a 0.8 senior engineer, five juniors, and one dude who spends all his time digging through poorly documented open source projects and documenting them for the team.
Sure I can't spend quite as much time working on hard problems as I used to, but no one knows that I haven't talked to a PM in months, no one knows I haven't written a commit summary in months, it's just been my AI doppelgangers. Compared to myself a year ago I think I now PERSONALLY write 150% more HARD code than I did before. So maybe, my first statement about being 0.8 is false.
I think of it like electric bikes: there seems to be some indication that people with electric-assist bikes actually burn more calories, spend more time riding, and go farther than those who have manual bikes: https://www.sciencedirect.com/science/article/abs/pii/S22141....
I don't know what you're posting, but if it's anything like what I see being done by GitHub Copilot, your commit messages are junk and you're wasting everyone's time.
Try Cubic, which is a GitHub add-on. It's really good at writing GitHub commit messages and also surfaces bugs fairly reliably (it adds PR comments). Not affiliated, just a user.
If you've ever run or been part of a team that does thorough, multi-party, pull request reviews you know what I am talking about.
The only part I don't automate is the pull request review (or patch review, pre-commit review, etc., before git); that's always been the line to hold for protecting codebases with many contributors of varying capability, and this is explicitly addressed in the article as well.
You can fight whatever straw man you want. Shadowbox the hypotheticals in your head, etc. I don't get all these recent and brand new accounts just straight up insulting and insinuating all this crap all over HN today.
For me, the electric bike analogy works differently: it enables people to ride, regularly, who would not be able to do that with traditional bikes. That's totally fine. But electric bikes don't threaten to take away our normal bikes.
By the reply count it appears this post was excellent at trolling HN's core audience :-).
The author may be right, but his preface "look I'm one of you" was in a very different tone to the rest of his "now shut up and get to work" rant.
I do use the AI tools to some extent, if for no other reason than that they are currently the path of least resistance, and Google and friends have lately played themselves out of the game.
He is probably right that we should get acquainted with using agents before dismissing them :-).
The article is chefs_kiss.jpg. The author is core-hn, in spirit and in karma.
IMO that tone is exactly what's needed to set right the entitlement that we SWEs have developed over the last decade. The key line in the article gets to the heart of the actual concern developers have:
> LLMs really might displace many software developers. That’s not a high horse we get to ride. Our jobs are just as much in tech’s line of fire as everybody else’s have been for the last 3 decades.
As someone who has spent the better part of today fixing the utter garbage produced by repeated iteration with these supposedly magical coding agents, I'm neither in the camp of the "AI skeptic" (at least as defined by the author), nor am I in the camp of people who thinks these things can "write a large fraction of all the tedious code you’ll ever need to write."
Maybe I'm doing it wrong, but I seem to have settled on the following general algorithm:
* ask the agent to green-field a new major feature.
* watch the agent spin until it is satisfied with its work.
* run the feature. Find that it does not work, or at least has major deficiencies [1]
* cycle through multiple independent iterations with the agent, doing something resembling "code review", fixing deficiencies one at a time [2]
* eventually get to a point where I have to re-write major pieces of the code to extract the agent from some major ditch it has driven into, leading to a failure to make forward progress.
Repeat.
It's not that the things are useless or "a fad" -- they're clearly very useful. But the people who are claiming that programmers are going to be put out of business by bots are either a) talking their book, or b) extrapolating wildly into the unknown future. And while I am open to the argument that (b) might be true, what I am observing in practice is that the rate of improvement is slowing rapidly, and/or the remaining problems are getting much harder to solve.
[1] I will freely grant that at least some of these major deficiencies typically result from my inability / unwillingness to write a detailed enough spec for the robot to follow, or to anticipate every possible problem with the spec I did bother to write. 'Twas ever thus...
[2] This problem is fractal. However, it's at least fun, in that I get to yell at the robot in a way that I never could with a real junior engineer. One Weird Fact about working with today's agents is that if you threaten them, they seem to do better work.
Even on stuff it has no chance of doing on its own, I find it useful to basically git reset repeatedly and start with more and more specific instructions. At the very least it helps me think through my plan better.
Yeah...I've toyed with that, but there's still a productivity maximum where throwing it all away and starting from scratch is a worse idea, probabilistically, than just fixing whatever thing is clearly wrong.
Just to make it concrete, today I spent a few hours going through a bunch of HTML + embedded styles and removing gobs and gobs of random styles the LLMs glommed on that "worked", but was brittle and failed completely as soon as I wanted to do something slightly different than the original spec. The cycle I described above led to a lot of completely unnecessary markup, paired with unnecessary styles to compensate for the crappiness of the original DOM. I was able to refactor to a much saner overall structure, but it took some time and thinking. Was I net ahead? I don't really know.
Given that LLMs almost always write this kind of "assembled from StackOverflow" code, I have precisely 0% confidence that I'd end up in a better place if I just reset the working branch and started from scratch.
It kind of reminds me of human biology -- given billions of years of random evolution you can end up with incredible sophistication, but the end result will be incomprehensible and nearly impossible to alter.
> eventually get to a point where I have to re-write major pieces of the code to extract the agent from some major ditch it has driven into, leading to a failure to make forward progress.
As it stands AI can't even get out of Lt Surge's gym in Pokemon Red. When an AI manages to beat Lance I'll start to think about using it for writing my code :-)
My workflow is similar. While the agent is running, I browse the web or day dream. If I'm lucky, the agent produced correct code (after possibly several cycles). If I'm not, I need to rewrite everything myself. I'm also not in any camp and I genuinely don't know if I'm more or less productive overall. But I think that a disciplined use of a well-integrated agent will make people more productive.
Results can vary significantly, and in my experience, both the choice of tools and models makes a big difference.
It’s a good idea to periodically revisit and re-evaluate AI and tooling. I’ve noticed that many programmers tried AI when, for example, GPT-3.5 was first released, became frustrated, and never gave it another chance—even though newer models like o4-mini are now capable of much more, especially in programming tasks.
AI is advancing rapidly. With the latest models and the right tools, what’s possible today far exceeds what was possible even just a short time ago (3-12 months ago even).
Take a look at Cursor, Windsurf, Roo Code, or Aider to "feed" AI with code, and take a look at models like Google Gemini 2.5 Pro, Claude Sonnet 4, and OpenAI o4-mini. Also educate yourself about agents and MCP. Soon that will be standard for many, if not all, programmers.
Absurd take. Speed is not the issue! Optimising for speed of production is what got us into the utter quagmire that is modern software.
Lack of correctness, lack of understanding and ability to reason about behaviour, and poor design that builds up from commercial pressure to move quickly are the problems we need to be solving. We’re accelerating the rate at which we add levels to a building with utterly rotten foundations.
Why is this line of thinking so common with AI folk? Is it just inconceivable to you that other people have different experiences with a technology that has only become widespread in the past couple of years and that by its very nature is non-deterministic?
As if going faster is the only goal of a programmer.
A simulation I worked on for two months was, in total, 400 lines of code. Typing it out was never the bottleneck. I need to understand the code so that when I am studying it for the next month and a half I can figure out whether the problem is a bug in my code or the underlying model being wrong.
I've not yet been in a position where reading and cleaning up the LLM's bad code was faster and/or produced better code than if I wrote it by hand. I've tried. Every time someone comes up and says "yeah, of course, you're not using GPT4.7-turbo-plus-pro," I go and give the newfangled thing a spin. Nope, hasn't happened yet.
I admit my line of work may not be exactly generic CRUD work, but then again, if it's not useful for anything even one step above implementing a user login for a website, is it really going to take over the world and put me out of a job in 6 months?
It is absolutely hilarious to read the responses from people who can't use AI attempting to justify their ability to code better than AI. These are the people who will be replaced. They are fighting so hard against it instead of learning how to use it.
“I wrote 400 lines of code I don't understand and need months to understand it, because AI obviously can't understand it or break it down and help me document it”
“Speed is what caused problems! Because I don't know how to structure code and get AI to structure it the same way, it's obviously going rogue and doing random things I cannot control, so it's wrong and causing a mess!!!”
“I haven't been able to use it properly and don't know how to rein it in to do specific tasks, so it produces a lot of stuff that takes me ages to read! I could have written it faster!!!”
I would love to see what these people are doing 1-2 years from now, whether it eventually clicks for them or they're unemployed, complaining AI took their jobs.
I suspect a large proportion of claims made for productivity increases are skewed by the fact that the speed at which code is produced by AI makes you _feel_ productive, but these gains are largely replaced by the effort to understand, refactor, review and clean up the code. The high that you get when something "works" tends to stick more in your memory than the time when you had to spend a day cleaning up dead code, refactoring 2k line modules into a more readable project structure etc.
I'm not saying that AI can't make you productive, it's just that these claims are really hard to verify. Even the recently posted Cloudflare OAuth worker codebase took ~3 months to release (8 Mar - 20 May), producing a single file with >2k lines. Is that going to be harder to maintain than a codebase with a proper project structure that's easily parseable by a human?
Indeed the speed with which we were able to turn this around was critical to us, as it allowed us to have our Remote MCP framework ready immediately when the spec was finalized, which led to quite a few companies building MCP servers on Cloudflare: https://blog.cloudflare.com/mcp-demo-day/
I'm not an AI maximalist. I still write lots of code by hand, because there's a lot AI isn't good at. It's good at boilerplate and straightforward code, it's bad at refactoring deep systems. But AI assistance was undeniably a huge win for the OAuth project. There's no way I could have written that library by hand so quickly. (Maybe when I was 25 and had no responsibilities, but these days I have like 1 solid day a week to actually write code...)
Ok sorry to get abstract but to me what you are talking about is differentiating between understanding and correctness. We as humans, for now, need to understand the code and that's not easily transmitted from the output of some AI. In fact, that's a hard problem. But I don't think it's impossible for AI to assist humans with that. The AI could help walk humans through the code to understand quickly what's going on. Maybe ultimately the issue here is trust. Do we trust the AI to write code. Maybe we spend more time trying to verify it for now. I think that shows we place a lot of trust in humans to write code. Maybe that changes.
This is cope. I know my own experience, and I know the backgrounds and problem domains of the friends I'm talking to that do this stuff better than I do. The productivity gains are real. They're also intuitive: if you can't look at a work week and spot huge fractions of work that you're doing that isn't fundamentally discerning or creative, but rather muscle-memory rote repetition of best practices you've honed over your career, you're not trying (or you haven't built that muscle memory yet). What's happening is skeptics can't believe that an LLM plus a couple hundred lines of Python agent code can capture and replicate most of the rote work, freeing all that time back up.
Another thing I think people are missing is that serious LLM-using coders aren't expecting 100% success on prompts, or anything close to it. One of the skills you (rapidly) develop is the intuition for when to stop a runaway agent.
If an intern spun off hopelessly on a task, it'd be somewhat problematic, because there are finite intern hours and they're expensive. But failed agent prompts are nickel-denominated.
We had a post on the front page last week about someone doing vulnerability research with an LLM. They isolated some target code and wrote a prompt. Then they ran it one hundred times (preemptively!) and sifted the output. That approach finds new kernel vulnerabilities!
Ordinary developers won't do anything like that, but they will get used to the idea of only 2/3 of prompts ending up with something they merge.
Another problem I think a lot of skeptics are running into: stop sitting there staring at the chain of thought logs.
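For readers wondering what "a couple hundred lines of Python agent code" looks like in miniature, here is an illustrative sketch of the core loop such a harness might use; the model name, prompts, and the single shell tool are assumptions for the example, not anyone's actual setup:

    import json, subprocess
    from openai import OpenAI

    client = OpenAI()

    def run_shell(command: str) -> str:
        # Run a shell command in the project checkout; return combined output, truncated.
        proc = subprocess.run(command, shell=True, capture_output=True, text=True, timeout=120)
        return (proc.stdout + proc.stderr)[-4000:]

    tools = [{
        "type": "function",
        "function": {
            "name": "run_shell",
            "description": "Run a shell command in the project checkout.",
            "parameters": {
                "type": "object",
                "properties": {"command": {"type": "string"}},
                "required": ["command"],
            },
        },
    }]

    messages = [
        {"role": "system", "content": "You are a coding agent. Edit files and run tests via shell commands."},
        {"role": "user", "content": "Add a --verbose flag to cli.py and make the test suite pass."},
    ]

    for _ in range(20):  # hard cap: failed prompts are cheap, runaway ones are not
        resp = client.chat.completions.create(model="gpt-4o", messages=messages, tools=tools)
        msg = resp.choices[0].message
        messages.append(msg)
        if not msg.tool_calls:
            print(msg.content)  # the agent believes it is done; now you review the diff
            break
        for call in msg.tool_calls:
            args = json.loads(call.function.arguments)
            output = run_shell(args["command"])
            messages.append({"role": "tool", "tool_call_id": call.id, "content": output})

The hard part, as the comment above says, is not the loop; it is the intuition for when to kill it and how much of its output you actually merge.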
> if you can't look at a work week and spot huge fractions of work that you're doing that isn't fundamentally discerning or creative, but rather muscle-memory rote repetition of best practices you've honed over your career, you're not trying (or you haven't built that muscle memory yet). What's happening is skeptics can't believe that an LLM plus a couple hundred lines of Python agent code can capture and replicate most of the rote work, freeing all that time back up.
No senior-level engineer worth their salt, and in any kind of minimally effective organization, is spending any meaningful amount of their time doing the rote repetition stuff you're describing here. If this is your experience of work then let me say to you very clearly: your experience is pathological and non-representative and you need to seek better employment :)
The thing I don't understand is that you keep bringing up your friends' experience in all your responses and in the blog itself. What about your own experience, and the success rate and productivity gain that you've observed with AI agents? It feels like you yourself aren't confident in your gains and have to bring up second-hand experience from your friends to prop up your arguments.
Regardless of what people say to you about this, most (all?) undergraduates in CS programs are using LLMs. It's extremely pervasive. Even people with no formal training are using AI and vercel and churning out apps over the weekend. Even if people find reasons to dislike AI code writing, culturally, it's the future. I don't see that changing. So either a huge percent of people writing code are doing it all wrong or times are changing.
I think it has a lot to do with the type of work you are doing. I am a couple of years into a very small startup that has some actual technology built (as opposed to a really simple CRUD app or something).
When I am working on the front-end, where things are pretty simple, AI is a huge speed-up. What it does VERY well is latch on to patterns and then apply those patterns to other things. If it has a couple of examples you can point it to and say "ok, build that, but over here," the newest revisions of Claude and Gemini are perfectly capable of building the whole thing end to end. Because it's a fairly repetitive task, I don't have to spend much time untangling it. I can review it, pattern-match against things that don't look right, and then dive into those.
For a real example, I needed a page for a user to manually add a vendor in our platform. A simple prompt asking Claude to add a button to the page sent it into a mode where it added the button, built the backend handler, added the security checks, defined a form, built another handler to handle the submitted data, and added it to the database. It even wrote the ACL correctly. The errors it introduced were largely around using vanilla HTML in place of our standard components and some small issues with how it attempted to write to the DB using our DB library. This saved me a couple of hours of typing.
Additionally, if I need to refactor something, AI is a godsend. Just today an underlying query builder completely changed its API and broke... everything. Once I identified how I wanted to handle the changes and wrote some utilities, I was able to have Claude find everything everywhere and make those same changes. It did it with something like 90% accuracy. Once again, that saved me a couple of hours.
Where it fails, usually spectacularly, is when we get to the stuff that is new or really complex. If it doesn't have patterns to latch onto it tries to invent them itself and the code is garbage. Rarely does it work. Attempting to vibe code it with increasingly more pointed prompts will often result in compiling code but almost never will it do the thing I actually wanted.
In these contexts its usefulness is mostly things like "write a SQL query to do X," which occasionally surfaces a technique I hadn't thought about.
So my experience is pretty mixed. I am definitely saving time. Most of it is typing time not thinking time. Which is like 1/3 of my average day. If I had to guess I am somewhere in the neighborhood of 30-40% faster today than I was in 2019. Notably that speed up has allowed me to really stretch this funding round as we are well past the phase where we would have typically hired people in my past companies. Usually someone relatively mid-level to take over those repetitive tasks.
Instead it's just me and a non-technical founder going along super quickly. We will likely be at a seed round before anyone new comes in.
> But all day, every day, a sizable chunk of the front page of HN is allocated to LLMs: incremental model updates, startups doing things with LLMs, LLM tutorials, screeds against LLMs. It’s annoying!
You forgot the screeds against the screeds (like this one)
Let's just imagine we're critiquing cars or planes in about 1910. They're so flawed. No one can say with certainty whether or how they will improve.
Side note of interest, from Wikipedia:
"Flying Machines Which Do Not Fly" is an editorial published in the New York Times on October 9, 1903. The article incorrectly predicted it would take one to ten million years for humanity to develop an operating flying machine.
I think we have heated debate because most people don't explain what their `cost function` is. Some people, when they talk about AI, fall into binary reasoning: if something is not perfect today, then it will never be perfect and will never improve. For others, it's simply that they see something useful today, expect it to get better next year, and have no expectation of getting AGI.
Your reply is the equivalent of someone expecting AGI in the next decade. The same goes for discussions about whether AI will take software dev jobs. Some just see all the flaws in AI and conclude their job is secure. Others see that they are 2x as productive and that, potentially, a teammate is no longer needed. If AI can eliminate 50% of IT jobs in ~10-20 years, then that's still job replacement. When we replaced horses with cars, that didn't mean we had no horses, or that nobody rides horses today.
If we had funded that as much as we're currently funding AI I think it would have been a plausible goal. Keep in mind we cut more than half of NASA's budget after we first landed.
It's a good idea to think of a converse situation, but this is a bad example. The constraint was not about technology but about budget, perceived benefits and political will.
I'm mostly skeptical about AI capabilities but I also think it will never be a profitable business. Let's not forget AI companies need to recoup a trillion dollars (so far) just to break even [1].
VCs are already doubting whether the billions invested in data centers are going to generate a profit [1 and 2].
AI companies will need to generate profits at some point. Would people still be optimistic about Claude etc if they had to pay say $500 per month to use it given its current capabilities? Probably not.
So far the only company generating real profits out of AI is Nvidia.
Some of the revenues are very real. A few million subscriptions at tens to hundreds of dollars per month add up to non-trivial revenue pretty quickly. I think most software engineers will have such subscriptions fairly soon. We're talking about a market where companies drop $5-20K/month on software engineers and hire whole teams of them. Of course they are going to spend on this. Maybe not $500, but $20-100 is a lot less controversial.
And this is quickly spreading beyond software engineering. Software engineers are just being guinea pigs for agentic AIs eventually popping up in all sectors. Basically, while security and quality issues are being sorted out, it helps having users that are a bit more clued in about what they are doing.
That's why AI investments are so hot right now. Of course there are a lot of AI companies that will fall short. There always are. And companies like Nvidia that will make a lot of money selling GPUs.
But there is some non-trivial amount of revenue potential there. Anybody still in denial about that is probably penny-pinching.
I refuse to pay for any product where I'm the product. Run it locally, maybe I'll pay for it but never online.
My guess would also be that at the $100 price point only one company can be profitable, but that is just a very wild guess.
From what I've seen, software engineers are some of the last to adopt it. My marketing colleague has been swamped with generic AI crap produced by external writers for over a year now.
Right now Google, OpenAI, Anthropic, etc are playing a game of chicken. But even after that it's not clear if the remaining players will be able to generate a profit.
Even if AI companies can recoup the billions or trillions of dollars invested, how long will that take and what will be their margin after that?
What about the free open weights models then? And the open source tooling to go with them?
Sure, they are perhaps 6 months behind the closed-source models, and the hardware to run the biggest and best models isn't really consumer-grade yet (How many years could it be before regular people have GPUs with 200+ gigabytes vram? That's merely one order of magnitude away).
But they're already out there. They will only ever get better. And they will never disappear due to the company going out of business or investors raising prices.
I personally only care about the closed-source proprietary models insofar as they let me get a glimpse of what I'll soon have access to freely and privately on my own machine. Even if all of them went out of business today, LLMs would still have a permanent effect on our future and on how I'd be working.
This entire section reads like, oddly, the reverse of the "special pleading" argument that I usually see from artists. Instead of "Oh, it's fine for other fields, but for my field it's a horrible plagiarism machine", it's the reverse: "Oh, it's a problem for those other fields, but for my field get over it, you shouldn't care about copyright anyway".
I'm all for eliminating copyright. The day I can ignore the license on every single piece of proprietary software as I see fit, I'll be all for saying that AIs should be able to do the same. What I will continue to complain about is the asymmetry: individual developers don't get to violate individual licenses, but oh, if we have an AI slurp up millions of codebases and ignore their licenses, that's fine.
No. No, it isn't. If you want to ignore copyright, abolish it for everyone. If it still applies to everyone else, it should still apply to AIs. No special exceptions for mass-scale Open Source license violations.
I think where tptacek is right, though, is that if we're going to hold this position without hypocrisy, then we need to respect copyright as long as it exists. He's right that many of us have not done that; it's been very common to violate copyright for mere entertainment. If we want the licenses of our own work to be respected, then we need to extend that respect to others as well, regardless of the size of the copyright holder.
There are things that "modulate" this. Violating copyright is never right, of course, but scale and purpose are still relevant questions. Taking others' creative output, unlicensed, for large-scale commercial gain is about the worst case.
The whataboutism of that section was odd. The only non-handwavy argument presented is that, due to the scale of LLM training, models' output should be treated like US-specific copyright on typeface forms, i.e., non-applicable.
It's interesting as typeface plagiarism became rampant beginning in the 70s when more accurate photo reproductions made it trivial. This was problematic for designers wanting to make a livelihood, which is something ITC sought to mitigate by better up-front payments (IIRC from U&lc's coverage) to incentivize quality typeface creation.
There's a distinction though between literal plagiarism and just inspiration from elements. US copyright law doesn't protect either for typeface forms but ironically it does allow copyright for the code used in font files.
I've seen OpenAI's o3-mini (their reasoning model) suggest verbatim code and comments that I found on Github predating LLMs by years. It seems the more times the same code and comment appears online the more likely this is to occur. I'd imagine there would be studies looking into the scope and frequency this occurs and how much is considered fair use.
> Almost nothing it spits out for me merges without edits. I’m sure there’s a skill to getting a SOTA model to one-shot a feature-plus-merge!
How does this section fit in with the agent section just after? In an agentic model, isn't the merge getting done by either the model or a tool, and the retry-loops on failures would be mostly invisible?
E.g. when using Aider + Gemini Flash 2.5, probably 90% of the changes apply cleanly from my perspective (maybe half actually apply cleanly, the other half after a couple of roundtrips of Aider telling the model that the patch didn't apply). The 10% that only apply partially I usually throw away and redo the prompt, it's really rare that I start merging the code manually.
As a junior, I feel most complexity in software is manufactured. LLMs simplify that mess for me, making it easier to get things done. But I’m constantly hit with imposter syndrome, like I’m less skilled because I rely on AI to handle the tricky stuff. And Gemini is better than me!
A big problem is that you're either hearing breathless over the top insane hype (or doomerism, which is breathless over the top hype taken to a dark place) or skepticism that considers AI/LLMs to be in the same league as NFTs.
Neither of these is accurate, but I guess nuanced thinking or considering anything below surface vibes is out these days.
So far after playing with them I'm using them as:
1. A junior intern that can google really really fast and has memorized a large chunk of the Internet and the library, and can do rough first-pass research and dig for things.
2. Autocomplete 2.0 that can now generate things like boilerplate or fairly pedestrian unit tests.
3. Rubber duck debugging where the rubber duck talks back.
4. A helper to explain code, at least for a first pass. I can highlight a huge piece of code and ask it to summarize and then explain and walk me through it and it does a passable job. It doesn't get everything right but as long as you know that, it's a good way to break things down and get into it.
For those things it's pretty good, and it's definitely a lot of fun to play with.
I expect that it will get better. I don't expect it to replace programmers for anything but the most boring mindless tasks (the ones I hate doing), but I expect it to continue to become more and more useful as super-autocomplete and all the other things I listed.
> But AI is also incredibly — a word I use advisedly — important. It’s getting the same kind of attention that smart phones got in 2008, and not as much as the Internet got. That seems about right.
However, I just don't think the AI coding part is that interesting or forward-thinking. We're seeing so much more progress in semantic search, tool calling, general-purpose uses, robotics. I mean, DeepMind just won a Nobel, for goodness' sake.
Don't get me wrong, I use ChatGPT to write all kinds of annoying boilerplate, and it's not too bad at recalling weird quirks I don't remember (yes, even for Rust). But hard problems? Real problems? Zero shot. Novel problems? No way.
> But I’ve been first responder on an incident and fed 4o — not o4-mini, 4o — log transcripts, and watched it in seconds spot LVM metadata corruption issues on a host we’ve been complaining about for months.
I'm going to go ahead and press (X) to doubt on this anecdote. You've had an issue for months and the logs were somehow so arcane, so dense, so unparseable, no one spotted these "metadata corruption issues?" I'm not going to accuse anyone of blatant fabrication, but this is very hard to swallow.
Listen, I also think we're on the precipice of re-inventing how we talk to our machines, how we automate tasks, and how we find and distribute small nuggets of data. But, imo, coding just ain't it. Donald Knuth calls computer programming an art, and to rob humanity of practicing not just coding, but any art, I'd argue, would be the most cardinal of sins.
> I'm going to go ahead and press (X) to doubt on this anecdote. You've had an issue for months and the logs were somehow so arcane, so dense, so unparseable, no one spotted these "metadata corruption issues?" I'm not going to accuse anyone of blatant fabrication, but this is very hard to swallow.
I have this problem with a lot of LLM miracle anecdotes. There’s an implication that the LLM did something that was eluding people for months, but when you read more closely they don’t actually say they were working on the problem for months. Just that they were complaining about it for months.
>There’s an implication that the LLM did something that was eluding people for months, but when you read more closely they don’t actually say they were working on the problem for months. Just that they were complaining about it for months.
On the other hand, we've all probably had the experience of putting out a fire and wanting time to track down an issue only to be told to not bother since "everything is working now". Sometimes you spend months complaining about something because the people you're complaining to don't have the time to dive into an issue. Even if it would have taken mere hours for a human to hunt down the issue, someone still has to be given those hours to work on it. By contrast, copying a bunch of logs into an AI is nearly free.
> You've had an issue for months and the logs were somehow so arcane, so dense, so unparseable, no one spotted these "metadata corruption issues?" I'm not going to accuse anyone of blatant fabrication, but this is very hard to swallow.
Eh, I've worked on projects where, because of un-revisited logging decisions made in the past, 1-10k error logs PER MINUTE were normal. Finding the root cause of an issue often boiled down to multiple attempts at cleaning up logs to remove noise, cleaning up tangentially related issues, and waiting for it to happen again. More than one root cause was discovered by the sheer happenstance of looking at the right subset of the logs at the right moment in time. I can absolutely buy that a system built for parsing large amounts of text and teasing patterns out of it found in minutes what humans could not track down over months.
One point is missing for me: you get lazy. People are less and less equipped to think about complex problems the more they RELY on this for coding. Also, this article is just about coding.
Not sure why more people aren't mentioning this, but that's the exact reason I've stopped using any LLM stuff. Your thinking just gets lazier and lazier the more you use it. You stop thinking through problems deeply and thoroughly when the "answer" is just around the corner. Then one day you snap out of that mirage and see what a boatload of junk you've created with this stuff.
I also dread what's going to happen to the younger generations which have this to replace their real education, but that's a bit offtopic here.
My CEO friends that overstate the efficacy of LLMs are all nuts.
(I say this as someone who has been working in "AI" as part of digital transformation roadmaps for SEVEN years and as someone who has used LLMs daily in my workflow for 3 years.)
I have to say, my ability to learn Rust was massively accelerated via LLMs. I highly recommend them for learning a new skill. I feel I'm roughly at the point (largely sans LLMs) now where I can be nearly as productive in Rust as Python. +1 to RustRover as well, which I strongly prefer to any other IDE.
The interesting question is: is it really that bad at Rust, or does Rust's strict compiler just catch more errors that remain hidden in, let us say, Go? The usual hand-waving response is that developers should write more tests instead, as if a boring and tedious task such as writing tests will not be passed to the LLM.
I've been writing Rust code in production for 4+ years, and I can write Rust pretty well, and I've learned a lot from using chatgpt and co-pilot/cursor.
In particular, it helped me write my first generic functions and macros, two things that were pretty intimidating to try and get into.
How does anyone self-learning know they're learning the "right things and the best way to do things"? By putting the stuff they've learned into practice and putting it up against the real world. How many Rust tutorials out there teach things incorrectly, non-idiomatically, or just inefficiently? How does anyone not already an expert know, except by trying it out?
It is not bad at rust. I don't think I could even function well as a Rust programmer without chatgpt and now Cursor. It removes a lot of the burden of remembering how to write generic code and fixing borrow checking stuff. I can just write a generic function with tons of syntax errors and then tell cursor to fix it.
Not to derail, but NFT mania (part of the opening salvo in the article) was the giant shitshow that it was -not- because the concept of unique digital bits in the possession of a single owner was a bad idea (or the concept of unique verification of membership in a club was a bad idea). It was a diarrhea-slicked nightmare because it was implemented via blockchains and their related tokens, which inherently peg fluctuating fiat value to the underlying mechanisms of assigning and verifying said ownership or membership and encourage a reseller's market, not to mention the perverse, built-in economic incentives required to get nodes to participate in the network to make the whole thing go.
Had NFTs simply been deployed as some kind of protocol that could be leveraged for utility rather than speculation, I think the story would be a complete 180. No clue personally how to achieve that, but it feels like it could be done.. except that, too, would have been completely perverted and abused by centralized behemoths, leading to a different but terrible outcome. Can you imagine if all data became non-fungible? Convince all the big identity vendors (Google, Apple, etc) to issue key pairs to users that then get used by media companies to deliver audio and video keyed only to you that's embedded with maybe some kind of temporal steganographic signature that's hard to strip and can be traced back to your key? It's not just cracking AACS once and copying the bytes. It becomes this giant mess of you literally can't access anything without going through centralized authorities anymore. Then build more anti-patterns on top of that lol. Prolly better that it was mostly just monkey JPEGs and rug pulls.
Anyways, I'm so far off topic from what's actually being discussed -- just couldn't help myself from veering into left field.
> Fly.io builds on the work of a tremendous open source infrastructure community. We want open source authors to benefit from their work because we think a healthy, thriving open source ecosystem will help us build better products.
To the fine article:
> Meanwhile, software developers spot code fragments seemingly lifted from public repositories on Github and lose their shit. What about the licensing? If you’re a lawyer, I defer. But if you’re a software developer playing this card? Cut me a little slack as I ask you to shove this concern up your ass.
Imho, the fly.io people are largely performative HN attention seekers that practice writing posts to produce engagement. It's almost charming in the way it resembles the early-on blogosphere, and you should read them as token summaries of "this is a popular sentiment right now, framed to appear a bit wise and edgy". They're often wordy versions of "that one trick they don't want you to know" or "am I really the only one that [actually quite popular thing]".
In the sense that the posts distill whatever HN at the time would accept as thought leadership it's a good pulse check on the community.
There's a huge caveat I don't see mentioned often, which is that it depends on your programming language. E.g., AI is really good at writing Next.js/TypeScript apps, but not so much Ruby on Rails. YMMV.
I agree with this. People who are writing Python, Javascript, or Typescript tell me that they get great results. I've had good results using LLMs to flesh out complex SQL queries, but when I write Elixir code, what I get out of the LLM often doesn't even compile even when given function and type specs in the prompt. As the writer says, maybe I should be using an agent, but I'd rather understand the limits of the lower-level tools before adding other layers that I may not have access to.
My hunch is that to exploit LLMs one should lean on data-driven code more. LLMs seem to have a very easy time generating data literals. Then it's far less of an issue to write in a niche language.
I'm not familiar with Elixir, but I assume it's really good at expressing data-driven code, since it's functional and has pattern matching.
I think for some languages like Clojure and Elixir, it's just so easy to get to the level of abstraction you need for your business logic that everyone does so. So the codebases don't have much in common with each other, even when using the same framework or library.
But for Python, JS, etc., it's the same down-to-earth abstraction that everyone is dealing with, like the same open-a-file, parse-a-CSV, connect-to-the-database patterns.
I am finding the most destructive aspect of LLM assistance to be the loss of flow state.
Most of the time I can go faster than these tools if I have confidence in myself and allow the momentum to build up over the course of 20-30 minutes. Every time I tab out to an LLM is like a 5 minute penalty over what I could have done unaided on a good day.
Getting the model prepared to help you in a realistic domain often takes a few minutes of arranging code & comments so that it is forced toward something remotely sane. I'll scaffold out entire BS type hierarchies just so I can throw a //TODO: .... line in the middle somewhere. Without this kind of structure, I would be handling unfiltered garbage most of the time.
It's not that these tools are bad, it's that we need to recognize the true cost of engaging with them. ChatGPT is like a jackhammer. It will absolutely get you through that concrete slab. However, it tends to be quite obnoxious & distracting in terms of its operational principles.
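As an invented illustration of that scaffolding trick (shown here in Python with a # TODO rather than a //TODO): spell out the types and signatures yourself and leave the model exactly one hole to fill.

    from dataclasses import dataclass

    @dataclass
    class LineItem:
        sku: str
        quantity: int
        unit_price: float

    @dataclass
    class Invoice:
        customer_id: str
        items: list[LineItem]

    def total_with_discount(invoice: Invoice, discount_rate: float) -> float:
        # TODO: sum quantity * unit_price over items, then apply discount_rate
        ...

With the surrounding structure pinned down like this, the completion has far less room to wander than a bare prompt would allow.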
I find the opposite, I can stay in flow state much longer than I could before because tabbing out to an LLM is a tighter loop than me going out to google and searching through sites for smaller tidbits that I have to assemble and get through, rather than a full solution that might not be 100% correct, but most of the time I can get to the working solution quicker.
I like Thomas, but I find his arguments include the same fundamental mistake I see made elsewhere. He acknowledged that the tools need an expert to use properly, and as he illustrated, he refined his expertise over many years. He is of the first and last generation of experienced programmers who learned without LLM assistance. How is someone just coming out of school going to get the encouragement and space to independently develop the experience they need to break out of the "vibe coding" phase? I can almost anticipate an interjection along the lines of "well we used to build everything with our hands and now we have tools etc, it's just different" but this is an order of magnitude different. This is asking a robot to design and assemble a shed for you, and you never even see the saw, nails, and hammer being used, let alone understand enough about how the different materials interact to get much more than a "vibe" for how much weight the roof might support.
I think the main difference between shortcuts like "compilers" and shortcuts like "LLMs" is determinism. I don't need to know assembly because I use a compiler that is very well specified, often mathematically proven to introduce no errors, and errs on the side of caution unless specifically told otherwise.
On the other hand, LLMs are highly nondeterministic. They often produce correct output for simple things, but that's because those things are simple enough that we trust the probability of it being incorrect is implausibly low. But there's no guarantee that they won't get them wrong. For more complicated things, LLMs are terrible and need very well specified guardrails. They will bounce around inside those guardrails until they make something correct, but that's more of a happy accident than a mathematical guarantee.
LLMs aren't a level of abstraction, they are an independent entity. They're the equivalent of a junior coder who has no long term memory and thus needs to write everything down and you just have to hope that they don't forget to write something down and hope that some deterministic automated test will catch them if they do forget.
If you could hire an unpaid intern with long term memory loss, would you?
Determinism is only one part of it: predictability and the ability to model what it’s doing is perhaps more important.
The physics engine in the game Trackmania is deterministic: this means that you can replay the same inputs and get the same output; but it doesn’t mean the output always makes sense: if you drive into a wall in a particular way, you can trigger what’s called an uberbug, where your car gets flung in a somewhat random direction at implausibly high speed. (This sort of thing can lead to fun tool-assisted speedruns that are utterly unviable for humans.)
The abstractions part you mention, there's the key. Good abstractions make behaviour predictable. Turn the steering wheel to the left, head left. There are still odd occasions when I will mispredict what some code in a language like Rust, Python, or JavaScript will do, but they're rare. By contrast, LLMs are very unpredictable, and you will fundamentally never be able to mentally model what they achieve.
Exactly. Compilers etc. are like well-proven algebraic properties, you can build on them and reason with them and do higher level math with confidence. That's a very different type of "advancement" than what we're seeing with LLMs.
> If you could hire an unpaid intern with long term memory loss, would you?
It's clearly a deficiency. And that's why one of the next generations of AIs will have long term memory and online learning. Although even the current generation of the models shows signs of self-correction that somewhat mitigate the "random walk" you've mentioned.
There’s a really common cognitive fallacy of “the consequences of that are something I don’t like, therefore it’s wrong”.
It’s like reductio ad absurdum, but without the logical consequence of the argument being incorrect, just bad.
You see it all the time, especially when it comes to predictions. The whole point of this article is coding agents are powerful and the arguments against this are generally weak and ill-informed. Coding agents having a negative impact on skill growth of new developers isn’t a “fundamental mistake” at all.
What I’ve been saying to my friends for the last couple of months has been, that we’re not going to see coding jobs go away, but we’re going to run into a situation where it’s harder to grow junior engineers into senior engineers because the LLMs will be doing all the work of figuring out why it isn’t working.
This will IMO lead to a “COBOL problem” where there are a shortage of people with truly deep understanding of how it all fits together and who can figure out the line of code to tweak to fix that ops problem that’s causing your production outage.
I’m not arguing for or against LLMs, just trying to look down the road to consequences. Agentic coding is going to become a daily part of every developer’s workflow; by next year it will be table stakes - as the article said, if you’re not already doing it, you’re standing still: if you’re a 10x developer now, you’ll be a 0.8x developer next year, and if you’re a 1x developer now, without agentic coding you’ll be a 0.1x developer.
It’s not hype; it’s just recognition of the dramatic increase in productivity that is happening right now.
> How is someone just coming out of school going to get the encouragement and space to independently develop the experience they need to break out of the "vibe coding" phase?
LLMs are so-so coders but incredible teachers. Today's students get the benefit of copying and pasting a piece of code into an LLM and asking, "How does this work?"
There are a lot of young people who will use LLMs to be lazy. There are also a lot who will use them to feed their intellectual curiosity.
Many of the curious ones will be adversely affected.
When you're a college student, the stakes feel so high. You have to pass this class or else you'll have to delay graduation and spend thousands of dollars. You have to get this grade or else you lose your grant or scholarship. You want to absorb knowledge from this project (honestly! you really do) but you really need to spend that time studying for a different class's exam.
"I'm not lazy, I'm just overwhelmed!" says the student, and they're not wrong. But it's very easy for "I'm gonna slog through this project" to become "I'm gonna give it a try, then use AI to check my answer" and then "I'm gonna automate the tedious bits that aren't that valuable anyway" and then "Well I'll ask ChatGPT and then read its answer thoroughly and make sure I understand it" and then "I'll copy/paste the output but I get the general idea of what it's doing."
Is that what students will do, though? Or will they see the cynical pump and dump and take the shortcuts to get the piece of paper and pass the humiliation ritual of the interview process?
I'm hearing this fear more frequently, but I do not understand it. Curriculum will adapt. We are a curious and intelligent species. There will be more laypeople building things that used to require deep expertise. A lot of those things will be garbage. Specialists will remain valuable and in demand. The kids will still learn to write loops, use variables, about OOP and functional programming, how to write "hello world," to add styles, to accept input, etc. And they'll probably ask a model for help when they get stuck, and the teacher won't let them use that during a test. The models will be used in many ways, and for many things, but not all things; it will be normal and fine. Developing will be more productive and more fun, with less toil.
>How is someone just coming out of school going to get the encouragement and space to independently develop the experience they need to break out of the "vibe coding" phase?
Dunno. Money is probably going to be a huge incentive.
I see the same argument everywhere. Like animators getting their start tweening other people's content. AI is great at tweening and likely to replace farms of juniors. But companies will still need seniors to direct animation, so they will either have to pay a lot of money to find them or pay a lot of money to train them.
Well, this is actually happening in Japanese animation, and the result is that no young talent is getting trained in the workforce. [1]
But unlike animation, where the demand for the art can just disappear, I don't think the demand for software engineers will disappear. Same with musicians. Young engineers might just be jobless, or in training mode for a much longer period of time, before they can make an actual living.
The good thing is, as far as I know, Kyoto Animation managed to avoid this issue by having in-house training and growing their own talent pool.
Expecting commercial entities to engage in long-term thinking, when they could instead cut costs in the next financial quarter, is a fool's game.
I think what you've said is largely true, but not without a long period of mess in between.
> get the encouragement and space to independently develop the experience they need to break out of the "vibe coding" phase?
I wonder this too. As someone who is entirely self-taught, escaping "tutorial hell" was the hardest part of the journey when I started, and it took quite a bit of both encouragement and sheer willpower. I'm not sure I would ever have gotten beyond that if I'd had LLMs.
I worry for Juniors, and either we’ll need to find a way to mentor them past the vibe coding phase, or we hope that AI gets good enough before we all retire.
I wonder if that will make this the great generation of human coders. Some of our best writers came from the generation that spanned oral education and the mass production of books. Later generations read and wrote rather than memorized and spoke. I think that was Shakespeare's genius. Maybe our best coders will be supercharged with AI, and subsequent ones enfeebled by it.
Shakespeare was also popular because he was published as books became popular. Others copied him.
Quite a lot of the good programmers I have worked with may never have needed to write assembly, but are also not at all confused or daunted by it. They are curious about their abstractions, and have a strong grasp of what is going on beneath the curtain even if they don't have to lift it all that often.
Most of the people I work with, however, just understand the framework they are writing and display very little understanding or even curiosity as to what is going on beneath the first layer of abstraction. Typically this leaves them high and dry when debugging errors.
Anecdotally I see a lot more people with a shallow expertise believing the AI hype.
The difference is that the abstraction provided by compilers is much more robust. Not perfect: sometimes programmers legitimately need to drop into assembly to do various things. But those instances have been rare for decades and to a first approximation do not exist for the vast majority of enterprise code.
If AI gets to that level we will indeed have a sea change. But I think the current models, at least as far as I've seen, leave open to question whether they'll ever get there or not.
Agree. We'll get a new breed of programmer — not shitty ones — just different. And I am quite sure, at some point in their career, they'll drop down to some lower level and try to do things manually.... Or step through the code and figure out a clever way to tighten it up....
Or if I'm wrong about the last bit, maybe it never was important.
Counter-counterargument; You don't need to understand metalworking to use a hammer or nails, that's a different trade, though an important trade that someone else does need to understand in order for you to do your job.
If all of mankind lost all understanding of registers overnight, it'd still affect modern programming (eventually)
The abstraction over assembly language is solid; compilers very rarely (if at all) fail to translate high level code into the correct assembly code.
LLMs are nowhere near the level where you can have almost 100% assurance that they do what you want and expect, even with a lot of hand-holding. They are not even a leaky abstraction; they are an "abstraction" with gaping holes.
As a teen I used to play around with Core Wars, and my high school taught 8086 assembly. I think I got a decent grasp of it, enough to implement quicksort in 8086 while sitting through a very boring class, and test it in the simulator later.
I mean, probably few people ever need to use it for something serious, but that doesn't mean they don't understand it.
Feels like coding with and without autocomplete to me. At some point you are still going to need to understand what you are doing, even if your IDE gives you hints about what all the functions do.
Sure, it's a different level, but it's still more or less the same thing. I don't think you can expect to learn how to code by only ever using LLMs, just like you can't learn how to code by only ever using intellisense.
> I like Thomas, but I find his arguments include the same fundamental mistake I see made elsewhere
Some of the arguments in the article are so bizarre that I can’t believe they’re anything other than engagement bait.
Claiming that IP rights shouldn’t matter because some developers pirate TV shows? Blaming LLM hallucinations on the programming language?
I agree with the general sentiment of the article, but it feels like the author decided to go full ragebait/engagement bait mode with the article instead of trying to have a real discussion. It’s weird to see this language on a company blog.
I think he knows that he’s ignoring the more complex and nuanced debates about LLMs because that’s not what the article is about. It’s written in inflammatory style that sets up straw man talking points and then sort of knocks them down while giving weird excuses for why certain arguments should be ignored.
They are not engagement bait. That argument, in particular, survived multiple rounds of reviews with friends outside my team who do not fully agree with me about this stuff. It's a deeply sincere, and, I would say for myself, earned take on this.
A lot of people are misunderstanding the goal of the post, which is not necessarily to persuade them, but rather to disrupt a static, unproductive equilibrium of uninformed arguments about how this stuff works. The commentary I've read today has to my mind vindicated that premise.
>> Blaming LLM hallucinations on the programming language?
My favorite was suggesting that people select the programming language based on which ones LLMs are best at. People who need an LLM to write code might do that, but no experienced developer would. There are too many other legitimate considerations.
I am squarely in the bucket of AI skeptic—an old-school, code-craftsman type of personality, exactly the type of persona this article is framed against, and yet my read is nothing like yours. I believe he's hitting these talking points to be comprehensive, but with nothing approaching the importance and weightiness you are implying. For example:
> Claiming that IP rights shouldn’t matter because some developers pirate TV shows?
I didn't see him claiming that IP rights shouldn't matter, but rather that IP rights don't matter in the face of this type of progress, they never have since the industrial revolution. It's hypocritical (and ultimately ineffectual) for software people to get up on a high horse about that now just to protect their own jobs.
And lest you think he is an amoral capitalist, note the opening statement of the section: "Artificial intelligence is profoundly — and probably unfairly — threatening to visual artists in ways that might be hard to appreciate if you don’t work in the arts.", indicating that he does understand and empathize with the most material of harms that the AI revolution is bringing. Software engineers aren't on that same spectrum because the vast majority of programming is not artisanal creative work, it's about precise automation of something as cheaply as possible.
Or this one:
> Blaming LLM hallucinations on the programming language?
Was he "blaming"? Or was he just pointing out that LLMs are better at some languages than others? He even says:
> People say “LLMs can’t code” when what they really mean is “LLMs can’t write Rust”. Fair enough!
Which seems very truthy and in no way is blaming LLMs. Your interpretation is taking some kind of logical/ethical leap that is not present in the text (as far as I can tell).
I'm not a programmer now, but my real takeaway from working with developers and enterprises at various stages of adoption is this: AI/LLMs are not magic. The hype still exists, but fundamentally they are tools, and how enterprises use those tools is what sets them apart. Unfortunately the hype is amplified by the big players like MS and Goog, adding to the noise. Personally, I'd love to see quality and, most importantly, some real examples going beyond coding assistance, be it automating QA/QC or even some of the infra.
One other thing that might feed into skepticism is that, for the average person who does not actively choose to use LLMs, the most regular LLM interaction is likely to be with AI Overviews in Google search, which are generally a pretty unfortunate experience. For example my last search with an AI Overview, for interpretation of blood pressure reading, resulted in the output: "Systolic pressure (top number): 123 mmHg, which is within the normal range of less than 120 mmHg".
The fact that LLMs can generate this kind of output is not incompatible with the OP's argument about how useful they can be when combined with agents and code review. But search is the lowest-friction way for anyone to interact with LLMs currently, and right now it's a pretty poor advertisement.
I find the Konwinski Prize to be very interesting in this context. 1 million dollars to whoever's open source LLM solves >90% of a set of novel Github issues.
Currently, the #1 spot sits at a score of 0.09, not 0.9. A far cry from being useful. I know that open source models are not as good as closed source, but still, we're a long way from LLMs being good for code on their own.
And that supports OP's point - these tools aren't AGI, they produce trash that needs evaluation, but they're still useful.
Am I misunderstanding or are the models also limited to those that can be run with less than 96 gigs of VRAM?
The models that are both open source and quantized so that they can fit within that much memory are going to be significantly less capable than full scale frontier closed source models, I wonder how the latter would perform.
> Currently, the #1 spot sits at a score of 0.09, not 0.9. A far cry from being useful.
The best intellisense and code completion tools would solve 0.00. Those were the only tools we were using just a couple of years ago. 0.09 is a tremendous jump and the improvements will accelerate!
I mean, if you can solve 9% of Github issues automatically that's a fairly huge load of work you can automate. Then again you'd have to manually identify which 9% of issues.
I tried to take this article seriously, but it seems more like HN ragebait than an actual attempt at discussion. The engagement bait appears to be working, given all of the comments.
It’s like half of the arguments are designed as engagement bait with logical consistency being a distant concern:
> If hallucination matters to you, your programming language has let you down.
This doesn’t even make sense. LLMs hallucinate things beyond simple programming language constructs. I commonly deal with allusions to functions or library methods that would be great if they existed, but the LLM made it up on the spot.
The thing is, the author clearly must know this. Anyone who uses LLMs knows this. So why put such a bizarre claim in the article other than as engagement bait to make readers angry?
There are numerous other bizarre claims throughout the article, like waving away the IP rights argument because some programmers pirate TV shows? It’s all so bizarre.
I guess I shouldn’t be surprised to scroll to the bottom and see that the author is a HN comment section veteran, because this entire article feels like it started as a reasonable discussion point and then got twisted into Hacker News engagement bait for the company blog. And it’s working well, judging by the engagement counts.
> This doesn’t even make sense. LLMs hallucinate things beyond simple programming language constructs. I commonly deal with allusions to functions or library methods that would be great if they existed, but the LLM made it up on the spot.
I think the author's point is your language (and more generally the tooling around it) should make this obvious, and almost all the AI agents these days will minimally run linting tools and clean up lints (which would include methods and library imports that don't exist) if they don't actively attempt to compile and test the code they've written. So you as the end user should (almost) never be seeing these made up functions.
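As a rough illustration of that feedback step (this is only a sketch, not anything from the article): an agent just needs to shell out to the project's build or lint command after each edit and hand the diagnostics back to the model. The choice of `go build` below is an assumption; any compiler or linter command works the same way, and `build_diagnostics` is a hypothetical helper name.

    import subprocess

    def build_diagnostics(repo_dir: str) -> str:
        # Compile/lint the code the agent just wrote and capture the diagnostics.
        # A made-up function surfaces here as e.g. "undefined: pkg.Frobnicate"
        # (a hypothetical name), and the agent feeds that text back into the
        # model's context so it can correct itself.
        result = subprocess.run(
            ["go", "build", "./..."],   # swap in `cargo check`, `tsc --noEmit`, a linter, etc.
            cwd=repo_dir,
            capture_output=True,
            text=True,
        )
        return result.stdout + result.stderr  # empty string means the build is clean

An "undefined: ..." error from the compiler is exactly the signal that turns a hallucinated function into an ordinary fix-it loop rather than something the human reviewer has to catch.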
Agree - it is written like clickbait, or worse, like a sponsored piece.
> But “hallucination” is the first thing developers bring up when someone suggests using LLMs, despite it being (more or less) a solved problem.
Really? What is the author smoking to consider it a solved problem? This statement alone invalidates the entire article in its casual irreverence for the truth.
I use Copilot every day, and I know where it shines. Please don't try to sell it to me with false advertising.
The article specifically says it's not talking about copilot, but talking about agents that verify the code compiles before they show it to you.
If it uses a function, then you can be sure that function is real.
Was this not clear? The explanation I'm paraphrasing is right in between the line Aurornis quoted and the line you quoted. Except for the crack at copilot that's up at the top.
> Often, it will drop you precisely at that golden moment where shit almost works, and development means tweaking code and immediately seeing things work better. That dopamine hit is why I code.
Only if you are familiar with the project/code. If not, you're thrown into a foreign codebase and have no idea how to tweak it.
This is one of the most obnoxious things I’ve ever read. “I’m right and every vague straw man I can think of is stupid and naive.” Great. Why did you write this?
I was skeptical at first, but after trying to use it several times for work (Cursor/Claude) I concluded that it's worse than I thought: a lot more time wasted for little to no results. Maybe it's good for boilerplate frontend code or TODO app generation or something?
Anyway in a couple of months I'll try again. Until that time, I guess the search result summaries are nice?
This really doesn't accord with my own experience. Using claude-code (esp. with opus 4) and codex (with o3) I've written lots of good Rust code. I've actually found Rust helps the AI-pair-programming experience because the agent gets such good, detailed feedback from the compiler that it can iterate very quickly and effectively.
Can it set up great architecture for a large, complex project from scratch? No, not yet. It can't do that in Ruby or Typescript either (though it might trick you by quickly getting something that kinda works in those languages). I think that will be a higher bar because of how Rust front-loads a lot of hard work, but I expect continuing improvement.
I agree, the latest models are not bad at Rust. Most issues I have when writing LLM-assisted Rust code are related to dependencies.
There are a lot of v0.x.x crates in Rust with frequent updates and breaking changes. As LLMs "learn" from code in the wild, they are exposed to different ways of using the same crate and thus fail to produce working code "from memory".
But as soon as you explain how to use it with the latest syntax, they are able to fix the code.
These systems do work, quite well, especially the latest release of Claude. Palantir is using AI to process your IRS information without your consent (https://archive.is/QAY9E). This should concern everybody. Not only can this information be used to determine specific account details; it can be used to target individuals who don't align with the current administration's agenda, as well as to exclude people from hiring based on political alignment. They could also use this data to exclude and discriminate against those opposed to Peter Thiel's network-state agenda, aka "Freedom Cities", but it seems everybody is complacent about that.
You know what, I can use the word “fuck” too, just like this pro coder does: I’m fucking tired of these fucking nuts who act as if there is no problem at all having AI mess with my code. It harms my relationship with my code! It reduces my agency. It threatens my accountability.
He implies that if I am talking about some other flavor of AI than the one he personally uses, then I am just not serious— but that’s what they said six months ago and two years ago. These fucking boosters have already blown their credibility.
If you want to prove that AI agents writing code for you is safe then perform a large scale test and show your results. We’re not crazy for doubting you. You are crazy for not doubting yourself.
> pull in arbitrary code from the tree, or from other trees online, into their context windows,
I guess this presupposes that it is ok for 3rd parties to slurp up your codebase? And possibly (I guess it ostensibly depends on what plan you are on?) using that source code for further training (and generating that same code for others)?
I imagine in some domains this would not be ok, but in others is not an issue.
It's so all-or-nothing this debate. If you're drawing a benefit from using AI tools, great. If you aren't, then maybe don't use them, or try some other approach to using them.
Personally I find AI coding tools situationally useful. I certainly wouldn't use them to write all my code, but I also think I'd be a fool not to leverage them at all.
Sometimes I feel like people who really like AI have a very different experience programming than I do.
They are constantly talking about AI doing all the tedious boilerplate bullshit. Don't get me wrong, some of my code is that too and it's not fun. However, the pro-AI people talk as if 80% of your day is dealing with that. For me it's simply a rare enough occurrence that the value proposition isn't that big. If that is the killer app of AI, it just doesn't sound that exciting to me.
When I see someone talk about the reams of boilerplate they’re getting the LLM to write for them, I really do wonder what godawful sounding tools and tech-stack they’re being subjected to.
Exactly. Back in the day, people talked about "design patterns". It took a while for (some of) the industry to recognize that "design patterns" are a sign that your libraries and tools aren't good enough, because you're having to write the same patterns repeatedly.
Unpopular opinion, boilerplate is good for you. It's a warmup before a marathon. Writing it can be contemplative and zen-like and allows you to consider the shape of the future.
> Meanwhile, software developers spot code fragments seemingly lifted from public repositories on Github and lose their shit. What about the licensing? If you’re a lawyer, I defer. But if you’re a software developer playing this card? Cut me a little slack as I ask you to shove this concern up your ass. No profession has demonstrated more contempt for intellectual property.
This kind of guilt-by-association play might be the most common fallacy in internet discourse. None of us are allowed to express outrage at the bulk export of GitHub repos with zero regard for their copyleft status because some members of the software engineering community are large-scale pirates? How is that a reasonable argument to make?
The most obvious problem with this is it's a faulty generalization. Many of us aren't building large-scale piracy sites of any sort. Many of us aren't bulk downloading media of any kind. The author has no clue whether the individual humans making the IP argument against AI are engaged in piracy, so this is an extremely weak way to reject that line of argument.
The second huge problem with this argument is that it assumes that support for IP rights is a blanket yes/no question, which it's obviously not. I can believe fervently that SciHub is a public good and Elsevier is evil and at the same time believe that copyleft licenses placed by a collective of developers on their work should be respected and GitHub was evil to steal their code. Indeed, these two ideas will probably occur together more often than not because they're both founded in the idea that IP law should be used to protect individuals from corporations rather than the other way around.
The author has some valid points, but dismissing this entire class of arguments so flippantly is intellectually lazy.
> The author has some valid points, but dismissing this entire class of arguments so flippantly is intellectually lazy.
Agree 100%. And generally programmers have a poor understanding of the law, especially common law as it applies in America (the country whose legal system most software licenses have been written to integrate with, especially copyleft principles).
American Common Law is an institution and continuity of practice dating back centuries. Everything written by jurists within that tradition, while highly technical, is nonetheless targeted at human readers who are expected to apply common sense and good faith in reading. Where programmers declare something in law insufficiently specified or technically a loophole, the answer is largely: this was written for humans to interpret using human reason, not for computers to compile using limited, literal algorithms.
Codes of law are not computer code and do not behave like computer code.
And following the latest AI boom, here is what the bust will look like:
1. Corporations and the state use AI models and tools in a collective attempt to obfuscate, diffuse, and avoid accountability. This responsibility two-step is happening now.
2. When bad things happen (e.g. a self-driving car kills someone, predictive algorithms result in discriminatory policy, vibe coding results in data leaks and/or cyberattacks), there will be litigation that follows the bad things.
3. The judges overseeing the litigation will not accept that AI has somehow magically diffused and obfuscated all liability out of existence. They will look at the parties at hand, look at relevant precedents, pick out accountable humans, and fine them or---if the bad is bad enough---throw them in cages.
4. Other companies will then look at the fines and the caged humans, and will roll back their AI tools in a panic while they re-discover the humans they need to make accountable, and in so doing fill those humans back in on all the details they pawned off on AI tools.
The AI tools will survive, but in a role that is circumscribed by human accountability. This is how common law has worked for centuries. Most of the strange technicalities of our legal system are in fact immune reactions to attempts made by humans across the centuries to avoid accountability or exploit the system. The law may not be fast, but it will grow an immune response to AI tools and life will go on.
It's not just "guilt-by-association". It is a much worse reactionary general argument. It can be applied to any kind of moral problem to preserve the status quo.
If this was a legitimate moral argument, we'd never make any social progress.
That whole section seems so out of place. I don't know why he thinks "The median dev thinks Star Wars and Daft Punk are a public commons" either. I don't know why he thinks the entire software engineering profession is about enabling piracy. I suspect Netflix has more software engineers doing the opposite than every piracy service employs combined.
It's not just lazy, it's nonsense. The author is conflating piracy with plagiarism, even though the two are completely different issues.
Plagiarism is taking somebody else's work and claiming that you yourself created it. It is a form of deception, depriving another of credit while selling their accomplishments as your own.
Piracy on the other hand is the violation of a person's monopoly rights on distributing certain works. This may damage said person's livelihood, but the authorship remains clear.
I’m a free software developer and have been for over 25 years. I’ve worked at many of the usual places too and I enjoy and appreciate the different licenses used for software.
I’m also a filmmaker and married to a visual artist.
I don’t touch this stuff at all. It’s all AI slop to me. I don’t want to see it, I don’t want to work with it or use it.
Some people make these kinds of claims for ethical reasons, I get it. But be careful to not confuse one’s ethics with the current state of capability, which changes rapidly. Most people have a tendency to rationalize, and we have to constantly battle it.
Without knowing the commenter above, I’ll say this: don’t assume an individual boycott is necessarily effective. If one is motivated by ethics, I think it is morally required to find effective ways to engage to shape and nudge the future. It is important to know what you’re fighting for (and against). IP protection? Human dignity through work? Agency to affect one’s life? Other aspects? All are important.
> and at the same time believe that copyleft licenses placed by a collective of developers on their work should be respected and GitHub was evil to steal their code.
I think I missed a story? Is GitHub somehow stealing my code if I publish it there under GPL or similar? Or did they steal some specific bit of code in the past?
Copilot was trained on all public code on GitHub and in the early days it could be made to actually vomit code that was identical to its training data. They've added some safeguards to protect against the latter, but a lot of people are still sore at the idea that Copilot trained on the data in the first place.
> None of us are allowed to express outrage at the bulk export of GitHub repos with zero regard for their copyleft status because some members of the software engineering community are large-scale pirates?
I don't think that is an accurate representation of the tech community. On the other hand, I do think TFA is making a reasonable statistical representation of the tech community (rather than a "guilt-by-association" play) which could be rephrased as:
The overriding ethos in HN and tech communities has clearly been on the "information wants to be free" side. See: the widespread support of open source and, as your comment itself mentions, copyleft. Copyleft, in particular, is famously based on a subversion of intellectual property (cf "judo throw") to achieve an "information wants to be free" philosophy.
Unsurprisingly, this has also manifested countless times as condoning media piracy. Even today a very common sentiment is, "oh there are too many streaming services, where's my pirate hat yarrrr!"
Conversely, comments opposing media piracy are a vanishingly tiny, often downvoted, minority. As such, statistically speaking, TFA's evaluation of our communities seems to be spot on.
And, now the same communities are in an uproar when their information "wants to be free". The irony is definitely rich.
First, I don't agree that what you just said is at all reflective of what TFA actually wrote. Yours makes it about statistics not individuals. Statistical groups don't have an ass to shove anything up, so TFA pretty clearly was imagining specific people who hold a conflicting belief.
And for that reason, I think your version exposes the flaw even more thoroughly: you can't reasonably merge a data set of stats on people's opinions on AI with a data set of stats on people's opinions on IP in the way that you're proposing.
To throw out random numbers as an example of the flaw: If 55% of people on HN believe that IP protection for media should not exist and 55% believe that GitHub stole code, it's entirely possible that TFA's condemnation only applies to 10% of the total HN population that holds the supposedly conflicting belief even though HN "statistically" believes both things.
And that's before we get into the question of whether there's actually a conflict (there's not) and the question of whether anyone is accurately measuring the sentiment of the median HN user by dropping into various threads populated by what are often totally disjoint sets of users.
I keep talking to people who've had a good bit of success using Gemini or Claude to build quick prototype front ends for some applications. There are some questions in my head about how well the process scales when you want to keep adding features, but according to them it's not been hard getting it to make the needed changes.
My issue with it is that it gates software development behind paid services with various levels of context supported. Absolutely not the dream I have of how more software should be open source and everyone should be empowered to make the changes they need.
> Think of anything you wanted to build but didn’t. You tried to home in on some first steps. If you’d been in the limerent phase of a new programming language, you’d have started writing. But you weren’t, so you put it off, for a day, a year, or your whole career.
> I can feel my blood pressure rising thinking of all the bookkeeping and Googling and dependency drama of a new project. An LLM can be instructed to just figure all that shit out.
> Often, it will drop you precisely at that golden moment where shit almost works, and development means tweaking code and immediately seeing things work better. That dopamine hit is why I code.
This is me. Just yesterday I finished something I had been putting off for 2 years. I maintain 2 mobile phones, one because of work, and I needed a way to see SMS messages from the other phone. I wanted to write a simple Android app that reads incoming messages and saves them somewhere for me to use for OTPs and other stuff. For the longest time I wanted to build something like this for myself but kept putting it off because of all the weird issues I had been having with mobile development. A couple of days ago, since I had some time, I fired up Gemini and asked it to write me a simple Flutter app that saves the messages to Supabase, and voila, it was amazing. With some help and some tweaking I now have a fully functioning mobile app on Android and a small Tauri application on desktop that reads the messages, nicely formatted in a data table, without much coding on my part at all. Truly remarkable.
I was a bit skeptical at first too, but once I let AI dive into my code, run tests, refactor functions, and even catch its own bugs, it felt like I suddenly had a really competent teammate.
AI has been genuinely helpful for me: it saves time and lets me focus on the things that actually matter.
The author posits that people don't like using LLMs with Rust because LLMs aren't good with Rust, and that people would then migrate towards languages that do well with LLMs. However, if that were true, then Julia would be more popular, since LLMs do very well with it: https://www.stochasticlifestyle.com/chatgpt-performs-better-...
Does the linked study actually check that the LLM solves the task correctly, or just that the code runs and terminates without errors? I'm bad at reading, but the paper feels like it's saying the latter, which doesn't seem that useful.
I mean, just to steelman the argument, the "market" hasn't had time to react to what LLMs are good at, so your rebuttal falls flat. I think the original statement is more a prediction than a statement of current affairs.
Also, the author didn't say that "ease of use with LLMs" is the _only_ factor that matters. Julia could have other things wrong with it that prevent it from being adopted.
> but the plagiarism […] Cut me a little slack as I ask you to shove this concern up your ass. No profession has demonstrated more contempt for intellectual property.
Speeding is quite common too, yet if you get caught — especially overdoing it — you'll have a problem.
Also, in this case, presumably everything produced with AI is fair game too? The argument being made here isn't even "it's not plagiarism", rather "it's plagiarism but I don't care" — why would anyone else respect such an author's copyrights?
> Meanwhile, software developers spot code fragments seemingly lifted from public repositories on Github and lose their shit. What about the licensing? If you’re a lawyer, I defer. But if you’re a software developer playing this card? Cut me a little slack as I ask you to shove this concern up your ass. No profession has demonstrated more contempt for intellectual property.
Loved this style of writing in 2005 from Maddox on the best site in the universe or whatever.
Sorry if I don't want google and openAI stealing my or anyone else's work.
The problem is not the quality of the code; it simply almost never works. I just tried some models still in beta on the code of one of my C projects and, while some pieces were okay and could be used as is, some were complete garbage and wouldn't even compile. Maybe in a few years this will no longer be a problem, but right now they are useless. AI-assisted research and internet search are a different thing and work pretty well already.
"If hallucination matters to you, your programming language has let you down."
But every time I try this, this is exactly what happens. I've tried the same thing on multiple agents and they all make up some magic API that doesn't exist, even when I've used the right thing elsewhere in the code to do the same job, with that function commented as such. Asking them to use that always fails.
This seems a massive flaw to me that is just hand waved away here.
I have been using agentic AI to help me get started writing an OpenGL-targeted game from scratch (no engine). I have almost no background experience with computer graphics code, but I understand most of the fundamentals pretty well and I have almost 13 years of software experience. It's just that the exact syntax as well as the various techniques used to address common problems are not in my arsenal yet.
My experience has been decent. I don't know that it has truly saved me much time but I can understand how it FEELS like it has. Because it's writing so much code (sometimes), it's hard to vet all of it and it can introduce subtle bugs based on faulty assumptions it made about different things. So, it will dump a lot of code at once, which will get me 90% of the way there, but I could spend an hour or two trying to nudge it to fix it to get it to 100%. And then I will probably still need to go back and reorganize it, or have it go back and reorganize it. And then sometimes it will make small adjustments to existing, committed code that will subtly break other things.
Something that has surprised me (in hindsight, it isn't surprising) is that sometimes when I feel like it misunderstood something or made a faulty assumption, it was actually me that had the misunderstanding or ignorance which is humbling at times and a good learning experience. It is also pretty good at bug hunting and DEFINITELY very good at writing unit tests.
I count myself as pretty lucky that this domain seems to be very well covered in training. Given the law of averages, most people's domains will probably be covered. I'm not sure how it would fare with a niche domain.
This article feels incredibly defensive. If you really have a technique that makes you 100x, 50x, or even just 2x more productive, you don’t need to write an article calling people who don’t agree with you nuts.
You keep using that tool, to your advantage. If you’re really altruistic, you post some videos of how productive you can be, like DHH did with his blog-in-15-minutes videos.
If you’re really that much more productive, the skeptics won’t be able to keep up, and it should only take 6 months or so for that to become self-evident.
Yes, there are people who think LLMs are just a fad, just like NFTs, and I agree these people are not really serious and that they are wrong. I think anyone who has used an AI coding agent recently knows that they are highly capable and can enhance productivity in the right hands.
But, as someone who gets a lot of value in AI coding agents, my issue is not with gen AI as a productivity enhancing tool - it's with the absolute torrent of BS about how AI is soon going to make coders obsolete, and the way AI has been shoved onto many engineering teams is like yet another incarnation of the latest management fad. My specific arguments:
1. As the author pretty much acknowledges, AI agents still basically suck at large, system-wide "thinking" and changes. And the way they work with their general "guess and check" method means they can churn out code that is kinda sorta right, but often leaves huge holes or outright laughable bugs.
2. Hallucinations are the worst possible failure modes - they look correct, which makes it all the more difficult to determine they're actually bullshit. I shudder to think about who will need to maintain the mountains of "vibe code" that is now being generated. Certainly not fucking me; I had a good career but I think now is definitely the right time to peace out.
3. Even if I could totally agree that there is a strong business case for AI, I can still, as an individual, think it makes my job generally shittier, and there is nothing wrong with having that opinion.
I don't think I'd be so anti-AI if I saw a rational, cautious debate about how it can enhance productivity. But all I see are folks with a vested interest overselling its capabilities and minimizing its downsides, and it just feels really tiresome.
I use LLM as a super Stackoverflow. So far it has been helpful. For code gen, 8 out of 10 cases, the code won't compile and need fixing. LLM did help a lot in learning new topics. Gemini did help tremendously recently in pointing out a new approach to a problem I was stumped on. Though the code it generated was garbage, the code did provide insight on a new approach which I adopted to make it work eventually.
I can’t even get Copilot to autocomplete 5 working lines consistently. I spend hours every day arguing with ChatGPT about things it’s hallucinating. And agents? It took me a year to convince anyone to buy me a Copilot subscription. It’s not good enough now? But it was the bee’s knees just a year or two ago? I hate that the JS-framework tempo thing is happening to the part of the software world I’m in.
His arguments are mostly right I would say. I tend to use a bit of LLMs for coding - just Copilot though - and I do like how it quite often saves me time with autocomplete spanning entire lines.
ChatGPT is also pretty useful - especially for naming things.
However, on a global scale, I would compare the shift towards 'vibe coding' and massive LLM use to the shift we saw when the people doing the coding stopped knowing what Assembly even is. The current state of software is that it's incredibly bloated and slow (and I do love Python). That doesn't matter much since we've got massive improvements in hardware, but still. In 5 years, instead of not really realizing that we're writing slow and bloated code, we will not really realize we're writing incorrect code.
I don't know anything about web development, but in the environments I've seen during my career - where most code is quite critical and a stupid mistake, one that an LLM could easily hallucinate, can cost a lot - I don't see it replacing humans anytime soon. It's a pretty good and useful tool though.
I like LLMs, it's legitimately impressive to see people vibe code solutions (some times even to novel ideas). Agents are a much better use of LLMs than auto complete (which is terrible).
However I've tried coding agents, and well they suck... for me. Nearly all of the code I write these days is performance sensitive, as well as design sensitive (making abstractions for others to use), and not close to what could be considered boilerplate. The boilerplate I do write usually takes a trivial amount of time to write by hand. I'll keep trying whatever is new in AI every few months, I certainly don't want to be left behind, but so far that strategy hasn't left me optimistic.
> but the code is shitty, like that of a junior developer
It's worse than a junior. Juniors learn and grow, if they're a good junior it's often fairly rapid and self directed. Juniors can also ask for guidance or help, this is something maybe agents will be capable of in the future, but I could see it going pretty poorly for reasons tied to the first point about not learning. While LLMs contain a ton of information, more than a junior could, the junior is a better reasoning machine; I don't consider LLM "reasoning" to be anything close to the meatbag version.
> but the craft
Craft matters, it's important to enjoy and take pride in your work. If you enjoy managing AI agents that's fine, lots of programmers end up on the manager career path, but it's not for everyone.
I simply do not get this argument about LLMs writing tedious code or scaffolding. You don't need or want LLMs for that, you want libraries and frameworks.
I barely write any scaffolding code, because I use tools that setup the scaffolding for me.
If you're lucky to work in such an environment, more power to you. A lot of people have to deal with React where you need so much glue for basic tasks, and React isn't even the worst offender. Some boilerplate you can't wrap.
I use React at work, there is barely any boilerplate. I actually started a brand new project based on React recently and the initial setup before working on actual components was minutes.
> People coding with LLMs today use agents. Agents get to poke around your codebase on their own. They author files directly. They run tools.
I'll be damned if I give up control of my machine to a tool that hallucinates actions to take using hastily put together and likely AI-generated "agents". I still want to be the primary user of my machine, and if that means not using cutting edge tools invented in the last 6 months, so be it. I don't trust the vast majority of tools in this space anyway.
> I’m sure there are still environments where hallucination matters.
Still? The output being correct matters in _most_ environments, except maybe art and entertainment. It especially matters in programming, where a 99% correct program probably won't compile.
> But “hallucination” is the first thing developers bring up when someone suggests using LLMs, despite it being (more or less) a solved problem.
No, it's not. It's _the_ problem that's yet to be solved. And yet every AI company prefers chasing benchmarks, agents, or whatever the trend du jour is.
> I work mostly in Go. [...] LLMs kick ass generating it.
I also work mostly in Go. LLMs do an awful job generating it, just as with any other language. I've had the same shitty experience generating Go, as I've had generating JavaScript or HTML. I've heard this excuse that the language matters, and IME it's just not the case.
Sure, if you're working with an obscure and niche language for which there is less training data, I suppose that could be the case. But you're telling me that there is no good training data for Rust, the trendiest systems language of the past ~decade? C'mon. Comparing Rust to Brainfuck is comical.
I won't bother responding to all points in this article. I will say this: just as AI doomsayers and detractors deserve criticism, so does this over-the-top praising. Yes, LLMs are a great technology. But it is also part of a wildly overhyped market that will inevitably crash as we approach the trough of disillusionment. Their real value is somewhere in the middle.
Use of LLMs to program really doesn't make much sense. The very nature of computer software is that it's a small effort from a small number of people that is multiplied out to massive productivity gains for a large number of people. Unless the LLM is producing better output than the people would, you are saving time for a small number of people in exchange for degrading the experience of a much larger number of people. If your AI-coded app goes down for a day and you have to vibe-debug it, that is probably gonna put you in the red from a time perspective.
With the amount of supervision you need to give this stuff, I can't see it being useful for much more than auto-completing the next few lines of code as you type. You've got to read and go over the whole output carefully anyway, so you might as well have finer grained control over it.
It sounds like someone taking a condescending tone to say that "Fast fashion is great! The clothes are really fine, and being able to wear a t-shirt once and then throwing it away instead of washing it is more convenient. This is most definitely progress. Sustainability? Hmm I can't say, that's not my field of expertise, therefore it's probably not important".
I have a very simple counter argument: I've tried it and it's not useful. Maybe it is useful for you. Maybe even the things you're using it for are not trivial or better served by a different tool. That's fine, I don't mind you using a tool far away from my codebase and dependency tree. It has not been useful for me, and it's very unlikely it's ever going to be.
Except that's not the argument people are making. They are arguing it will replace humans. They are arguing it will do research level mathematics. They are arguing this is the start of AGI. So if you want to put your head in the sand and ignore the greater message that is plastered everywhere then perhaps some self reflection is warranted.
You have to learn to filter out the people who say "it's going to replace human experts" and listen to the people who say "I'm a human expert and this stuff is useful to me in these ways".
> I have a very simple counter argument: I've tried it and it's not useful. Maybe it is useful for you.
Indeed but the tedious naysaying that this is arguing against is that AI isn't good full stop. They aren't saying "I tried it and it's not for me but I can see why other people would like it".
I've only just realized how bad AI is for me, in a manner I didn't at all expect. I don't use it. I don't want to use it. I hate it. As another commenter here nicely put it, I want to do, not tell. I like writing code. I like solving problems. I like doing the work myself. I like struggling and then, the next day, finding that my capabilities have grown.
And yet I'm overwhelmingly demoralized and depressed about AI's encroachment on software engineering, so much so that I am finding it difficult to marshal cognitive resources. I'm finding it harder and harder to engage with coding problems, even though I love programming.
The last thing I expected was that AI, even kept at a distance, would rob me of the joy I find in one of the things I love most.
It so devalues what I do and my basic capabilities as a human being that I kind of want to die. I'll stick around for my daughter, my partner, and any future children, but I'm not optimistic at all about the future.
It's enough that I consider peacing out. Not out of life, mind, but out to the woods on a permanent basis. I want zero proximity to the cynicism that runs rampant through the industry because it's so much more difficult to discern useful hype from slop than I could have ever imagined. The bastards are winning.
I think another thing that comes out of not knowing the codebase is that you're mostly relegated to being a glorified tester.
Right now (for me) it's very frequent, depending on the type of project, but in the future it could be less frequent - but at some point you've gotta test what you're rolling out. I guess you can use another AI to do that, but I don't know...
Anyway, my current workflow is:
1. write detailed specs/prompt,
2. let agent loose,
3. pull down and test... usually something goes wrong.
3.1 converse with and ask agent to fix,
3.2 let agent loose again,
3.3 test again... if something goes wrong again:
3.3.1 ...
Sometimes the Agent gets lost in the fixes but now have a better idea what can go wrong and you can start over with a better initial prompt.
I haven't had a lot of success with pre-discussing (planning, PRDing) implementations; it worked, but not much better than directly prompting what I want, and it takes a lot longer. But I'm not usually doing "normal" stuff, as this is purely fun/exploratory side-project stuff, and my asks are usually complicated but not complex, if that makes sense.
I guess development is always a lot of testing, but this feels different. I click around but don't gain a lot of insight. It feels more shallow. I can write a new prompt and explain what's different but I haven't furthered my understanding much.
Also, not knowing the codebase, you might need a couple attempts at phrasing your ask just the right way. I probably had to ask my agent 5+ times, trying to explain in different ways how to translate phone IMU yaw/pitch/roll into translations of the screen projection. Sometimes it's surprisingly hard to explain what you want to happen when you don't know how it's implemented.
I think in a lot of cases the skeptics haven't really witnessed how good LLMs can be for a subset of software engineering tasks. I find for mid-complexity code they are far, far better than I am at coding. Even though I often tweak their solutions, they outweigh that by often identifying APIs and even whole libraries to call where I fully would have set about implementing it myself - either out of ignorance or an assumption that it would be too hard to implement it myself. They also put in much more error checking than I would normally do. Even if I do implement something myself, it's insane not to at least see what the LLM would do to get ideas. The thing is, once I do that, I nearly always come away with the conclusion that I might as well just use the code it made.
The primary skill a developer hones is the ability to know when something is already solved for you. This is where innate laziness is a gift. Why should I write a fetch function if api.fetch exists? This is a natural game time automatic decision, like a baseball player running for first base soon as the ball hits the bat. The natural thing that should be happening to every developer is the natural realization that the LLM is the ultimate api. I have literally sat here going “well why build anything, this damn thing does this and that for free”. For example, why build a todo app? Any of the LLMs can mostly do this. This is not self awareness, but game conditioning and fitness. You should automatically be feeling the sensation just like someone playing a sport for the millionth time.
Developers thinking AI is a fad is like a left fielder thinking about if they should stick their hand out to catch the ball or not. You can’t think about this.
Anyway, I find myself in stasis trying to actually see what to build given the reality of the power of this API. The South Park episode “Simpsons Already Did It” is apt here. We are very much at the point where the recurring question will be “Well, the LLM can do it”.
I read this article completely thru, twice, and I still have no idea what the author is trying to say. All I can read here is a bunch of sarcastic dismissals of stuff that I don't believe and have no way to evaluate.
I gave ChatGPT (the free desktop web version) my Ruby solution to one of my favorite Advent of Code challenges and asked it to rewrite it in Rust.
I've no experience in Rust except reading the first few chapters of the book, but with the help of the compiler I managed to fix it, and it gave the right answers 100x faster than Ruby did. So ChatGPT did a great job keeping the logic intact.
I would have done the same thing if LLMs didn't exist: write it in Ruby first and then in Rust. The flexibility and expressiveness of Ruby allows you to actually find a working solution. Experienced Rustaceans would do it directly in Rust, but writing in Ruby is easier, and an LLM can transform it.
What worries me about this is that the benefit of well practiced developers using AI to boost their productivity is there, but what happens to the new developers that piggyback off of LLM tooling too much and end up with worse understanding as a whole?
I think this has always been a problem; eg the students who do all their assignments in groups and coast through with a cursory understanding of their degree. Or those who cheat.
If someone doesn't want to learn, there have always been ways to avoid it. Now it's easier with LLMs, that's true. But for people who do want to learn, I think it's also easier now with LLMs.
I'm not sure where the equilibrium will land, in terms of whether we'll end up with more people with a deep understanding, or more people with a shallow understanding. Or maybe the ratio will end up being the same, but there will just be more people in both groups! Ie the ease of learning will pull some of the previously shallow learners into developing a deep understanding, and the ease of coasting will pull up people who were never into development before into being shallow learners.
Where are the results? I keep hearing about how great these coding agents are. What have you built beyond the complexity of a disposable toy app? Where are the swarms of new apps built by these agents that are solving new problems and taking over markets? I’m just not seeing results. I guess I need to spend some time using cursor as an agent to see it for myself.
I am a believer that these tools will be (if they aren’t already) extremely useful. Game changing in many ways, and I worry about the future of knowledge fields. Definitely not a skeptic in the trajectory. I do wonder if that makes those of us who are on the cutting edge more valuable as AI can quickly reproduce things that are well understood.
Please stop this. It’s clear to me that you’re not a software engineer. You weren’t one before AI, and you are not one after. You can do whatever you want, but please don’t push this AI narrative down everyones throat. I’ve had enough of this.
> Professional software developers are in the business of solving practical problems for people with code.
Are we? Is that our job? You might think it is, and it might even feel like it. But it is not. Our job, in 99% of cases, is to make stock price go up. That means making software that people will buy, making people want to keep buying it, and doing whatever makes your investors happy.
So what happens as you use these tools? Your job changes from coder to "LLM manager", but let's assume you're happy with that. You're more "productive", which is an artificial metric and has nothing to do with your real job as described above. That's fine for now, but what happens down the road? Now your company is paying more and more money to these LLM services, and paying you less and less.
> "but they take-rr jerbs" So does open source.
This is a gross misunderstanding of open source. Most open source developers are paid to be, full time, by companies. Why? Warm fuzzy feelings? Goodness of their heart? No. It's because it's good for business. These open source techs are a baseline; they're not worth competing on. Build the open base in the direction you want, and build the "business value" on top for profit.
> We used to pay good money for databases.
We still do. Have you looked at managed PostgreSQL prices lately? Also if you haven't noticed, Oracle hasn't gone anywhere.
> LLMs really might displace many software developers. That’s not a high horse we get to ride. Our jobs are just as much in tech’s line of fire as everybody else’s have been for the last 3 decades.
This isn't the same thing as replacing punch cards, or moving away from Java 6, or replacing artisans with factory machines. This is centralisation and rent-seeking. As companies become more and more dependent on a handful of companies with the upfront capital to build massive server farms... what do you think is going to happen? Once they've convinced enough people that LLMs are essential to their work, it's going to cost a lot more than $20/month. Once these companies are entrenched they will wring every last dollar out of every company that's grown to depend on them; many will simply fold.
Maybe we are in a seismic shift in our industry. But the beneficiaries are not going to be whole-hearted LLM adopters, no amount of saying "you need to change how you work" is going to save any of us. The only people that will benefit are going to be the people that own the data centres.
> If you were trying and failing to use an LLM for code 6 months ago †, you’re not doing what most serious LLM-assisted coders are doing.
Translation: Drop your work and current tools every few months to pick up the shiny new thing or you aren't cool or even sane.
Reality: Your favorite cool new tool will be integrated into development work as it is proven, documented, and time allows. This idea that development must leap onto the latest shiny new thing every few months is extremely corrosive to reasonable business practice. The needs of the many outweigh the preferences of the lead developer.
Maybe it's only me, but I just don't write that much code. I try to change less than 100ish lines per day. I try to keep codebases small. I don't want to run a codebase with hundreds of thousands of lines of code in a production environment.
This post feels quite dishonest. Every concern is countered with some variation of ”you’re wrong and I am right”, zero nuance.
I use Claude for specific questions, it works great (but apparently I’m doing it wrong according to the author), code generation doesn’t, I don’t want my boilerplate code to have subtle differences in every file, that’s the worst kind of boilerplate and that’s what I get if I automate it.
There are real world implications of all the infrastructure being built for this too, not to mention the extremely short e-waste cycles during the boom we’re in right now.
To me it also seems like LLMs are the wrong thing to solve for, it’s like a gasoline car, make the engine larger and more fuel consuming to go faster and no matter how much you perfect it, it will still be fossil fuel being burnt at a high rate. What’s the alternative, I don’t think we know yet, but if we were a smarter species we would see that this is not it.
I found this comment especially interesting: "The code in an agent that actually “does stuff” with code is not, itself, AI. This should reassure you. It’s surprisingly simple systems code, wired to ground truth about programming in the same way a Makefile is. You could write an effective coding agent in a weekend. Its strengths would have more to do with how you think about and structure builds and linting and test harnesses than with how advanced o3 or Sonnet have become."
I haven't used coding agents. Can someone who has (or knows about these things) elaborate, or point me to a good resource? Thanks!
It's a while loop. You collect a ton of context (file paths, function signatures, etc.), provide a bunch of functions to the LLM that it can ask to be called and an objective ("Implement X", "figure out X"). Then you send this to the LLM, perform the function calls it requests ("open file", "edit file", "call service X") and append their output to the payload and send it back to the LLM. Rinse and repeat until objective is complete (in which case the LLM may request to call a work_complete function/tool).
That's really it. Stuff the context, loop over the LLM until work is done.
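For anyone who wants that spelled out, here is a minimal sketch of the loop being described, under the assumptions that `llm` is some callable that returns a structured tool request and that the tool names and message format are placeholders rather than any particular vendor's API:

    import json

    def run_agent(llm, tools, objective, context):
        # Minimal agent loop: stuff the context, let the model pick a tool,
        # run it, append the result, repeat until the model says it's done.
        messages = [
            {"role": "system", "content": "You are a coding agent. " + context},
            {"role": "user", "content": objective},
        ]
        while True:
            step = llm(messages, tools=list(tools))          # one round trip to the model
            messages.append({"role": "assistant", "content": json.dumps(step)})
            if step["action"] == "work_complete":            # model declares the objective met
                return step["summary"]
            result = tools[step["action"]](**step["args"])   # e.g. open_file, edit_file, run_tests
            messages.append({"role": "tool", "content": result})  # ground truth goes back in

Everything that makes real agents good lives in the tools and in how the context is assembled, not in this loop, which is the parent comment's point.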
I can sum it up like this: even if I knew in advance the exact right thing to build, producing the physical code has not, for a long time, been the bottleneck. I've been vibe coding since long before it was cool; it's sometimes called model-driven development.
For those who think only procedurally, I can see how it helps them, because procedural-first development has a lot of boilerplate logic.
For those who think model first, the AI may help them rubber duck, but ultimately the physical writing of the characters is minimal.
Most of my time is thinking about the data model. The AI writes almost all of my procedures against said data model. But that is about 20% speedup.
This: "Important caveat: I’m discussing only the implications of LLMs for software development. For art, music, and writing? I got nothing. I’m inclined to believe the skeptics in those fields. I just don’t believe them about mine". I want to emphasize this. AI seems like a great 'first draft', but not a 'finished product'. In the hands of a programmer, a first draft is a great start. But pushing AI to frontline workers or customer support fails, as Klarna learned, because a first draft is not good enough.
When I ask ChatGPT to generate the code for an M4F MCU that implements the VirtIO code for accessing the GPIO through RPMSG using FreeRTOS, it produces two answers that are both incomplete and incorrect.
This is very outside my domain :P I asked ChatGPT to explain the acronyms in your comment and still do not understand it. But I think a reason LLMs might struggle is that there are too many up front complex requirements.
Giving it only 1 or 2 complex requirements at a time, and then having it iterate, I've found to be more effective.
Most LLMs don't "think", so when asking an LLM something, I generally try to think "would I be able to do this without thinking, if I had all the knowledge, but just had to start typing and go?".
I think it broke things down in a weird way, but I definitely can't analyse the correctness of anything it outputs in this domain :P
Coding specific agents like Copilot might be better able to handle a complex initial prompt, since they take the initial prompt, and use LLMs to break it down into smaller steps, which ChatGPT doesn't do. They can sort of "think". Deep research AIs also have a sort of thinking too, so might do better.
I did follow your suggestion to break it down into separate prompts, but that too did not result in any useful code. I did not try to compile it, but I get the impression that it does not even compile. There is nothing remotely related to the VirtIO protocol in the code. See Section 5.18 of https://docs.oasis-open.org/virtio/virtio/v1.3/csd01/virtio-...
When I added the prompt: 'Could you use the specification given in Section 5.18 of https://docs.oasis-open.org/virtio/virtio/v1.3/csd01/virtio-...' it produced almost the same code preceded with some babbling from the document, but not using anything from the specification, not even the code fragments mentioned in the section.
I was just giving it as an example of how there are domains where AI assistance has little or no benefit. Even if you use these tools for searching for resources or instructions on how to achieve something, they come up with nothing useful, or lead you in a direction you do not want to go. It is just like searching with Google: if you search for the thing that 90% of people are searching for, your answer will be on the first page. If you search for something outside that, you have to skip many pages, or may not be able to find it at all.
One could also conclude that a large portion of software engineering is mostly the implementation of things that have been implemented many times before, and that only a small portion consists of real software engineering, where you have to develop code for a problem nobody has solved before or that requires a deep understanding of the problem domain.
I wonder if the "lack of data" problem is solvable within LLMs. Humans can apply general programming-language knowledge when using a new language, but LLMs need lots of concrete source code to work well.
Today these models fall apart very quickly once you step out of js/python/c land to rust/zig/ocaml.
With today's LLM tech, feeding the model language documentation does not work great.
What this boils down to is an argument for slop. Yeah, who cares about the quality, the mediocrity, the craft... get the slop, push it in, call it done. It mostly works in the golden path, it's about 6 or 7 orders of magnitude slower than hand-written software but that's ok, just buy more AWS resources, bill the client, whatever.
I can maybe even see that point in some niches, like outsourcing or contracting where you really can't be bothered to care about what you leave behind after the contract is done but holy shit, this is how we end up with slow and buggy crap that no one can maintain.
>Yeah, who cares about the quality, the mediocrity, the craft..
Just about no-one in the F100 unless they are on very special teams.
If you care about the craft, you're pushed out for someone who drops 10x the LOC a day, because your management has no ability to measure what good software is. Extra bonus points for including 4GB of node_modules in your application.
It's not much different without the AI. Managers don't care about efficient code, they care about code that meets the business goals - whether that's good or bad is debatable. Agencies duct-taping together throwaway code isn't new. The classic "just buy more AWS resources" & such have been around for quite a while.
Even without AI, software has been getting worse every year. My question is: will AI resolve that problem, or will it make it worse?
My feeling is that AI will keep the software industry in the same trend: produce more software of lower quality while being more profitable.
People sometimes say "perfect is the enemy of good". But this implies that putting in less effort is "good enough". I believe it goes the other way: why would a company put effort into selling you something good, if it can make you pay the same for something worse? It's all about maximising profit, and AI is no exception there.
"If you’re making requests on a ChatGPT page and then pasting the resulting (broken) code into your editor, you’re not doing what the AI boosters are doing. No wonder you’re talking past each other."
This is in fact what I am doing (though I am not trying to convince anyone not to use LLMs). How can I get into this other mode? Any pointers? Do I need to pay up for a service? I do not want to carry out vibe coding, I want to use LLMs to solve physics, math and logic problems/puzzles. I'd appreciate any recommendations.
Just have a look at Cursor.ai or zed.dev. Cursor is based on VS Code and integrates different models right into your codebase. It has access to your codebase, knows your code (after indexing) and is able to directly apply changes to your code or run commands on your integrated terminal. You can directly accept/decline changes. You can reference existing code / files. You can link documentation, you can write a default prompt and default settings (previously .cursorrules).
It's faster than opening ChatGPT in the browser, writing your prompt, waiting for the reply, scanning and scrolling through it, copying it, inserting it manually, and going back to say something doesn't work.
I wonder if it comes down to there being many kinds of programming and this stuff is good for some but not all.
The author describes a world where most code is boring grunt work and most collaborators are mediocre. This world is not familiar to me. I can see LLMs being useful in such a world, but it’s not where I work.
The comment about buying a table instead of building it really stuck with me. We’re the ones who design the table! Saying we should buy one from the store is like saying farmers should buy their food from the store instead of putting in all that work to grow it.
I've been using LLMs pretty regularly for just over a year (my PyCharm AI license just renewed). My experience before that was pretty mixed and fell into the "sort of worth it" category.
It's improved to the point where the summary that I would email a junior developer is a pretty good prompt that produces code pretty close to what I want.
I had a weird project recently where I needed a standalone event handler that would send an email receipt to the customer, send another email to customer service, and log the event in an accounting system via an API. I wrote up a proposal for the rest of the management team and, once accepted, pasted the proposal into ChatGPT (and a few others, but ChatGPT gave the best results).
I basically got a full recipe for setting up AWS API Gateway, Lambda, and SES, the Python Lambda function with api calls to SES, suggestions on using AWS SAM, etc. etc. Tweaked the prompt a couple of times and then took over from there. I had this into the hands of the testing team in less than a day.
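For readers who have not wired this stack together before, the Lambda side of such a recipe is small enough to sketch. This is a hedged illustration, not the code that actually shipped: the event shape, addresses, and the accounting-system endpoint are hypothetical.

    # Hypothetical Lambda handler: receipt email, internal notification,
    # and a call out to an accounting API. Assumes API Gateway proxy events
    # and SES send permissions on the function's role.
    import json
    import urllib.request
    import boto3

    ses = boto3.client("ses")

    def handler(event, context):
        order = json.loads(event["body"])  # assumed API Gateway proxy payload

        def send(to, subject, body):
            ses.send_email(
                Source="receipts@example.com",
                Destination={"ToAddresses": [to]},
                Message={"Subject": {"Data": subject},
                         "Body": {"Text": {"Data": body}}},
            )

        # 1. Receipt to the customer
        send(order["customer_email"],
             f"Receipt for order {order['order_id']}",
             f"Thanks for your order. Total: {order['total']}")

        # 2. Notification to customer service
        send("support@example.com",
             f"New order {order['order_id']}",
             json.dumps(order, indent=2))

        # 3. Log the event in the accounting system (hypothetical endpoint)
        req = urllib.request.Request(
            "https://accounting.example.com/api/events",
            data=json.dumps(order).encode(),
            headers={"Content-Type": "application/json"},
        )
        urllib.request.urlopen(req)

        return {"statusCode": 200, "body": json.dumps({"ok": True})}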
I'm pretty sure this would reduce my team size in ANY role I've had in my career. Maybe not halve the teams, but it would eliminate a lot of the junior or journeymen developers.
LLM fanatics drive me nuts. Instead of learned helplessness, at least try to understand what the token predictor outputs, and be honest when you used an LLM to generate the code dump you're asking for help with. Otherwise you're wasting everyone's time and acting as a middleman between a machine and a person who expected an interaction with an actual human being.
A well articulated blog, imo. Touches on all the points I see argued about on LinkedIn all the time.
I think leveling things out at the beginning is important. For instance, I recently talked to a senior engineer who said "using AI to write programming is so useless", but then said they'd never heard of Cursor. Which is fine - but I so often see strong vocal stances against using AI tools but then referring to early Copilot days or just ChatGPT as their experience, and the game has changed so much since then.
At 0 temperature an LLM is a Map<String,String> - a string input (key) will give you the same string output (value) every time. Hypothesis: there exists a key whose value is a complete, working, fully-tested application which meets your requirements 100% and fulfills your business need. This key is the smallest, most complete description of what your application does. It is written in natural language and represents a significantly compressed version of your application code.
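A toy way to picture that claim (setting aside that real inference stacks are not always perfectly deterministic even at temperature 0): treat the prompt as a key and memoize the completion. `complete` below is a hypothetical deterministic model call, not a real API.

    # Toy illustration of the Map<String,String> framing: with temperature 0,
    # prompt in -> same text out, so the call can be cached like a pure function.
    from functools import lru_cache

    def complete(prompt: str) -> str:
        """Hypothetical: call a model at temperature=0 and return its text."""
        raise NotImplementedError

    @lru_cache(maxsize=None)
    def lookup(key: str) -> str:
        # The "magic key" is the smallest natural-language spec whose value
        # is the working application you wanted.
        return complete(key)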
My part-time obsession over the last few months has been trying to demonstrate this and come up with a method for finding these magic keys (I even tried to get the LLMs to search for me, lol). What I really want is to give the latest thinking models (200k input, 100k output) a 5-6 page design doc (4k words, 5k tokens) and have them produce a complete 5kloc (50k tokens) microservice, which would show a 10x compression. It's hard, but I haven't seen any reason to think it wouldn't work.
For better or worse, I think this will be close to what IC jobs will be like in a few years. Fundamentally, our job is to work with other functions to agree on some system that needs to exist, then talk to the computers to actually implement it. If we swap kotlin+compiler for design doc+LLM, it's still going to be somewhat the same, but far more productive. Agents and such are somewhat of a stop-gap measure: you don't want people handing tasks to machines, you want to accurately describe some idea and then let the computers make it work. You can change your description, and they can also figure out their own tasks to evolve the implementation.
Damn. Well, I'll spend a few bucks trying it out, and I'll ask my employer if they're okay with me using agents on company time.
But I'm not thrilled about centralized, paid tools. I came into software during a huge FOSS boom. Like a huge do it yourself, host it yourself, Publish Own Site, Syndicate Elsewhere, all the power to all the people, borderline anarchist communist boom.
I don't want it to be like other industries where you have to buy a dog shit EMR and buy a dog shit CAD license and buy a dog shit tax prep license.
Maybe I lived through the whale fall and Moloch is catching us. I just don't like it. I rage against dying lights as a hobby.
It's unfortunate that AMD isn't in on the AI stuff, because they are releasing a 96GB card ($10k so it's pricey currently) which would drop the number you need.
I mean, it depends on the model; some people running DeepSeek report better performance at home running on a CPU with lots of RAM (think a few hundred gigabytes). Even when running locally, VRAM is more relevant than the raw performance of the GPU. That said, I'm really not the person to ask about this, as I don't have AI agents running amok on my machine yet.
You can self host an open-weights LLM. Some of the AI-powered IDEs are open source. It does take a little more work than just using VSCode + Copilot, but that's always been the case for FOSS.
An important note is that the models you can host at home (e.g. without buying ten(s of) thousand dollar rigs) won't be as effective as the proprietary models. A realistic size limit is around 32 billion parameters with quantisation, which will fit on a 24GB GPU or a sufficiently large MBP. These models are roughly on par with the original GPT-4 - that is, they will generate snippets, but they won't pull off the magic that Claude in an agentic IDE can do. (There's the recent Devstral model, but that requires a specific harness, so I haven't tested it.)
DeepSeek-R1 is on par with frontier proprietary models, but requires an 8xH100 node to run efficiently. You can use extreme quantisation and CPU offloading to run it on an enthusiast build, but it will be closer to seconds-per-token territory.
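As a rough sanity check on those numbers: 32 billion parameters at 4-bit quantisation is around 16 GB of weights, which is how such a model squeezes onto a 24GB card with room for context. On the software side, any local server that exposes an OpenAI-compatible endpoint (Ollama and llama.cpp's server both do) can be driven with the standard client; the port and model name below are examples, not recommendations.

    # Hedged sketch: talk to a locally hosted model through an
    # OpenAI-compatible endpoint. Port and model name are examples.
    from openai import OpenAI

    client = OpenAI(
        base_url="http://localhost:11434/v1",  # e.g. Ollama's local endpoint
        api_key="not-needed-locally",
    )

    resp = client.chat.completions.create(
        model="qwen2.5-coder:32b",  # ~32B weights, roughly 16 GB at 4-bit
        messages=[{"role": "user",
                   "content": "Write a Python function that parses RFC 3339 timestamps."}],
    )
    print(resp.choices[0].message.content)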
Everything works right until it doesn't. LLMs are trained on things that have worked. Let's revisit in 2027 when things are insanely faster, but not much better.
I think I figured out a big problem I have with AI, and that is that I just hate using other people's software. The vast majority of software is just garbage. It all tries to take way too much of my attention. It doesn't help that all these AI tools are new, so they are inextricably linked to some SV startup that wants to be the next unicorn, so they have some bullshit account-generation/data-retention/terms-of-service/acceptable-use-policy/remarketing-email-campaign song and dance that is all stuff I have to pay attention to. All I really want to do is sit down and think on my own. I'm constantly inundated with distractions at work, so when I get a few minutes to myself, I sure as hell don't want an AI bot messaging me about reviewing its PR. I don't even want to review the human-written PRs; just push your code already, if it breaks you get to fix it. I certainly don't want a 500th unread email in my inbox.
You know what doesn't constantly nag for my attention? Dead text files on disk. They just sit there, waiting for me to grace them with my presence. That's the way it should be. Computers serve me, not the other way around.
So the reason we should not use Rust (a memory-safe language that gets rid of whole categories of bugs) is because some new AI tools are not ready?
Having used AI daily for over a year now: most AI tools do not struggle with Rust, they struggle to provide value. If I do not instruct them to keep the output to the bare minimum, I usually end up with 10x the output of what a simple solution to my problem would be.
I think for most people who do not get value out of AI-assisted coding, it's a skill or learning issue. I don't care how good a developer you consider yourself to be. If, in 2025, these tools do not provide you significant (to almost absurd) value, you just do not know how to use them, or are too stubborn to push past the initial learning curve. Alternatively, maybe programming is changing so much with AI that you just might not see what's happening right in front of your eyes because it's so foreign to all the paradigms of the past.
In 2025 I am about 800-900 hours deep into using Cursor, blowing through 500 credits every 2 days (500 credits = $20). I also organize a local "vibe code" workshop and meetup, with people from all sorts of experience levels, from no coding experience to twice my seniority. And what I am seeing is that what separates those who are good at using AI from those who are not is, generally, a level of rigid stubbornness.
I can't wait for time to do its course here and those people who now are "all nuts" to slowly see programming of the past become an old relic.
The premise of “lol, I’m just a dumb developer and just wait until the smart developers catch on” falls extremely flat coming from Thomas.
It kinda invalidates the entire premise because it builds on this false foundation of “look how useful it is to a mere plebeian” and the skills actually required to sift through LLM garbage are what puts you in the top 5% of programmers.
Most senior developers cannot conduct useful code review and spot errors. In every company I’ve been at, there were only 1 in 20 people that would give the useful in-depth reviews required to pick up LLM bullshit.
This style of engagement bait is brutal and it severely overshadows other nuggets of wisdom of how LLMs can be useful.
There’s a fine line between being an AGI skeptic and an AI skeptic.
While I don’t believe the “AI will take all jobs” VC tales, I also think not using LLMs in software development workflows is similar to not wanting to use Excel as an accountant in the 1990s.
AI won’t replace all jobs, but companies that cleverly integrate it into their processes will be faster and more efficient compared to those that don’t.
FWIW with proper MDC/ rules I’ve found LLM programming agents excellent at rust. There’s a lot of complex and tedious minutiae in rust that I know but forget to apply everywhere it’s helpful while a SOTA LLM agent does well, especially with proper rule guidance to remember to use it.
Generally, though, I find LLMs have pretty rapidly diminishing returns on what you can expect out of them. They're like a programmer with 3-5 years' experience who has really learned their domain well but doesn't have the judgement of a principal engineer. You get to a point where you need to reach in and set things right and really pay attention, and at that point the diminishing returns set in rapidly and you're better off just doing the rest yourself. Refactors and the like can be delegated, but that's about it.
I find this true regardless of the language. Nonetheless, I've been able to improve my overall velocity dramatically, completing several projects in the last few months in the time one would typically take. If tooling improves I hope to continue that, but I'm already getting close to the limit of how fast I can conceive of useful creative things.
> but the craft - Do you like fine Japanese woodworking? All hand tools and sashimono joinery? Me too. Do it on your own time.
That is not my objection, though some seem to think it is.
I got into coding because I found it interesting and fun. I've avoided managing others because I hate that sort of thing. Working with an LLM, prompting it to do work then assessing that work and adjusting the prompts or cleaning/completing the output myself, is far too much like outsourcing work to a 3rd party junior, and I don't want to do that whether said junior is a human in a far off land (or just down the street, for all I know!) or some LLM run by MegaCorp-ᴙ-Us.
I was already seriously disliking much of the modern programming industry (remote teams, the shit-show that client-side web work continues to devolve into, and much else) anyway. If it gets to the point where I can't do my job (or can't do it competitively enough) without using LLMs on top of dealing with the other things I don't care for, I'll be retraining for something else entirely. Heck, I've got the mortgage paid, I don't have or plan to have a family beyond the cats, I could just about survive on shelf-stacking…
I know it isn't a passing fad, though I'm convinced the current and near-future iterations aren't going to be quite as significant in the longer term as many believe. I don't object to others using it (aside from the matter of “MegaCorp-ᴙ-Us”s of the world suddenly deciding that piracy is just fine because they stand to benefit from it, in fact are already benefitting from it, in this case). But I don't plan to be nagged into using it myself.
> Do it on your own time.
If/when that career shift happens, or I survive as I am long enough to retire, I will probably do exactly that. As it stands I spend more than enough of my life on tech in DayJob, my own time is increasingly filled with unrelated hobbies and people.
A lot of the doubters are basically saying: "If it's so great, show me proof." While that is usually the right stance, I think it is short-sighted in times of revolutionary change.
To be sure, you definitely should not blindly trust the people who have a stake in selling AI products. However, you probably should trust the people who have no stake in selling AI, but are using it to accelerate themselves. The problem is, these people are probably too busy building things to spare any time convincing you. And in fact, it is a competitive advantage for them if others don't figure it out.
Here's my take: Everyone needs to figure out how AI works for them. The thing with AI is that it is a technology that behaves like people. It has the speed, efficiency and scale of computers but the fallibility and quirks of humans. This is why it's so confusing to discuss, and why folks have such varying experiences. The right approach to working with AI is like working with people: understand what they're good at and where they are weak, and then work with them to achieve your goals.
This will require using it "in anger" on many non-trivial tasks over a significant period of time, which will take persistence and patience. Yes, the hype is so over-the-top that you might not want to invest your valuable time on it. But you owe it to yourself to ensure you can fully capitalize on the sea change that is happening.
If it helps, I (like the OP) have no stake in selling AI and I posted this comment about my experiences on the other thread about the AI coding -- https://news.ycombinator.com/item?id=44163543 -- The upshot is, AI has enabled me to produce innovative (probably cutting edge) work in domains I had little prior background in. And I've not even used agents yet!
> Everyone needs to figure out how AI works for them.
That is the real takeaway.
I use LLMs for dozens of things. It took time to find how model strengths best connect with mine.
People who only evaluate models for one purpose over short timespans are going to have a hard time.
It’s like models come from a different cognitive & communication culture. Eliminating that barrier takes more experimentation and experience than logic.
I think their creativity is also often a function of the prompter's creativity.
The harder you (creatively) push them into a creative corner, the more genuinely creative they get.
Not unlike how a good teacher can get unexpected positive results from students when they are channeled and pushed.
I believe that AI is very useful in software development but I don't buy the narrative that AI is responsible for layoffs over the past few years (at least not most of them). I think that narrative is a convenient cover for systemic misallocation which created a need to contain inflation. I think big tech execs understood that, beyond increasing their company stock prices, they also need to work together to keep the monetary system itself under control. This is why they've been firing people whilst having record profits. They've reached such scale and the system has reached such fragility that they have to think and act like economists to keep the thing going. The economy itself has become the responsibility of big tech.
But who knows, maybe AI will accelerate so rapidly that it will fix the economy. Maybe we'll have robots everywhere doing all the work. But I worry about the lack of market incentives for people to adapt AI to real world use cases.
For example, I'm an open source developer who likes to tinker but I've been booted out of the opportunity economy. I can't afford to program robots. People like me are too busy using AI to parse spreadsheets and send targeted ads to even think about automating stuff. We work for companies and have no autonomy in the markets.
If things had worked out differently for me, I'd probably own a farm now and I'd be programming robots to do my harvest and selling the robots or licensing the schematics (or maybe I'd have made them open source, if open source had worked out so well for me). I don't have access to such opportunity unfortunately. The developers who worked for big tech are good at politics but often disconnected from value-creation. Few of them have the skills or interest to do the work that needs to be done now... They will just continue leveraging system flaws to make money, so long as those flaws exist.
If yesterday we were sued for singing the happy birthday song in public, and today we see corporations with market caps the size of countries pretending that IP ownership is not much of a thing, the alarm bells should be deafening. Control over intellectual property has given us fantastic things—Linux, Blender, etc.—and you can’t have GPL without an ability to defend copyleft. IP laws provide that ability.
One does not need to be a skeptic about machine learning and its potential as technology to refuse to engage with its practical applications when they are clearly ethically corrupt. The ends do not justify the means. Just like refusing to buy products of forced labour or disproportionate environmental impact, or to invest in war and bloodshed, everyone makes this decision for themselves.
I feel silly asking, but is this reality or hyperbole?:
> An agent can occupy itself for hours putzing with your tests in a VM and come back later with a PR.
> Kids today don’t just use agents; they use asynchronous agents. They wake up, free-associate 13 different things for their LLMs to work on, make coffee, fill out a TPS report, drive to the Mars Cheese Castle, and then check their notifications. They’ve got 13 PRs to review. Three get tossed and re-prompted. Five of them get the same feedback a junior dev gets. And five get merged.
I'm working better and harder with Roocode in Code mode (and sometimes Architect). I've only been doing it for a few weeks, and it reads to me like I'm already using an 'old approach'.
I've never been a completionist. I'm distracted. I like novelty. I make half-formed experiments, but mostly provide deeply immersive conversations for the benefit of a small group of people around me, some of whom go on to make/do things that I proudly feel my influence on. The final 10% of work often defeated me.
Now, I just make it happen. The pain is gone. Something else can carry the last 10%, and my ideas get realized by me (and AI).
I can't even conceive how this will change the world en masse. Knowledge domains (not just tech, but science, art, etc) have been built of the things that the type-A completionists knew how to see. The others didn't finish, and what they saw of the world has largely been marginalized.
The minds of ADHD and creative types are about to get much more represented in the output products of civilisation. This is going to be a big recalibration, and I think a re-balancing toward something healthier, tbh.
Can we get a video of a workday conducted by these people?
Unless there's a significant sense of what people are working on, and how LLMs are helping -- there's no point engaging -- there's no detail here.
Sure, if your job is to turn out tweaks to a WordPress theme, presumably that's now 10x faster. If it's to work on a new in-house electric motor in C for some machine, presumably that's almost entirely unaffected.
No doubt junior web programmers working on a task backlog, specifically designed for being easy for juniors, are loving LLMs.
I use LLMs all the time, but each non-trivial programming project that has to move out of draft-stage needs rewriting. In several cases, to such a degree that the LLM was a net impediment.
Not exactly what you're asking for, but https://news.ycombinator.com/item?id=44159166 from today is not a junior web programmer working through the backlog, and the commit history contains all the prompts.
Sure, thanks. I mean it's a typescript OAuth library, so perhaps we might say mid-level web programmer developing a library from scratch with excellent pre-existing references, and with a known good reference API to hand. I'd also count that as a good use case for an LLM.
I have done everything from architecture design for a DSP (Qualcomm), to training models that render photos on Pixel phones, to redoing Instagram's comment ranking system. I can't imagine doing anything without LLMs today; they would have made me much more productive at all of those things, whether it be Verilog, C++, Python, ML, etc. I use them constantly now.
I use LLMs frequently also. But my point is, with respect to the scepticism from some engineers -- that we need to know what people are working on.
You list what look like quite greenfield projects, very self-contained, and very data science oriented. These are quite significantly uncharacteristic of software engineering in the large. They have nothing to do with interacting systems each with 100,000s lines of code.
Software engineers working on large systems (e.g., many microservices, data integration layers, etc.) are working on very different problems. Debugging a microservice system isn't something an LLM can do -- it has no ability, e.g., to trace a request through various APIs from a front-end into a backend layer, into some DB, to be transferred to some other DB, etc.
This was all common enough stuff for software engineers 20 years ago, and was part of some of my first jobs.
A very large amount of this Pollyanna LLM view comes not from junior software engineers, but from data scientists who are extremely unfamiliar with software engineering.
I write embedded firmware for wireless mesh networks and satcom. Blend of Rust and C.
I spent ~4 months using Copilot last year for hobby projects, and it was a pretty disappointing experience. At its best, it was IntelliSense but slower. At its worst, it was trying to inject 30 lines of useless BS.
I only realized there was an "agent" in VS Code because they hijacked my ctrl+i shortcut in a recent update. You can't point it at a private API without doing some GitHub org-level nonsense. As far as my job is concerned, it's a non-feature until you can point it at your own API without jumping through hoops.
You used one AI tool that was never more than autocomplete a year ago, and you think you have a full grasp of all that AI offers today? That's like reviewing Thai food when you've only ever had Chinese food.
> All this is to say: I write some Rust. I like it fine. If LLMs and Rust aren’t working for you, I feel you. But if that’s your whole thing, we’re not having the same argument.
Yes we are, because the kind of work you need to do in C++ or Rust is probably entirely different from the work this person manages to get the LLM to do in Go.
The most important thing in this article in my mind is in the level setting section - if you are basing your perspective on the state of AI from when you tested it 6mo+ ago, your perspective is likely not based on the current reality.
This is kind of a first though for any kind of technology. The speed of development and change here is unreal. Never before has a couple months of not being on top of things led to you being considered "out of date" on a tool. The problem is that this kind of speed requires not just context, but a cultural shift on the speed of updating that context. Humanity just isn't equipped to handle this rate of change.
Historically in tech, we'd often scoff at the lifecycle of other industries - Airlines haven't changed their software in 20 years?? Preposterous! For the vast majority of us though, we're the other industry now.
> pull in arbitrary code from the tree, or from other trees online, into their context windows,
> run standard Unix tools to navigate the tree and extract information,
> interact with Git,
> run existing tooling, like linters, formatters, and model checkers, and
> make essentially arbitrary tool calls (that you set up) through MCP.
...for the vast majority of my career, anyone who suggested doing this - much less letting code whose logic flow no one in the world (much less the company) truly understands do this - would be fired.
> You’ve always been responsible for what you merge to main.
I've never merged anything with an unclear copyright to main, let alone if I positively suspected that it had been plagiarized.
> Cut me a little slack as I ask you to shove this concern up your ass. No profession has demonstrated more contempt for intellectual property.
Every place I've been employed was nearly paranoid about copyright, with rules in place about never copying and pasting anything copyrighted into the company code base.
All the major open source distros and projects, likewise.
There are some bad apples out there, like, oh, Sam Altman, who wants massive plagiarism to be fair use.
So, au contraire, you can shove your contemptuous blog up your ass, buddy.
I know this is a website for programmers and I am not one. I like LLMs for quick answers and simple tasks like proofreading (just to fix errors, not to add or change anything) and simple text formatting. I absolutely hate them for everything else. I am a writer, and LLM prose may be correct and sometimes whimsical, but it has a soulless quality to it that makes me forget what I just read after five seconds. My friends use Meta AI in our chat groups on WhatsApp and I hate it so much that I delete every single AI message as soon as I see it.
> Meanwhile, software developers spot code fragments seemingly lifted from public repositories on Github and lose their shit. What about the licensing? If you’re a lawyer, I defer. But if you’re a software developer playing this card? Cut me a little slack as I ask you to shove this concern up your ass. No profession has demonstrated more contempt for intellectual property.
Seriously? Is this argument, in all earnestness, "No profession has been more contemptuous, therefore we should keep on keeping on"? Should we as an industry not bother to try and improve our ethics? Why don't we all just make munitions for a living and wash our hands of guilt because "the industry was always like this"?
This op-ed suggests that it's easier to audit a huge amount of code before merging it in than it is to write the code from scratch. I don't know about anyone else, but I generally find it easier to write exactly what I want than to mentally model what a huge volume of code I've never seen before will do.
(Especially if that code was spit out by an alien copypasta that is really good at sounding plausible with zero actual intelligence or intent?)
Like, if all I care about is: does it have enough unit tests and do they pass, then yeah I can audit that.
But if I were trying to solve truly novel problems like modeling proteins, optimizing travel routes, or new computer rendering techniques, I wouldn't even know where to begin; it would take tons of arduous study to understand how a new project full of novel algorithms is going to behave.
The author mentions that smart programmers kick off agents, then go for a coffee and wait to be notified.
But agents like Claude Code and Cursor run on a user's machine, right? So how can they work in the background like that? Are there agents that can work asynchronously?
Fun article, although I'd say I'm maybe slightly more on the skeptical side.
I've found LLMs can't really write a whole program there isn't already a template for. They're good enough at writing individual functions, and as the author mentioned, their breadth of knowledge is better than any individual.
Honestly, I see them as a supercharged version of googling, without all the out of context answers you get from Google since they also place ads and whatnot. I find an LLM can get me an answer far quicker than a regular internet search, even if it's imperfect and I'm probably not going to put it straight into my code.
Finally, a proper good take from someone (seemingly) approaching it rationally and objectively, rather than blindly sitting on either side of the hype fence. LLMs are good tools, and even I, a decidedly-not-software-engineer, have been increasingly tinkering with locally-run tools to create applications and tooling that meets my own needs. The one thing that’s held me back from publishing has been the IPR angle, but I quite like this line on that topic:
> But if you’re a software developer playing this card? Cut me a little slack as I ask you to shove this concern up your ass. No profession has demonstrated more contempt for intellectual property.
The OP isn’t wrong. Some of the most brilliant minds in computer science have built tools to shatter DRM, enable widespread piracy, and repeatedly lift code wholesale from public repos if it helps them move a product forward. The modern crop of technologists (myself included) have long had a hypocritical view on IPR: “rights for me, not for thee” to oversimplify things. The entirety of the modern public cloud is built by trodding on the backs of open source projects while wearing lawn-aerating spikes. We steal shit from others so our Founders and Masters can reap billions in profit; to decry plagiarism now feels incredibly dishonest and insincere, at the very least.
Look, I’m an AI skeptic myself. On the whole, these tools are bad. They’re taking jobs, polluting the planet, and dumbing down a society that still can’t try rebooting something when it fails, let alone identify where their WiFi ends and the internet begins. They’re toxic walled gardens that can reauthor reality to the whims of the VCs and Billionaires funding them, locking you into subscriptions forever.
But even I will admit that modern LLMs, when it comes to writing code, are actually really good for the majority of C-tier devs and below. Are they as helpful to bleeding-edge engineers on the cusp of the future? Nah, not really, but if you’re just writing a checklist app for iOS to help you stay on task, or a Packer manifest to lay out some VM templates, they’re kinda awesome.
I think of programming languages as an interface between humans and computers. If anything, the industry expanded because of this abstraction. Not everyone has to learn assembly to build cool shit.
To me, AI is the next step in this abstraction, where you don't need to learn programming languages to potentially build cool projects. The hard part of software engineering is scale anyway. My bet is that this will expand the industry in unprecedented ways. Will there be a contraction of traditional programming jobs? Absolutely. But the growth in tech jobs over the last 20 years wasn't in more assembly programmers; it was in abstraction experts.
I'm sure the next wave will be even bigger, professional prompting will explode in size.
C abstracting away assembly, or the GC abstracting away memory management, worked because they could be implemented in a deterministic and reliable way (well, in the case of garbage collection, not all the time).
But I don't think that's a similar situation for LLMs, where the hallucinations or failure to debug their own issues are way too frequent to just "vibe code"
He forgot "but the 15 year old behemoth codebase that doesn't fit in any context window and isn't well documented and changing of which requires changing tons of ancillary files not easily connected to each other logically".
That's what a lot of the highest paid programmers work on. I work in such a firm and the agent they have for us is incapable. And it's the latest and greatest from whoever coders love to rave about online.
I love that AI-written tests and other generally greenfield things work well and can be vetted quicker than writing the code oneself. But for most more complex tasks it's not there. I'm ready when it is, though.
"If you build something with an LLM that people will depend on, read the code. In fact, you’ll probably do more than that."
Thus sucking up all the time you "saved" by having the LLM write the code in the first place. The machine made the easy parts of coding easier while making the hard parts harder.
"You’ll spend 5-10 minutes knocking it back into your own style."
As if style is the only thing that matters?
"LLMs are showing signs of adapting to local idiom, but we’re not there yet."
First step fallacy. Ladders are showing signs of being tall enough to reach the moon, but we're not there yet.
For some reason, every hardcore AI proponent I hear from resorts to gaslighting and namecalling to convince everyone it's so useful for everything.
When that doesn't work, they start inventing hyper-specific scenarios and graphs with grounding that exists only in their imagination to push the topic further. If the AI can't solve a problem in a domain for you, well, it's that domain that's the problem, not AI. You simply need to re-invent the paradigm around your entire problem - easy stuff.
Finally, when they've failed to convince anyone with a modicum of rationality, they resort to some version of "well, fuck you, it's the future and I'll just be leaving you behind!"
I want to skip all the debate and just have someone tell me what setup I need to get to here:
> Kids today don’t just use agents; they use asynchronous agents. They wake up, free-associate 13 different things for their LLMs to work on, make coffee, fill out a TPS report, drive to the Mars Cheese Castle, and then check their notifications. They’ve got 13 PRs to review. Three get tossed and re-prompted. Five of them get the same feedback a junior dev gets. And five get merged.
LLMs are a tool, a very precious one if you know how to use it.
I've seen Antirez use it for hard to find bugs in the Redis codebase (videos are on YouTube, albeit Italian), I've seen Cursor refactor in a matter of minutes code that would've taken hours, correctly, I've seen security researchers leverage it to find complex edge cases, etc
My 2 cents is that developers who won't learn to use the tools will be left behind (albeit not for long, imho), but also that salespeople promoting the idea that devs can be replaced are equally dangerous.
>It’s getting the same kind of attention that smart phones got in 2008, and not as much as the Internet got. That seems about right.
Are you kidding me? Mosaic was released in 1993. In 1998 Apple released the iMac on a theory, based on some marketing research, that people might be really interested in "getting online." Five years. And that's about a decade before we realized we might want a touch phone. More than a decade before an App Store.
This is very much still the top of the first inning. AI will make even the internet seem small. Which makes sense — the entire internet is merely one input into this technology.
My current workflow with Codex (the coding environment from OpenAI) is:
(1) Ask to write an implementation plan for a specific change or a feature. It will go through the source code, look up references, make notes and produce a plan
(2) Review the plan. Point out missing things, or stuff that needs improvement.
(3) Once I'm satisfied with the plan - ask to draft PR. Launch a few attempts in parallel and pick the one that I like the most.
(4) While drafting PR, Codex will run unit tests (even can run E2E tests in its container), linting and type checkers at every single step. This helps a lot with the stability.
(5) I review the code and merge the PR if I like it; if not, I ask it to clean up.
This feels like working with a remote team - very patient and diligent at that.
Ultimately, I get to get more features done per day. But I also feel more tired by the end of the day due to a higher level of cognitive load. There are more decisions to make and less idle time (e.g. no more hours spent tidying up the code or doing relaxing and pretty refactoring).
TLDR; this AI thing works really well at least for me. But it comes with trade-offs that might slow down its adoption by companies en masse.
I do use LLMs for coding and the newer models have definitely been a blessing. I don't know about using coding agents (or agentic coding) though. I personally do not find this better than chatting with the llm, getting the code back and then copy / pasting it and grokking / editing it. The author of this seems to suggest that.. there is one correct flow, his flow (which he doesn't entirely detail) and everything else is not appropriate. He doesn't go into what his process is when the LLM hallucinates either. Not all hallucinations show up in static analysis.
Yes! It's amazing how even in a field that tends to lean more early adopter than average you still get a lot of the default knee-jerk dismissal and cynicism - even when it's something clearly amazing and useful as thinking machines.
We're in the middle of a major shift - there will benefits to those that adapt first. People outside the field have no idea what's coming, even those of us in the field are underestimating the shift.
There were a few outliers in the 60s who understood what the computing revolution meant and would mean, but most did not. This is likely an even bigger change than that.
> LLMs can write a large fraction of all the tedious code you’ll ever need to write.
For most software projects, they cannot even remotely do that - at least not today, as far as I can tell.
And that is to say, it's not that they can write a not-so-large fraction. They cannot write any significant fraction of it. They can often write you a draft that helps you get there faster, though. They can also facilitate web search result consolidation etc. which can also help. But - they can also lead you down wrong paths and dead-ends, on occasion.
"If you’re making requests on a ChatGPT page and then pasting the resulting (broken) code into your editor, you’re not doing what the AI boosters are doing"
I am actually doing this the whole day long. For example, I set up a fresh Debian VPS for some interns today. I had to provide them with a Docker setup, Go support, and some nginx config, and I made a quick hello-world app in Angular with a Go backend. I could have done it myself, but I asked ChatGPT to provide me with all the commands and code. No idea how an agent could do this for me. I got everything running in about 30 minutes.
If you’re leaning out, spend two weeks leaning in.
I did, learned a ton, and am likely not going back to how I was before, or even to how I used it a week ago.
The comments in the article about not reading along with the agent are good, but it's more than that…
Vibe coding is for non-coders. Yet you do get a feel for the vibe of the AI. With Windsurf, you have two or three files open and are working in one. It starts smashing out multiple interspersed line edits, and you know, with a flutter of your eyes, that it's got your vibe and correctly predicted your next ten lines. And for a moment you forgive it for leading you astray when you read what it said.
I don't doubt that LLMs will have a large impact on software and programming. But I do wonder whether it will be a positive impact. I think one of the biggest problems the software industry has is that it's far too rare for the people building software to have deep explanatory knowledge about what they have built. LLMs threaten to make this problem vastly worse than it already is.
The reason this is rare is that no one stays on any one project long enough for this to be a reality anywhere. Do you know anyone who has worked 15-20 years on the same project? I've been on many projects where deep explanatory knowledge existed through and through until the project was initial-code-complete, and then people left to do other amazing things, leaving the project to people with limited knowledge of it. I suspect this is the case for the vast majority of software.
> Professional software developers are in the business of solving practical problems for people with code. We are not, in our day jobs, artisans. Steve Jobs was wrong: we do not need to carve the unseen feet in the sculpture. Nobody cares if the logic board traces are pleasingly routed. If anything we build endures, it won’t be because the codebase was beautiful.
I think this is my key point of disagreement. Professional software developers often don't see themselves as artisans, because Steve Jobs's unseen feet are implemented through our implicit standards of ourselves. I don't go out of my way to be a perfectionist on the job, but neither will I try to push 1000 lines of spaghetti to solve a minor problem that seems like it ought to be a simple fix.
I've worked with people in the past who don't have this instinct, and it's my experience that their productivity is quite low if not negative, even though they often perceive themselves to be 10x developers who can handle all kinds of edge cases super fast. I suspect that most AI-generated code today ends up in this niche.
Here's what to do: Show me a video of LLM fixing four filed issues in the KiCad codebase.
If you do that, I'll swallow my AI skepticism.
I would love to have an LLM that I can turn loose on an unfamiliar codebase that I can ask questions of. I would love to have an LLM that will fill in my Vulkan boilerplate. etc.
I use emacs and Mercurial. You can demonstrate magic to me and I can be convinced even if it's not mainstream.
Rewriting Javascript slop to StackOverflow standards is not convincing me.
Get to it.
(The OAuth stuff posted earlier certainly moved my needle, but the fact that they needed a gaggle of reviewers as well as hand holding when the LLM got stuck mutes the impact significantly.)
I am no clearer on what setup the author is using. Are you hand-coding all these non-code agents you mention? You talk about Zed, then about Cursor - don't they both do the same thing, or are you using two IDEs to write your code? What tooling is used for these async agents that are doing things while I eat my breakfast?

I really want to like AI-assisted coding, but I guess I am just stuck as to what to do.

I have tried Copilot, both 2 years ago and recently. I have tried some other AI plugin for VS Code (I thought that was Cursor, but I don't see it in the extensions when I search now, so it was probably something else). If anything, I have found with these solutions that they are great for the first few tasks - generate these models, scaffold this, add OAuth, etc. - but after that they lack all consistency with the code they themselves generated (if I add a new model, the output is wildly different from what it already created), and it takes more time to rewrite that output than to copy-paste what's already there and change it.
@op - could you explain exactly what tooling you are using, maybe give an example of these async agents, etc? Would be most appreciated for us luddites who would love to be more productive.
I agree wholeheartedly with this piece, but I think it might've been taken better if it wasn't written in the style of an HN comment. That being said, definitely doesn't deserve so much flaming in the comments, sorry to see that, because it's basically correct, with some small things I disagree with here and there. Feels exciting to be on the precipice of having a Star Trek style spaceship computer at my fingertips.
> LLMs can write a large fraction of all the tedious code you’ll ever need to write.
But, you still have to read it:
> Reading other people’s code is part of the job...I have to read the code line-by-line anyways.
So instead of writing the tedious code, I only have to read it. Oh but don't worry, I don't have to read it too carefully because:
> Agents lint. They compile and run tests. If their LLM invents a new function signature, the agent sees the error
But remember...
> You’ve always been responsible for what you merge to main.
So now I have to oversee this web of agents and AI on top of coding? Am I doing more now for the same pay? Am I just speedrunning myself toward lower pay? Is AI adoption a prisoner's dilemma that lowers my wages hardest?
Because AI is good at coding compared to many other disciplines (e.g. math), it makes the internal AI politics among programmers more of an issue. Add fuel to that fire, baby!
I've observed that AI excels at creative work, but not exacting sciences. For example, If I say, "Write the code to put the button in the upper right corner," I'll almost certainly get a button in the upper right.
Conversely, if I say, "Put the button in the upper right corner exactly like the existing button in the upper left," I'll get a button in the upper right, but not exactly like the upper left button.
This is the same as every other transformative tool which has come to market. You've got a certain segment of the population who will embrace it and push the boundaries of what is possible with the current tools and those in the future. And you've got the curmudgeons who won't let go of the one tool and one non-cloud hosting solution they so desperately cling to and deny the utility of anything else.
AI makes me sad. When I started my CS degree, I didn't even know what silicon valley was. I was unaware of what the SWE job landscape was like. I went to school in a no-name town.
Computer science was an immensely fun subject to learn. I moved to one of the big cities and was bewildered with how much there was to learn, and loved every second of it. I gradually became good enough to help anyone with almost anything, and spent lots of my free time digging deeper and learning.
I liked CS and programming - but I did not like products built by the companies where I was good enough to be employed. These were just unfortunate annoyances that allowed me to work close enough to what I actually enjoyed, which was just code, and the computer.
Before LLMs, those like me could find a place within most companies - the person you don't go to for fast features, but for weird bugs or other things that the more product-minded people weren't interested in. There was still, however, an uncomfortable tension. And now that tension is even greater. I do not use an LLM to write all my code, because I enjoy doing things myself. If I do not have that joy, then it will be immensely difficult for me to continue the career I have already invested so much time in. If I could go back in time and choose another field I would - but since that's not possible, I don't understand why it's so hard for people to have empathy for people like me. I would never have gone down this path if I knew that one day, my hard-earned-knowledge would become so much less valuable, and I'd be forced to delegate the only part of the job I enjoyed to the computer itself.
So Thomas, maybe your AI skeptic friends aren't nuts, they just have different priorities. I realize that my priorities are at odds for the companies I work for. I am just tightly gripping the last days that I can get by doing this job the way that I enjoy doing it.
LLMs don't make your hard-earned-knowledge less valuable: they make it more valuable.
You are better qualified to use them to build great software than people who don't have your level of experience and software engineering domain expertise.
If you don't want to do that then I guess you can find another career - but if you switch careers because you incorrectly think that LLMs make programming experience less valuable you would be making a big mistake in my opinion.
I agree with your assessment of the value of the skills, at least for the time being. What I dislike is the way that we are being encouraged to work now. I simply do not find any joy, at all, in reviewing LLM-written code and then typing in the next prompt.
A sentiment I see often is that it's work, it's not supposed to be fun, and you work at the pleasure of the employer. And I accept that. But I still am really just crushingly sad that this is what my job is becoming.
In the article, Thomas wrote:
> LLMs can write a large fraction of all the tedious code you’ll ever need to write. And most code on most projects is tedious. LLMs drastically reduce the number of things you’ll ever need to Google. They look things up themselves. Most importantly, they don’t get tired; they’re immune to inertia.
I see this as a massive downside, because I loved writing tedious code. I loved reading docs on something I previously didn't understand. I loved forming the mental models strong enough to say "yeah I see why that's there" in the previously-inscrutable APIs of the frameworks and such that I was using. It was precisely the _way_ that I approached that work that allowed for that knowledge to accrue. It was because I almost never just copy/pasted something without spending a lot of time to understand it.
I do some of the same with ChatGPT. I type the code in myself after trying to internalize the ChatGPT response. But even that is starting to feel like company time-theft, as the attitude is shifting even further away from "knowing how to do things is good" toward "getting shit done is all that matters."
> You are better qualified to use them to build great software than people who don't have your level of experience and software engineering domain expertise
Since a big majority of companies have stopped hiring juniors, where is the new blood coming from when the seniors inevitably retire?
I think the important thing here is you're being honest about how you're feeling. You bring up a very real anxiety and possibility and even folks who are using LLMs probably feel some degree of this alienation. That LLMs are yet another tool to push us to move as fast as possible rather than have our brains get into the nooks and crannies of hard problems that may take longer but are more rewarding to us.
But again, you're being honest. The problem with a lot of the AI skeptic arguments I see is a lack of this honesty. Others have noted that there are a lot of contradictory skeptical arguments, and I suspect the contradictions come because the authors have negative emotions about AI which they're using to create negative arguments.
I do fall into this category of people that are seen as heavily abusing copium. I can admit that when I do get unsatisfactory results from a prompt session, a lot of it has to do with the mental friction I feel at the idea of letting something else write my code.
It again comes back to my opinion that LLMs have recreated the job in such a way that it emphasizes what I disliked most and de-emphasizes what I liked. It emphasizes "the goal" and de-emphasizes "the process". We had a period in the 10's where the process (namely, becoming adept at using and learning an ever-changing set of open source tools) was a bit more celebrated. You could justify a lunch-and-learn on things like man pages, commit isolation levels, or package managers - and doing something like that would be seen in a positive light. And now, why would you waste everyone's time talking about something that ChatGPT can figure out for you?
At first I didn’t think LLMs were that impressive, but ever since I started using them, I’ve pretty much stopped writing for loops by hand.
It’s not that I forgot how. I just don’t feel like doing it anymore. The time it saves me on boilerplate lets me focus on other things.
Sometimes the code isn’t perfect and I need to fix it, but it’s still way easier than starting from scratch.
Unrelated to your friends, but a big part of learning is to do tedious tasks. Maybe once you master a topic LLMs can be better, but for many folks out there, using LLMs as a shortcut can impede learning.
"Reading other people’s code is part of the job. If you can’t metabolize the boring, repetitive code an LLM generates: skills issue! How are you handling the chaos human developers turn out on a deadline?" Good point! Treat AI generated code as if somebody else had written it. It will need the same review, testing and refactoring as that.
This blog post makes me actively want to use something besides fly. Putting aside the massive condescension in the article…
… most of the code you write is tedious boilerplate? What do you do all day? I can honestly say the most tedious code I write is migrations, because I have to edit three files. I do this like once a week. It would take way longer to explain to an ai what to do than just do it myself.
… you run the AI on your procrastination and as a result do heads-down work? Seriously? First, programming with AI is like playing a slot machine. It keeps you hooked just on the verge of winning the jackpot for far longer than the tedious task would have distracted you. And let's take a good agentic case - it runs for 30 minutes before it's done. That's not even enough time to "boot up" a serious problem, and you're already having to put it aside to go check the work you yourself are saying is mediocre? And you're advocating running multiple of these? Even the most ardent vibe coders I know talk about babysitting the agents. How is this unlocking deep-work time in any way, shape, or form?
… craft is relegated to hobbies? I guarantee you, this person loves Linear. It was probably written on a MacBook. This whole "we solve real problems" BS is just a cop-out because being great is hard.
… mediocre code is good code? All code is a liability. If you generate 10 times the code, and are admitting it’s the lowest bar of quality possible, it is at a minimum 10x the liability on your shoulders.
… letting LLMs refactor your tests? Serious question, who actually does this? Because I would LOVE to have LLMs write tests for me. I have tried every. single. time. I need to write something that needs a lot of tests to ask LLMs for tests for TWO YEARS, and not once have I ever used a single test it generates. It generates worse tests than it does code, and the last thing I want is 42 asserts in 8 tests across 600 LoC that nobody understands that poorly tests what should have been 2 asserts in 2 tests in 12 lines of code.
I think he's pretty much right, but it just leaves me managing an AI and hopefully having fewer offshore developers and faster turnaround. So something less than a 50% boost, and kind of variable, but still significant. I don't really think it's world shattering because most software is deeply constrained by business processes (deciding what to build, etc.) that this doesn't help.
It depends on what level of nuts you mean. Some are AGI skeptics about LLMs, and they're probably right; there are likely more breakthroughs required before true AGI. But AGI isn't required to completely disrupt a ton of good, well-paid professions. That is the more worrying scenario. AI is already widening the wealth gap irreparably, and with more progress it will only continue.
Is it really? I scraped all HN comments with the words "Rust" and "LLM" from the last 6 months, put it into Gemini and asked it to surface quotes of people saying they had a good time, versus quotes of the opposite. It was about 60% positive and 40% negative.
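Out of curiosity about reproducing that kind of scrape: below is a rough sketch of one way to do it against the public Algolia Hacker News search API, written in TypeScript for Node 18+. The page cap, query handling, and the final sentiment step are illustrative assumptions, not the commenter's actual script; classification is left to whatever model you prefer.

    // Rough sketch of the scrape described above, using the public Algolia Hacker
    // News search API. The page cap, query handling, and output are illustrative
    // assumptions, not the commenter's actual script.
    async function fetchHnComments(query: string, sinceUnixSeconds: number): Promise<string[]> {
      const texts: string[] = [];
      for (let page = 0; page < 20; page++) {            // stay polite; don't crawl forever
        const url =
          `https://hn.algolia.com/api/v1/search_by_date?query=${encodeURIComponent(query)}` +
          `&tags=comment&numericFilters=created_at_i>${sinceUnixSeconds}&page=${page}`;
        const res = await fetch(url);
        if (!res.ok) break;
        const data = (await res.json()) as { hits: { comment_text?: string }[]; nbPages: number };
        texts.push(...data.hits.map((h) => h.comment_text ?? ""));
        if (page >= data.nbPages - 1) break;
      }
      return texts;
    }

    async function main(): Promise<void> {
      const sixMonthsAgo = Math.floor(Date.now() / 1000) - 182 * 24 * 3600;
      const all = await fetchHnComments("Rust LLM", sixMonthsAgo);
      // Keep only comments that mention both words, then hand them to whatever
      // model you prefer and ask it to bucket quotes into good vs. bad experiences.
      const relevant = all.filter((t) => /\bRust\b/i.test(t) && /LLM/i.test(t));
      console.log(`collected ${relevant.length} comments mentioning both Rust and LLM`);
    }

    main().catch(console.error);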
Does anyone have recent personal experience with this they can talk about?
This sounds very cool - almost science-fiction in parts. I want to know more details about the specific technology, whether it could handle a 30k C++ codebase, how much work you would need to do to make it learn how to run your tests, analyse their output, interface with all your internal software development services etc.
For writing, I do a back-and-forth about what I'm trying to do with Claude before asking it to write anything. Produces much less "that's not exactly what I was looking for..."
I was quite happy, and felt well rewarded, when I dropped an HTML-formatted table with emojis into my prompt and told it to turn it into a Terraform module.
But how do you quality check this part? I am proofreading all these transformations and I am not sure if this is really faster than actually recording a VIM macro, which will do the steps I need.
This article doesn't address literally THE argument I have against LLM usage: that it will degrade major human skills like reading, writing, and yes, coding.
But I'm at least grateful for LLMs making me more aware of how important it is for me to focus on my own skill building. I will never use them because I value keeping those skills sharp.
I use Codex CLI for casual stuff, because of the ergonomics of just popping open another terminal tab.
I use Zed as my primary interface to "actually doing project work" LLM stuff, because it front-ends both OpenAI and Google/Gemini models, and because I really like the interface. I still write code in Emacs; Zed is kind of like the Github PR viewer for me.
I'm just starting to use Codex Web for asynchronous agents because I have a friend who swears by queueing up a dozen async prompts every morning and sifting through them in the afternoon. The idea of just brainstorming a bunch of shit --- I can imagine keeping focus and motivation going long enough to just rattle ideas off! --- and then making coffee while it all gets tried, is super appealing to me.
> I use Zed as my primary interface to "actually doing project work" LLM stuff, because it front-ends both OpenAI and Google/Gemini models, and because I really like the interface. I still write code in Emacs; Zed is kind of like the Github PR viewer for me.
You're not concerned about OpenAI or Google stealing your code? I won't use VSCode for that reason, personally, but I do use VSCodium.
> I'm just starting to use Codex Web for asynchronous agents because I have a friend who swears by queueing up a dozen async prompts every morning and sifting through them in the afternoon
Bunch of async prompts for the same task? Or are you parallelizing solving different issues and just reviewing in the afternoon?
Try getting your fav LLM to do something useful with PowerCLI. That's the VMware PowerShell thing. It is quite popular, but not popular enough, so most prompts will end up with bollocks masquerading as code.
When an LLM has been tamed to the point that I can define and request a PowerCLI programme, I can be fairly sure that AI is a solved thing.
I'll be a happy and ignorant nut as long as these models are trained on publicly scraped code without their creators' consent, and the models fail to cite the licenses and sources of the parts the answer is composed from.
Oh, GH Copilot was adding citing capabilities which was going to be in production around this time? Any progress on that front?
Yeah, this is where I went from being on the fence to just checking out entirely. Even if it did mean that the top earners make slightly less (which I'm dubious about) and I happened to be one of them (which I'm not confident in, but I could comfortably survive with a lower salary than I've actually received most of the time), I'd still be in favor of tech workers unionizing because the benefits outweigh the costs both personally and for the larger group.
I've been hired to a fully distributed team only to have the company decide to force everyone to "return" to the office within a couple years despite the team never having been in any office in the first place; I've had managers promise me raises that don't appear, only for me to find out later that they never actually talked to their superiors about it like they claimed; I've seen teammates get hired and laid off six months later the same week that we have someone else we just hired join the team. None of this stuff is reasonable, and for some reason we collectively just throw up our hands and say there's nothing we can do, and then apparently some of us scoff and look down at lower-paid workers banding together to try to get a semblance of the advantages we take for granted in our industry because they're "stopping progress"? Even if I did accept the premise that workers collectively organizing to win rights affected "progress", there's no reason to assume there's no middle ground between the extremes of "no progress" and "full-throttle progress without regard for consequences". It's farcical.
Adding fuel to the fire: I'm extremely disappointed to see such an inflammatory article on fly.io. I wouldn't want this on my personal blog, let alone on my employer's, but I guess they're fine with it.
I've been using Zed and Claude Sonnet 4 (and sometimes trying Opus) heavily over the past weeks. For small edits where I have lots of unit tests, the results were great. So great that they worry me with regards to job security. For exploring a new programming domain it was also somewhat useful. I work a lot with the Typescript compiler API right now, and it has almost no documentation. Since the AI can see into every GitHub repository out there, it's much better, and more efficient, at learning APIs based on code from other folks. On the other hand it means I don't do that job, and I am forced to rely 100% on how the AI presents the Typescript compiler API to me. Are there better methods I could use? Who knows.
Where it's abysmal is code architecture. Sometimes it's almost comical: it adds an if statement to handle one highly specific edge case in a program that only makes sense if it solves the general case. This didn't happen often, though.
The hardest part was to force it to reuse existing code from the same file. My use case is transforming a Typescript AST into a GraphQL AST. The code is one big switch statement with lots of recursive calls. The AI would often add 300 lines of code that duplicate some logic which already exists somewhere else.
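For readers who haven't written this kind of code, here is a minimal sketch (in TypeScript) of the shape being described: one recursive switch over TypeScript type nodes that emits GraphQL AST nodes. It assumes the typescript and graphql npm packages; the handled cases and names are illustrative, not the commenter's actual code.

    import * as ts from "typescript";
    import { Kind } from "graphql";
    import type { TypeNode } from "graphql";

    // Minimal sketch of the "one big switch with recursive calls" shape described
    // above: map a TypeScript type node onto a GraphQL type node. Real code would
    // handle many more SyntaxKind cases; these are purely illustrative.
    function toGraphQLType(node: ts.TypeNode): TypeNode {
      switch (node.kind) {
        case ts.SyntaxKind.StringKeyword:
          return { kind: Kind.NAMED_TYPE, name: { kind: Kind.NAME, value: "String" } };
        case ts.SyntaxKind.NumberKeyword:
          return { kind: Kind.NAMED_TYPE, name: { kind: Kind.NAME, value: "Float" } };
        case ts.SyntaxKind.BooleanKeyword:
          return { kind: Kind.NAMED_TYPE, name: { kind: Kind.NAME, value: "Boolean" } };
        case ts.SyntaxKind.ArrayType:
          // Recurse into the element type; this recursion is where an agent tends
          // to re-implement logic that already exists in another branch.
          return { kind: Kind.LIST_TYPE, type: toGraphQLType((node as ts.ArrayTypeNode).elementType) };
        default:
          throw new Error(`Unhandled TypeScript type node: ${ts.SyntaxKind[node.kind]}`);
      }
    }

The duplication problem described above shows up when an agent adds a second, slightly different version of one of these cases instead of recursing into the branch that already exists.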
In the end I rewrote the whole thing from scratch. At around 900 lines of code the AI was starting to really struggle. When I wanted to take over, I realized that I didn't have the in-depth knowledge to do so. And trying to understand the code the AI had written proved futile.
Ultimately that's on me, I should have been more diligent reviewing the dozens of 300 line of code changes the AI throws at me over the course of a day. But I wasn't, because reviewing is really, really hard. For many reasons. And AI makes it even harder.
Am I therefore nuts? I find this whole article extremely one-sided. Surely, based on the sheer amount of both positive and negative press, the answer is somewhere in the middle.
>Important caveat: I’m discussing only the implications of LLMs for software development. For art, music, and writing? I got nothing. I’m inclined to believe the skeptics in those fields.
What a strange disclaimer. Image generation was a (career)killer feature long before vibe coding was halfway useful.
> People coding with LLMs today use agents. Agents get to poke around your codebase on their own. They author files directly. They run tools. They compile code, run tests, and iterate on the results. ...
Is this what people are really doing? Who is just turning AI loose to modify things as it sees fit? If I'm not directing the work, how does it even know what to do?
I've been subjected to forced LLM integration from management, and there are no "Agents" anywhere that I've seen.
I cut several paragraphs from this explaining how agents work, which I wrote anticipating this exact comment. I'm very happy to have brought you to this moment of understanding --- it's a big one. The answer is "yes, that's exactly what people are doing": "turning LLMs loose" (really, giving them some fixed number of tool calls, some of which might require human approval) to do stuff on real systems. This is exactly what Cursor is about.
I think it's really hard to overstate how important agents are.
We have an intuition for LLMs as a function blob -> blob (really, token -> token, but whatever), and the limitations of such a function, ping-ponging around in its own state space, like a billion monkeys writing plays.
But you can also go blob -> json, and json -> tool-call -> blob. The json->tool interaction isn't stochastic; it's simple systems code (the LLM could indeed screw up the JSON, since that process is stochastic --- but it doesn't matter, because the agent isn't stochastic and won't accept it, and the LLM will just do it over). The json->tool-call->blob process is entirely fixed system code --- and simple code, at that.
Doing this grounds the code generation process. It has a directed stochastic structure, and a closed loop.
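To make that concrete, here is a minimal sketch (in TypeScript) of the kind of loop being described. The callModel stub, the tool table, the JSON shape, and the iteration cap are all assumptions for illustration - this is not Cursor's or any other product's actual implementation.

    // Minimal sketch of the agent loop described above. callModel is a stand-in
    // for any LLM API; the tool table, JSON shape, and iteration cap are
    // assumptions for illustration, not any particular product's implementation.
    type ToolCall = { tool: string; args: Record<string, string> };

    // Hypothetical stand-in: a real implementation would send the transcript to a
    // model and return its next action as JSON.
    async function callModel(_transcript: string): Promise<string> {
      return JSON.stringify({ tool: "done", args: { summary: "stub model: nothing to do" } });
    }

    // Plain, non-stochastic system code: read a file, run the tests, and so on.
    const tools: Record<string, (args: Record<string, string>) => Promise<string>> = {
      read_file: async ({ path }) => (await import("node:fs/promises")).readFile(path, "utf8"),
      run_tests: async () => "...test runner output would go here...", // stub for illustration
    };

    async function runAgent(task: string, maxIterations = 20): Promise<string> {
      let transcript = `Task: ${task}\n`;
      for (let i = 0; i < maxIterations; i++) {
        const reply = await callModel(transcript);        // blob -> json (hopefully)
        let call: ToolCall;
        try {
          call = JSON.parse(reply);                       // the stochastic part
        } catch {
          transcript += "\nThat was not valid JSON. Try again.\n";
          continue;                                       // the loop won't accept it
        }
        if (call.tool === "done") return call.args.summary ?? "done";
        const impl = tools[call.tool];
        const result = impl
          ? await impl(call.args)                         // json -> tool-call -> blob: fixed code
          : `Unknown tool: ${call.tool}`;
        transcript += `\nTool ${call.tool} returned:\n${result}\n`;
      }
      return "Gave up after hitting the iteration cap.";
    }

    runAgent("rename the User class to Account").then(console.log);

The point of the sketch is the division of labour: only the model call is stochastic; everything that touches the real system is ordinary deterministic code that validates what the model asked for before running it, and simply loops if the output is unusable.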
I'm sorry but this doesn't explain anything. Whatever it is you have in your mind, I'm afraid it's not coming across on the page. There is zero chance that I'm going to let an AI start running arbitrary commands on my PC, let alone anything that resembles a commit.
Some people are, and some people are not. This is where some of the disconnect is coming from.
> Who is just turning AI loose to modify things as it sees fit?
With the advent of source control, why not? If it does something egregiously wrong, you can throw it away easily and get back to a previous state with ease.
> If I'm not directing the work, how does it even know what to do?
You're directing the work, but at a higher level of abstraction.
I use Cursor by asking it exactly what I want and how I want it. By default, Cursor has access to the files I open, and it can reference other files using grep or by running specific commands. It can edit files.
It performs well in a fairly large codebase, mainly because I don’t let it write everything. I carefully designed the architecture and chose the patterns I wanted to follow. I also wrote a significant portion of the initial codebase myself and created detailed style guides for my teammates.
As a result, Cursor (or rather, the models you select, since Cursor is just a router for commercial models) handles small, focused tasks quite well. I also review every piece of code it generates. It's particularly good at writing tests, which saves me time.
I run Cursor in a mode that starts up shell processes, runs linters, tests etc on its own, updates multiple files, runs the linter and tests again, fixes failures, and so on. It auto stops at 20 iterations through the feedback loop.
This example seems to keep coming up. Why do you need an AI to run linters? I have found that linters actually add very little value for an experienced programmer, and they actually get in the way when I am in the middle of active development. I have to say I'm having a hard time visualizing the amazing revolution that is alluded to by the author.
I let an agent upgrade some old C code that wouldn’t compile and had 100’s of warnings. It was running builds on its own, looking at new errors, etc. It even wrote some tests! I could’ve done this myself but it was a hobby project and tedious work. I was impressed.
You are giving it instructions, but it's running a while loop with a list of tools, and it can poke around in your codebase until it thinks it's done whatever you asked for.
See Claude Code, windsurf, amp, Kilcode, roo, etc.
I might describe a change I need to have made and then it does it and then I might say "Now the tests are failing. Can you fix them?" and so on.
Sometimes it works great. Sometimes you find yourself arguing with the computer.
This is also where I am, and I guess it has been a source of mild and growing consternation since I first blagged an OpenAI GPT account when they were private, in an attempt to get ahead of what was coming -- both the positive and negative sides of the advances. Most people either ignored the advances, or quickly identified and connected to the negative side, and effectively filtered out the rest.
As somebody who comes from a politically left family, and was also around in the early days of the Web, let me tentatively note that this issue has a particular political slant, too. The left has strong roots in being able to effectively critique new developments, economic and social, that don't come from its own engines of innovation which have revolved around those critiques, plus solidarity, organization, and sociopolitical action.
The movement's theorists work far more slowly on how to integrate the effect of those changes into its vision. That means when something like this comes along, the left's cultural norms err on the side of critique. Which is fine, but it makes any other expression both hard to convey, and instantly suspect in those communities. I saw this in the early Web, where from a small group of early adopters of all political slants, it was the independents, heterodox leftists, and the right -- and most vocally, the libertarians -- who were able to most quickly adapt to and adopt the new technology. Academic leftists, and those who were inspired by them, took a lot longer to accommodate the Net into their theses (beyond disregarding or rejecting it) and even longer to devise practical uses for it.
It wasn't that long, I should say -- a matter of months or years, and any latent objections were quickly swamped by younger voices who were familiar with the power of the Net; but from my point of view it seriously set back that movement in practicality and popularity during the 80s and 90s.
I see the same with AI: the left has attracted a large generational wave of support across the world by providing an emotionally resonant and practical alternative to the status quo many people face. But you quickly lose the mandate of heaven if you fail to do more than just simplistically critique or reject a thing that the average person in the world feels they know better, or feels differently toward, than you do. This is something to consider, even if you still strongly believe yourselves to be correct in the critiques.
That's funny! I mulled it a bit more after reading your piece, and I think some of the comments on lobsters were considering the whole issue in a wider political framing. Thanks for replying!
What is the current meta on a fully agentic workflow as he describes it? Wake up, write a vague prompt, go for coffee, and get a PR to review on your phone? I'm genuinely interested to see what going all in on this looks like.
I've yet to see an LLM produce anything but disastrous bugs. Remember they don't know they're writing code that can destroy companies and livelihoods. They are just spitting out plausible but always wrong text.
I tried Copilot many times and found it lacking. I like discussing concepts (maybe with pseudocode) with Gemini. I find this useful but not game changing.
So genuine question for AI Boosters, what should I be using?
I would like the 10x productivity gains we were promised :)
Someone at work said something to the effect of "Oh, this would be great for an intern but we don't have one this year" to which I responded "sure you do, its name is Copilot".
> You’ll spend 5-10 minutes knocking it back into your own style
You lost me here. I have often found it to be far more than a 10-minute style issue, but fundamental misunderstanding of the code purposes that I need to fix.
One thing that really bothered me that the author glossed over (perhaps they don't care, given the tone of the article) is where they said:
> Does an intern cost $20/month? Because that’s what Cursor.ai costs.
> Part of being a senior developer is making less-able coders productive, be they fleshly or algebraic.
But do you know what another part of being a senior developer is? Not just making them more productive, but also guiding the junior developers into becoming better, independent, self-tasking, senior coders. And that feedback loop doesn't exist here.
We're robbing ourselves of good future developers, because we aren't even thinking about the fact that the junior devs are actively learning from the small tasks we give them.
Will AI completely replace devs before we all retire? Maybe. Maybe not.
But long before that, the future coders who aren't being hired and trained - because a senior dev doesn't understand that junior devs become senior devs (an important pipeline) and would rather pay $20/month for an LLM - are going to become a major loss and brain drain domestically.
I think what is going to happen is that junior devs will develop a strong reliance on AI tools to be able to do anything. I cynically think this was OpenAI’s aim when they made ChatGPT free for students.
I had a rather depressing experience this semester in my office hours with two students who had painted themselves into a corner with code that was clearly generated. They came to me for help, but were incapable of explaining why they had written what was on their screens. I decided to find where they had lost the thread of the class and discovered that they were essentially unable to write a hello-world program. In other words, they lost the thread on day one. Up until this point, both students had nearly perfect homework grades while failing every in-class quiz.
From one perspective I understand the business case for pushing these technologies. But from another perspective, the long term health of the profession, it’s pretty shortsighted. Who knows, in the end maybe this will kill off the group of students who enroll in CS courses “because mom and dad think it’s a good job,” and maybe that will leave me with the group that really wants to be there. In the meantime, I will remind students that there is a difference between programming and computer science and that you really need a strong grasp of the latter to be an effective coder. Especially if you use AI tools.
> Who knows, in the end maybe this will kill off the group of students who enroll in CS courses “because mom and dad think it’s a good job,”
I see this so much. “Data science major” became the 2020s version of law school. It’s such a double edged sword. It’s led to a huge increase in enrollment and the creation of multiple professional masters programs, so the college loves us. We hire every year and there’s always money for just about anything. On the other hand, class sizes are huge, which is not fun, and worse a large fraction of the students appear to have minimal intrinsic interest in coding or analyzing data. They’re there because it’s where the jobs are. I totally get that, in some sense college has always been that way, but it does make me look back fondly on the days when classes were 1/4 as big and filled with people who were genuinely interested in the subject.
Unfortunately I think I may get my wish. AI is going to eliminate a lot of those jobs and so the future of our field looks a bit bleak. Worse, it’s the very students who are going to become redundant the quickest that are the least willing to learn. I’d be happy to teach them basic analysis and coding skills, but they are dead set on punching everything into ChatGPT.
> Up until this point, both students had nearly perfect homework grades while failing every in-class quiz.
This is nothing new. In a computer graphics class I took over 20 years ago, the median score on the assignments before the midterm was >100% (thanks to bonus questions), yet in midterm prep other students in the class were demonstrating that they didn't even have a firm grasp on the basic concept of a matrix.
That is: they were in a 4th year undergrad course, while doubting material from a senior year high school course where they had to have gotten high marks in order to get into the program.
And the midterm grading was heavily curved as a result (though not as much as in some other courses I took).
Students will do what they need to do for the grade. It seems a great many of them have internalized that none of this is about actually learning anything, even if they would never say so aloud. (I learned things - where I didn't already know them - because it was actually interesting. My resulting grades were pretty good overall, but certainly not top of class.)
> Who knows, in the end maybe this will kill off the group of students who enroll in CS courses “because mom and dad think it’s a good job,”
Why would it? It's becoming easier than ever to fake understanding, and to choose anything else they would need both the opportunity and social permission. I only see the problem getting worse.
> Up until this point, both students had nearly perfect homework grades while failing every in-class quiz.
From a student's perspective: I think it was the same with Stack Overflow. While LLMs make copy-and-paste even easier, they also have the upside of lowering the bar on more complex topics/projects. Nowadays, the average person doesn't touch assembly, but we still had a course where we used it and learned its principles. Software engineering courses will follow suit.
Hard capitalism doesn't care about long-term perspectives; the only yardstick is current performance and stock maximization. Otherwise the US would be a bastion of stellar public education, for example - an investment in the long-term future of the whole nation, instead of the few richest sending their kids to private schools to stay above the rest.
So while I fully agree with you, this is not a concern for a single decision maker in the private-company world. And the state, in this case the US, doesn't pick up the work instead; it quietly accepts the situation.
Well, think for a second about who makes those budgets and sets long-term spending priorities: rich lawyers who chose to become much richer politicians, rarely anybody else, and almost never anyone from a more moral profession.
It's a bit misleading to compare $20/month with an actual human person. The junior dev won't get halfway through the day and tell you they've used up all their coding time for the month and will now respond with gibberish.
Cursor is a heck of a lot more than $20/month if you actually want it working for a full work day, every day.
> The junior dev won't get halfway through the day and tell you they've used up all their coding time for the month and will now respond with gibberish.
This issue manifests a bit differently in people, but I've definitely worked with people (not only juniors) who only have a few productive hours a month in them. And for what it's worth, some of those people were sufficiently productive in those few hours that it was rational for the company to keep them.
Further, Cursor might cost $20/month today, but to what degree is that subsidized by VC investment? All the information we have points to frontier models just not being profitable to run at those types of prices, and those investors are going to want a return at some point.
The market will indeed balance this out. Remember when a taxi was $20 and an Uber $5? Now an Uber is $25. Nobody is going to go back to humans with all their wet-meat-sack problems, and we will get more value for it, but it ain't going to stay $5 if those putting up all this capital have anything to do with it. Then again, we might get cheap, self-hostable local copies (unless they're made illegal for "safety" or some bullshit).
I dunno, with the advances in open source models I could see in a few years having AI workstations that cost $20,000 with 1TB of VRAM so you don’t have to rely on OpenAI or Cursor. The RTX 6000 Pro is only $7500 and has 96GB of VRAM.
This is something that's been simmering in the back of my mind for a while. Using an AI Agent instead of talking to your human colleagues deprives both of you from learning opportunities. There are probably short term gains in many cases, but I fear there will be long term losses over time.
I agree, and think that organizations that figure out how to use AI well in a collaborative way will succeed in the long-term. Developer community is still where the important growth happens.
Is it possible to talk to coworkers? What if you voted for the wrong party? Are closeted gay/trans/queer? A radical feminist?! Or a dog attack survivor, and they really _REALLY_ like dogs!
Talking to colleagues at work is a chore, and a huge risk! Not an opportunity! At least AI respects my privacy and will not get me fired!
> But do you know what another part of being a senior developer is? Not just making them more productive, but also guiding the junior developers into becoming better, independent, self-tasking, senior coders. And that feedback loop doesn't exist here.
Almost every senior developer I know is spending that time making LLMs more productive and useful instead.
Whatever you think the job is of the senior developer, it will not be "coding".
I think people need to stop thinking of themselves as computer programmers and start thinking of themselves as _engineers_. Your job isn't writing programs, your job is _using the technology you have available to solve problems_. Maybe that is through writing code, but maybe it's orchestrating LLMs to write code for you. The important part is solving the problem.
> Almost every senior developer I know is spending that time making LLMs more productive and useful instead.
LLMs may become more productive/ accurate/ useful, but they're not self-tasking or independent.
> I think people need to stop thinking of themselves as computer programmers and start thinking of themselves as _engineers_. Your job isn't writing programs, your job is _using the technology you have available to solve problems_.
There is a progression of skill required to master any profession, starting with fundamentals, and progressing and developing until you are an expert/ senior at that profession. How is a senior sw dev supposed to become that without writing code? Just reading LLM code and bugfixing isn't the same level or kind of experience. You're going to have devs who can't code by themselves, and that's a bad place to be in.
There are already too many people in IT using tools that they don't understand the workings of (and thus can't troubleshoot, can't replace, can't customize to their env, etc), and this will just exacerbate that x100.
MMW there is going to be a very bad skill deficit in IT in 20 years, which is going to cause an innovation deficit.
How many senior developers understand the minute, intimate details of the frameworks, libraries, and languages they use? How many understand the databases they use? TFA says many (but not all) don't have to care as long as the product ships. That's exactly how code written by LLMs is meant to be tested and evaluated. And if you set up a good enough build/test environment, TFA argues that you can automate most of the schlep away.
In my experience, senior engineers without the curiosity to occasionally dig deeper into their frameworks are significantly worse engineers. No framework, library, or language is perfect. A big part of your job is understanding how your code will execute. Sometimes that even requires, at a very high level, imagining how that code will compile down to assembly once you strip away all the abstractions.
Eventually you will get a memory leak even in a GCd language. Eventually there will be some incredibly obscure, unreported bug in a library. Eventually you will find an issue in unmaintained code you depend on. Eventually there will be performance problems caused by too many layers of abstraction.
You either need to know, roughly, how your dependencies work by occasionally digging into their code/reading the documentation. Or you need intuition to know how it probably works, but you usually build that intuition by actually writing/reading code.
I imagine it like this: juniors will be taught by LLMs on some things, but seniors will still be there; they will still assist, pair program, code review, etc., but they will have another party, the LLM, like a smarter calculator.
> Some of the smartest people I know share a bone-deep belief that AI is a fad — the next iteration of NFT mania.
> Meanwhile, software developers spot code fragments seemingly lifted from public repositories on Github and lose their shit. What about the licensing? If you’re a lawyer, I defer. But if you’re a software developer playing this card? Cut me a little slack as I ask you to shove this concern up your ass. No profession has demonstrated more contempt for intellectual property.
> The median dev thinks Star Wars and Daft Punk are a public commons. The great cultural project of developers has been opposing any protection that might inconvenience a monetizable media-sharing site. When they fail at policy, they route around it with coercion. They stand up global-scale piracy networks and sneer at anybody who so much as tries to preserve a new-release window for a TV show.
So it starts with a humble strawman, and then the author illustrates how dumb they think their fellow developers really are if they make different decisions based on different values.
Most developers I interact with who are what you might call 'AI skeptic', are most concerned with how this tech will be used, especially since its creation is rooted in mass scale disregard for ownership of anything.
> "The great cultural project of developers has been opposing any protection that might inconvenience a monetizable media-sharing site.... They stand up global-scale piracy networks",
IMO the author is projecting or something here, because I've literally never met someone with this opinion, and I've been in tech/developer circles for over 20 years. Personal use of pirated content is very different from commercial distribution or making money off sharing sites. Not everyone's values are so rooted in making money with complete disregard for the impact of their actions in doing so.
I get it: the author wants to keep using LLMs and wants people to stop trying to make them feel bad, so they try to make a case for how the skeptics' arguments are just dumb. But the author completely missed the 'why' behind the arguments. In the end, LLMs are a tool. Using them or not is up to the individual. But that doesn't give a blanket social license to use them in any way people, or more importantly, companies want.
>People complain about LLM-generated code being “probabilistic”. No it isn’t. It’s code. It’s not Yacc output. It’s knowable
Okay, wtf is this weird jab at yacc supposed to be? I generally prefer to write my own LR parsers because it's really not that difficult, but I never felt like yacc code was difficult to read or somehow "probabilistic".
AFAIK most serious compilers don't use yacc either because they need to be able to parse context-dependent grammars, and also reinventing wheels is often worth the effort when the wheel is 50 years old and forces you to make compromises because it also has to support use-cases you are not interested in. But again, it's not bad, it does the thing it is designed to do and it does save a lot of effort if you're fine dealing with its limitations.
Only serious complaint I have about Yacc is that the basic posix form has frustrating limitations such as not being able to have two separate yacc instances because they'll declare the same symbols, and I'm fairly certain it's not thread-safe either. Bison fixes some of these while still remaining "probabilistic".
Anyways I'd rather use Yacc+Lex (preferably Bison+flex) to generate my parser than have to go read everything your "agent" wrote with a fine-toothed comb to make sure it didn't introduce any crippling memory corruptions or dangerous ROPportunities.
I am in the middle, neither a dogmatic skeptic nor a full-blown prompt engineer, but I lost it when the author compared a junior developer (a human) to a SaaS subscription.
Tells you what you need to know about the AI culture.
If they're regurgitating what's been learned, is there a risk of copyright/IP issues from whoever owned the code used for training? Last time I checked, there's a whole lotta lawyers in the US who'd like the business.
Most of the big AI vendors have liability policies in place where their lawyers are committed to step in and help you out if someone sues you for copyright infringement based on code their LLMs wrote for you. I think the term they use is "copyright shield".
Which AI agent is actually good nowadays? I've tried Cursor, but I didn't want a VS Code fork just to be able to call an LLM. And I'd preferably want pay-as-you-go, not a subscription model.
That was a stellar read. I've had (at least parts of) many of these thoughts floating around my head over the past few weeks / months, but it'd have been beyond my ken to write them down as lucidly.
Seems to me that AI itself will take some jobs but the cheap workable software the AI creates will take even more jobs, especially if the AI can create workable software for automation robots.
The LLM coding agents might help you fire some SWEs to save a penny or two, but soon enough, they will also help your customers or suppliers to replace your business model.
Whilst generative AI is great for writing business letters and essays and all that, going straight for coding - one of the hardest problems, if not the hardest - is a bit much.
This is a come back in 5 years when it works sort of thing.
This isn't about the post itself, but can you please put the author's name at the top? It's written in first person and we have no idea who's talking until the end of the post.
To anyone else who was going to protest that the name _is_ at the top, apparently it depends on one's browser window width and zoom level. I was a bit surprised to find that increasing the zoom level even affects the order when reading with a screen reader.
Sometimes the AI model generates technical debt. It happens, for instance, when the AI generates code with dependencies that are too old to be automatically updated to the latest version.
Ironically, the "nuts" friends all seem to be talking less, and are less defensive, than the other friends who keep talking about how the people not buying what they believe are nuts.
"You don’t believe in IPR. Then shut the fuck up about IPR."
As a software engineer, I need to call out this rhetoric. We are not all morally bankrupt. Many of the people with the most ethical backbone I know are also software engineers.
I don't use "AI" for the same reason I don't use amphetamine. I'm sure I could get more stuff done faster if I used it, but I don't, because the manufacturing and the distribution of it is completely unethical.
The blatant stealing of intellectual property is only the tip of the shitberg. Just as bad are the giant ecological footprint[1], the colonialism of grossly underpaid RLHF labor, and the conceding of even more of our digital lives to trillion-dollar corporations.
We can and must ethically evaluate our work, and that includes our tools. Nobody's perfect, but doing one immoral thing does not mean you get a free pass to skip morality altogether.
Q: but the code that AI generates is incoherent gibberish
Guess what, life’s tough. Fixing incoherent gibberish is part of the job you wuss. Honestly, I bet you write incoherent gibberish. Would you rather be fixing incoherent gibberish that an AI wrote or incoherent gibberish that you wrote?
Q: but I have to re-do everything it touches. Everything takes me ten times as long!
What are you, some kind of vibe-coding YouTuber? Everything taking ten times as long is part of the job, cupcakes. What’s important is that now it takes ten times as long automatically. I like the way Zed does it, where it tells you to leave while it’s changing stuff.
I can’t wait for the day when people no longer manually write text messages to each other, but instead just ask LLMs to read and respond from a few prompted words.
This is one of those fascinating cases where I agree with none of the arguments, but vehemently agree with the conclusion… It ordinarily would give me pause, but in this case I am reminded that nonsense arguments are equally applicable to both sides of the debate. If the arguments actually had a logical connection to the conclusion, and I disliked the arguments but liked the conclusion, that would be real cause for introspection.
I develop space-borne systems, so I can't use the best LLM's for ITAR/etc reasons, but this article really makes me feel like I'm missing out. This line in particular makes me wonder if my skills are becoming obsolete for general private industry:
> People coding with LLMs today use agents. Agents get to poke around your codebase on their own. They author files directly. They run tools. They compile code, run tests, and iterate on the results. They also:
Every once in a while I see someone on X posting how they have 10 agents running at once building their code base, and I wonder if in 3 years most private industry coders will just be attending meetings to discuss what their agents have been working on, while people working on DoD contracts will be typing things into vim like a fool
> while people working on DoD contracts will be typing things into vim like a fool
Forget LLMs, try getting Pandas approved. Heck I was told by some AF engineers they were banned from opening Chrome Dev Tools by their security office.
FWIW I think the LLM situation is changing quite fast and they're appearing in some of our contracts. Azure-provided ones, of course.
Frankly, speaking as someone who is engaged in fields where LLMs can be used heavily: I would stay in any high-danger/high-precision/high-regulation role.
The speed at which LLM stuff is progressing is insane, what is cutting edge today wasn't available 6 months ago.
Keep up as a side hobby if you wish, I would definitely recommend that, but I just have to imagine that in 2 years a turnkey github project will get you pretty much all the way there.
Idk, that's my feeling fwiw.
I love LLMs but I'm much less confident that people and regulation will keep up with this new world in a way that benefits the very people who created the content that LLMs are built on.
The best I can offer skeptics is that the more you work with the tools, the more productive you become. Because yes, the tools are imperfect.
If you've had a dog you know that "dog training" classes are actually owner training.
Same with AI tools. I see big gains for people who spend time to train themselves to work within the limitations. When the next generation of tools come out they can adapt quickly.
If this sounds tedious, that's because it is tedious. I spent many long weekends wrestling with tools silently wrecking my entire codebase, etc. And that's what I had to do to get the productivity improvements I have now.
Call me old, but I am not using AI for coding. It's not what I expect or want an AI to do for me. I would love for it to do the stuff I don't like to do. It's like the meme that was floating around: a woman saying she would have loved the AI to do the dishes so she can do art, not the other way around.
Yes, I am also a skeptic, but that is not weird, as I have seen too many movies about AI taking over the world. I hope I never have to say: "I told you so".
I really do see positive points to AI but just not in coding. I would not even trust them to write tests. Might make them write documentation at some point.
I would like to point out: LLMs look really intelligent until you converse with them in a less popular language; then they make weird mistakes and interpret your questions all wrong. I am Dutch and they really suck at that.
Soon all coding will look like L3 support - debugging something you've never seen before, and under pressure. AI is really taking away the fun parts from everything and leaving just the drudgery in place.
The key is to figure out how to move up the ladder of abstraction. You don't want to be a "coder" in a world where AI can code, but you do want to be a "person who makes software" in a world where making software just got easier.
This feels different; I asked DeepSeek R1 to give me autoregressive image-generation code in PyTorch and it did a marvelous job. Similar for making a PyTorch model for a talking, lip-synced face; those two would have taken me weeks to do, and the AI did them in a few minutes.
Autoregressive LLMs still have some major issues like over-dependency on the first few generated tokens and the problems with commutative reasoning due to one-sided masked attention but those issues are slowly getting fixed.
> I think this is going to get clearer over the next year. The cool kid haughtiness about “stochastic parrots” and “vibe coding” can’t survive much more contact with reality. I’m snarking about these people, but I meant what I said: they’re smarter than me. And when they get over this affectation, they’re going to make coding agents profoundly more effective than they are today.
Yes, yes, and yes. SOOOO much yes. We are going to be in for a wild ride once everyone gets on board with the reality of coding agents.
> Does an intern cost $20/month? Because that’s what Cursor.ai costs.
The cost is a pretty obvious indication that it's nowhere near the capability of an engineer. A robotic worker that can produce code 24/7 without tiring would be worth even more than a person of equivalent skill. I don't know what kinds of goofy hiring y'all are doing, but all of the juniors I've been involved in hiring actually worked out great.
> Meanwhile, software developers spot code fragments seemingly lifted from public repositories on Github and lose their shit. What about the licensing? If you’re a lawyer, I defer. But if you’re a software developer playing this card? Cut me a little slack as I ask you to shove this concern up your ass. No profession has demonstrated more contempt for intellectual property.
I was willing to read until I got to this part. He can shove this drivel up his own ass for all I care if that's how he thinks. What a waste of time.
> but the code is shitty, like that of a junior developer
> Does an intern cost $20/month?
> Because that’s what Cursor.ai costs.
So sick of hearing comparisons between LLMs and juniors. First of all, are we really seniors? After almost 20 years, sometimes I still think I know nothing. Second, when I was a junior, just me being younger, fresher, and having an eager-learner attitude brought a net positive impact to the team I joined back then - so much so that I still hang out with those people to this day, and we still have fun joining new projects together. I'm so sick of these snobbish "seniors" who keep throwing rocks at younger and well-prepared people.
People like to think that intelligence is some intrinsic force, but it's far better thought of as situational. We are contextually smarter or dumber based on the emotions we're trying to process from a situation and how our motivated reasoning allows us to avoid certain negative emotions.
The subset of people the OP is trying to argue against are people who are otherwise normally intelligent but have an emotional need for AI to be unimportant that they're unable to process so they build a social community of terrible arguments.
The nature of the internet is that visibility is terribly correlated (and often anti-correlated) with popularity, so the OP thinks this is some large contingent of people when it's instead algorithmically served conflict generators.
Writing against that is also an algorithmic conflict generator, as evidenced by the number of upvotes here. The correct response is to realize that people putting up terrible arguments about AI don't actually care about AI, they're using this issue to process unresolved trauma from other, unrelated areas of their life and taking their ideas literally instead of seriously is a waste and diversion. There are plenty of smart people with far more nuanced views that still disagree with you where they're interested in the content of the argument.
Spot on. Current-gen LLMs also have the ability to mimic your code style. I always point them to existing code, and the result is the same as what I would have written.
Yesterday I “wrote” 18k lines of code with Claude & Cursor. Mostly UI, some backend. This would have taken me a week without LLMs.
I don't get the focus on the coding. In all the projects I worked on for 25 years, coding has never been the bottleneck in projects. Maybe the projects I work on are simple, maybe I was lucky, but still: once the requirements are clear, coding is easy. And it's literally what takes the least time. I don't want to spend less time on coding thanks to AI. Because coding is what I like, not the rest of the job. Bring me AI that will do the meetings for me please.
>Professional software developers are in the business of solving practical problems for people with code. We are not, in our day jobs, artisans. Steve Jobs was wrong: we do not need to carve the unseen feet in the sculpture. Nobody cares if the logic board traces are pleasingly routed. If anything we build endures, it won’t be because the codebase was beautiful.
Maybe you are, but I didn't get a Master's degree to produce quick shit for shareholders
Basically nails every point I've wanted to make but couldn't, lol. Also, in the end the endless posturing gets tiring and we go back to getting shit done. Either it helps you or it doesn't.
"they’re smarter than me" feels like false humility and an attempt to make the medicine go down better.
1. Thomas is obviously very smart.
2. To be what we think of as "smart" is to be in touch with reality, which includes testing AI systems for yourself and recognizing their incredible power.
I feel like, surprisingly, front-end work - which used to be viewed by programmers as the "easier" of the two - is now the more difficult one, because it's where LLMs suck the most.
You get a link to a Figma design and you have to use your eyes and common sense to cobble together Tailwind classes, ensure responsiveness and accessibility, try out your components to make sure they're not janky, test on a physical mobile device, align margins, padding, truncation, wrapping, async loading states, blah blah, you get it.
LLMs still suck at all that stuff that requires a lot of visual feedback; after all, you're making an interface for humans to use, and you're a human.
In contrast, when I'm working on a backend ticket, AI feels so much more straightforward and useful.
My take is: it's OK not to buy into the hype. There's a lot of hype, no denying that.
But if you're actively avoiding everything related to it, you might find yourself in a position where you're suddenly left in the dust. Maybe not now, not next month, not next year, but at some point in the future. The models really are improving fast!
I've talked with devs that (claim they) haven't touched a model since ChatGPT was released - because it didn't live up to their expectations, and they just concluded it was a big nothingburger.
Even though I don't follow the development religiously anymore, I do try to get acquainted with new releases every 3 months or so.
I hate the term "vibe coding", but I personally know non-tech people that have vibe coded products / apps, shipped them, and make more money in sales than what most "legit" coders are making. These would be the same "idea people" that previously were looking for a coder to do all the heavy lifting. Something is changing, that's for sure.
The counter-argument as I see it is that going from “not using LLM tooling” to “just as competent with LLM tooling” is…maybe a day? And lessening as the tools evolve.
It’s not like “becoming skilled and knowledgeable in a language” which took time. Even if you’re theoretically being left behind, you can be back at the front of the pack again in a day or so. So why bother investing more than a little bit every few months?
> The counter-argument as I see it is that going from “not using LLM tooling” to “just as competent with LLM tooling” is…maybe a day? And lessening as the tools evolve.
Very much disagree with that. Getting productive and competent with LLM tooling takes months. I've been deeply invested in this world for a couple of years now and I still feel like I'm only scraping the surface of what's possible with these tools.
I think the more "general" (and competent) AI gets, the less being an early adopter _should_ matter. In fact, early adopters would in theory have to suffer through more hallucinations and poor output than late adopters.
Here, the early bird gets the worm with 9 fingered hands, the late bird just gets the worm.
It takes deliberate practice to learn how to work with a new tool.
I believe that AI+Coding is no different from this perspective. It usually takes senior engineers a few weeks just to start building an intuition of what is possible and what should be avoided. A few weeks more to adjust the mindset and properly integrate suitable tools into the workflow.
Another one, here we go :) I think I'm going to keep it short because I am not here to convince people they're wrong, just to voice my dissent from uh almost every viewpoint this article comes from.
> Professional software developers are in the business of solving practical problems for people with code. We are not, in our day jobs, artisans.
Well, that's just wrong - we have failed to capture tremendous value through knowledge of our craft, and reducing it down to min-maxing effort/profit is not a pursuit that I can get behind. Nobody is talking about building ivory towers, but we can do a lot better than most of the common-denominator technologies in use today. We're failing our users most of the time (and of course people like the author say it doesn't matter because it doesn't show up in the quarterly results).
> Nobody cares if the logic board traces are pleasingly routed.
Both off and on topic: I got access to Claude’s voice mode today, and for my second test I asked it for its thoughts about both the linked blog post and the HN discussion here. The full transcript is at [1]. Some parts that stood out (slightly edited):
Me: So what is your assessment of the discussion at Hacker News?
Claude: The Hacker News discussion strikes me as much more grounded than the blog post itself. What I find compelling is that many commenters are sharing specific technical experiences rather than making broad ideological arguments. When someone describes spending hours fighting with Claude agents that keep refactoring code into spaghetti, that's concrete data about the current state of these tools. ...
Me: Well, what about you, Claude? You are made by a company whose president has recently made some public statements saying that, you know, 50% of white collar jobs will be replaced soon. And the company, Anthropic, also is very aggressively developing and selling tools to help with AI coding. What are your thoughts about that?
Claude: That's a really fair question to ask me directly. I think there's an obvious tension in my position here since I'm built by a company that has clear financial incentives around AI adoption, including coding tools. When I look at my own capabilities honestly, I can see both sides of this. I'm genuinely useful for certain coding tasks like explaining unfamiliar code, generating boilerplate, or helping debug specific problems. I've helped people learn new programming languages faster than they might have otherwise. But I'm also clearly not at the level where I could replace an experienced software engineer working on complex system design or novel problem solving. ...
Thank you for sharing. I'm looking forward to getting voice mode, but after doing some searching I saw that it still needed the user to hit the "send" button after they are done talking. Did I misread, or was your experience that it would reply to you after you finished talking without the need to press anything?
When you’re done talking, you tap anywhere on the screen to indicate that you’re finished talking, and Claude replies a second or two later. If you are silent for several seconds, it goes ahead and replies without waiting for you to tap. When Claude is talking you can tap on the screen to interrupt; that is actually quite useful, as it can be a bit long-winded.
It’s not quite as natural a back-and-forth as with ChatGPT’s or Gemini’s voice mode on mobile, but maybe that’s for the best.
I tried a test later in which I gave it a research paper—Sakana AI’s latest—and asked it to explain it to me. When it used a term I didn’t understand, I just tapped on the screen and asked for an explanation. That worked quite well.
> First, we need to get on the same page. If you're [6 months out of date with the technology you think of as overhyped after having observed it for the last 4 years or so], you're [not fairly evaluating it].
Perhaps, but this is an extremely unconvincing approach to the argument right out of the gate - it's basically on the same level as "communism has never been tried".
Especially if you're going to start out by describing your interlocutors as "crazy", rather than positing that an effectively completely new thing now exists that ought to be evaluated differently from Copilot et al.
Why would anyone rather read and fix someone else's code than write the code themselves? I do a lot of code review of other humans' code, and it uses so much more energy than writing my own (and surely, since I have competent colleagues, this is not even as bad as it would be if I expected the code I'm reading to be totally random shit).
> Why would anyone rather read and fix someone else's code than write the code themselves?
Because their own review standards are low (so they find reviewing "easy"), and/or because they can't appreciate the emotional & mental fulfillment that coding provides.
> Professional software developers are in the business of solving practical problems for people with code. We are not, in our day jobs, artisans. Steve Jobs was wrong: we do not need to carve the unseen feet in the sculpture. Nobody cares if the logic board traces are pleasingly routed. If anything we build endures, it won’t be because the codebase was beautiful.
I think it all comes down to this: do you take pride in what you do, or don't you?
If I build a wall with bricks, even if it will be covered with a coating, I will do my best to keep the joints and spacing regular.
I could do it faster, and no one would notice the difference but me… I hate that feeling when you've done something and you know it's barely enough, just barely, that it's kind of shit and you really don't want others to see it.
On the opposite side, some people will take pride in building a wall twice as fast as I do and won't care that it's horrendous.
Both cases are valid, but me, I know I can't do work I'm not proud of.
Totally agree with you that pride is important, but there's definitely more. Being a good engineer is frequently a problem of design --- whether it's user experience or code abstractions. Design is partly art, and that makes us artisans at least some of the time. The code that I've written that has endured the longest --- decades --- has been code that was designed well.
I'd be interested in the open source repositories that the author of this article created a) before "AI" and b) after "AI". I mean significant work mostly done by him.
I wasn't familiar with these agents yet. Does anyone know what the best ones are? Is there a good comparison of the options and their strengths and weaknesses?
Under the heading: "but you have no idea what the [LLM's] code is."
> Are you a vibe coding Youtuber? Can you not read code? If so: astute point. Otherwise: what the fuck is wrong with you? You’ve always been responsible for what you merge to main. You were five years ago. And you are tomorrow, whether or not you use an LLM. If you build something with an LLM that people will depend on, read the code.
> People complain about LLM-generated code being “probabilistic”. No it isn’t. It’s code. [...] The LLM might be stochastic. But the LLM doesn’t matter. What matters is whether you can make sense of the result
> Reading other people’s code is part of the job. If you can’t metabolize the boring, repetitive code an LLM generates [...] how are you handling the chaos human developers turn out on a deadline?
> You’ll only notice this happening if you watch the chain of thought log your agent generates. Don’t.
"You're nuts!" says the guy with his head intentionally buried in the sand. Also way to tell me your business model is a joke without telling me your business model is a joke. Enjoy it while it lasts.
Man, the Red Bull is oozing off this post; talk about sipping rocket fuel.
I mean, a tool is a tool, nothing wrong with that - but most of the resistance stems from AI being shoved down our throats at warp speed. It's already everywhere and I can't opt out, and that stinks.
As for the skepticism about adoption and usefulness, it's mainly a question of whether or not it will continue improving - there's no way to know what lies ahead, but if it came to a grinding halt today, well, then the high-water mark just isn't all that impressive.
> Yeah, we get it. You don’t believe in IPR. Then shut the fuck up about IPR. Reap the whirlwind.
This is the point that matters, and I don't think everyone is on the same page that LLMs are essentially glorified data laundering.
The industry would get just as much "value" if we declared a jubilee, wiped out all licenses, and allowed unlimited plagiarism (looking at Zuckerberg and his 10 TB of pirated data). In fact, if AI owners published their training data sets with a capable search engine, I would bet money on it outperforming LLMs in most cases. Why waste all that manpower reinventing Netflix again? Just copy-paste the code and give everyone their time back, sheesh.
> Kids today don’t just use agents; they use asynchronous agents. They wake up, free-associate 13 different things for their LLMs to work on, make coffee, fill out a TPS report, drive to the Mars Cheese Castle, and then check their notifications. They’ve got 13 PRs to review. Three get tossed and re-prompted. Five of them get the same feedback a junior dev gets. And five get merged.
I'm in a role that is behind the times, using a bespoke in-house framework that is immune to the benefits of LLMs, so I don't get to see what you see - so as a skeptic, I'm not convinced this isn't just the illusion of speed. I have not seen convincing results; show me the amazing things being made by AI (AI tooling itself does not count) - but yes, maybe that's because it's all siloed into walled gardens.
> But something real is happening. My smartest friends are blowing it off. Maybe I persuade you. Probably I don’t. But we need to be done making space for bad arguments.
Yeah all the arguments have been made, good and bad, we're all waiting to see how it plays out. But I'd rather take the side of being a skeptic - if I'm right then I'm in the right place. If I'm wrong, that's cool too, I don't mind playing catch-up. But fully embracing the hype is, IMO, tantamount to putting all your eggs in one basket, seems like a needless risk but if that's worth it to you to get ahead then by all means, slurp up the hype.
I am just some shmoe, but I believe that devs fall into two major categories when it comes to LLMs: those with their own product ideas, and those without their own product ideas.
The former look upon Claude Code/Cursor/Windsurf much more favorably, as they are able to ship their ideas much faster.
This is a bit of a hot take, so I would love any replies to bring me back down to earth.
This is a great article and I think I agree with most of it, not being an agent user myself.
One point the author might be missing is:
> If you were trying and failing to use an LLM for code 6 months ago †, you’re not doing what most serious LLM-assisted coders are doing
The tech giants and AI bros alike have been bombarding us with "but look what it can do NOW", "the new shiny model is revolutionary, now buy tokens", "look what's around the corner" and whatever else for over two years now. Can you really blame a typical "skeptic" for not being interested in putting in their time to verify these claims themselves after seeing mediocre results so many times? You start to develop a filter for these things.
I feel like we get one of these articles that addresses valid AI criticisms with poor arguments every week and at this point I’m ready to write a boilerplate response because I already know what they’re going to say.
Interns don’t cost 20 bucks a month but training users in the specifics of your org is important.
Knowing what is important or pointless comes with understanding the skill set.
I feel the opposite, and pretty much every metric we have shows basically linear improvement of these models over time.
The criticisms I hear are almost always gotchas, and when confronted with the benchmarks they either don’t actually know how they are built or don’t want to contribute to them. They just want to complain or seem like a contrarian from what I can tell.
Are LLMs perfect? Absolutely not. Do we have metrics to tell us how good they are? Yes.
I've found very few critics who actually understand ML on a deep level. For instance, Gary Marcus didn't know what a test/train split was. Unfortunately, rage bait like this makes money.
Models are absolutely not improving linearly. They improve logarithmically with size, and we've already just about hit the limits of compute without becoming totally unreasonable from a space/money/power/etc standpoint.
We can use little tricks here and there to try to make them better, but fundamentally they're about as good as they're ever going to get. And none of their shortcomings are growing pains - they're fundamental to the way an LLM operates.
"pretty much every metric we have shows basically linear improvement of these models over time."
They're also trained on random data scraped off the Internet, which might include benchmarks, code that looks like them, and AI articles with things like chain of thought. There's been some effort to filter obvious benchmarks, but is that enough? I can't know whether the AIs are getting smarter on their own or whether more cheat sheets are landing in the training data.
Just brainstorming, one thing I came up with is training them on datasets from before the benchmarks or much AI-generated material existed. Keep testing algorithmic improvements on that in addition to models trained on up to date data. That might be a more accurate assessment.
>I feel the opposite, and pretty much every metric we have shows basically linear improvement of these models over time.
Wait, what kind of metric are you talking about? When I did my master's in 2023, SOTA models were trying to push the boundaries by minuscule amounts, and sometimes blatantly changing the way they measured "success" to beat the previous SOTA.
Maybe make a video of how you're vibecoding a valuable project in an existing codebase, and how agents are saving you time by running your tools in a loop.
This roughly matches my experience too, but I don't think it applies to this one. It has a few novel things that were new ideas to me and I'm glad I read it.
> I’m ready to write a boilerplate response because I already know what they’re going to say
If you have one that addresses what this one talks about I'd be interested in reading it.
>This roughly matches my experience too, but I don't think it applies to this one.
I'm not so sure. The claim that any good programming language would inherently eliminate the concern about hallucinations seems pretty weak to me.
What valid AI criticisms? Most criticisms of AI are neither very deep nor founded in complexity-theoretic arguments, whereas Yann LeCun himself gave an excellent one-slide explanation of the limits of LLMs. Most AI criticisms are low-quality arguments.
Tbf the dismissal of the IP argument is a bit disappointing. Just because you don't respect IP doesn't mean we all do. Some of the code I'm proudest of I'm considering never open sourcing now because I simply don't trust online repositories anymore. Granted I'm in a bit of a different field than most developers but still, open source as a concept seems less attractive in the wake of tools that scrape it and copy it automatically without attribution. Count me out from ever sharing my code ever again.
I used to believe in the stereotypical copyleft anti IP ethos. In the modern day of corporate LLMs and disrespectful scraping, the lay of the ground is different and people understandably should be less inclined to open source their code.
This is also one of my predictions. LLM scrapers have made me add the "no derivatives" descriptor to the default CC license I publish all of my writing and side-projects under (for now, CC considers ND a valid legal defense against LLM scraping).
I still do stuff in the hopes of it benefitting anyone - but not parasitic oligarchs who are responsible for the imminent destruction of the world and society.
I don't know if a reasonable number of spare-time creators, authors, artists etc. feel similar about these things, but if they do, I'd expect a lot more content under non-free licenses in the future, that might've been published openly otherwise.
I think people fail to realize that to value software at $0 is to destroy the point of most of it.
If it becomes so easy to write apps, then everyone can build their own. They won't need your app.
The entire point of most SaaS business revolves around the simple concept/question: build vs. buy.
So if there's no point in buying since building is so easy, then AI isn't helping you like your software product/SaaS.
I think programmers may just become the "IT person" or "webmaster" or "AI guy" at a company that needs software. Not at a separate company building it for others.
This may greatly reduce the need for programmers. This will in turn reduce the need for fancy computers like that expensive MacBook Pro. It will reduce the need for web hosting. Don't get me wrong, these things will still exist...but the demand will decrease.
Essentially it will become the biggest bubble since the dot com bust. It'll be much larger.
The dust will eventually settle and we'll all adapt to a new world with AI...but I think the years and years of over specialization could come to a screeching halt.
The industry will likely implode and suck in all of the orbiting industries along with it. I hope I'm wrong, but I don't think people quite understand here. I greatly suspect there will be far fewer sales of high end computers. Compute power will continue to move to data centers.
Think about it this way. Code aside, because it's not just a coding thing: if everyone can create amazing videos and full-on films with AI, then they don't need all those high-end machines. They don't need tons of hard drive space for all the raw footage and different cuts it takes to make a video. You prompt and you get the exact result you need. Assuming AI gets that good.
So the more we push into AI, the greater the "gravity" here, so to speak, and the stronger the implosion. I just see an entire industry collapsing in on itself, personally. If I'm believing all the FOMO logic out there.
If "we'll see our first one-person billion-dollar company" - whoever said that is a moron. You understand what that means, right? If anyone can do all that, then how could such a company even exist? Are we all going to become billionaires now? No; to say something that stupid is to say software is trivial and worthless. So how could anyone start a software business worth so much?
The issue is maintenance and focus. Sure, smash out an AI accounting app, payroll app, CRM, etc but now you're not in the business of selling hamburgers or whatever, you've become a conglomerate.
> Are you a vibe coding Youtuber? Can you not read code? If so: astute point. Otherwise: what the fuck is wrong with you? You’ve always been responsible for what you merge to main. You were five years ago. And you are tomorrow, whether or not you use an LLM.
This identifies a critical lever: it doesn't matter that AI sometimes makes mistakes, hallucinates, makes poor architecture decisions. All that matters is your PR to master, because that's what you always have been and will continue to be judged on. Merging trash? That's on you! Failing to utilise new tools and taking 10x longer to make the feature? Also on you!
Ah yes, because it was garbage six months ago, all of that was thrown away and now they've latched on to the "agentic" thing. At best this is pre-alpha-quality software not fit for production use. After all, it has been less than six months. Next it's going to be the post-agentic world, and then all the companies are going to go bankrupt once the bills actually come due. Any company that pivoted to video, err, AI is also going to go bankrupt, while everyone they laid off builds the next big thing and leaves the heads of the AI charlatans on pikes outside of town to warn off any AI stragglers.
> Some of the smartest people I know share a bone-deep belief that AI is a fad — the next iteration of NFT mania
It's not that it's a fad. It's that the hype has gotten way ahead of the capability. CEOs laying off double digit percentages of their workforce because they believe that in 6 months AI will actually be able to do all those jobs and they want to get the message out to Wall St to juice the stock price today.
Both things can be true, and in my view, they are. I think there is a lot of "there" there with these tools, and increasingly so, and also that lots of people are out over their skis with the hype.
The key is to learn the useful tools and techniques while remaining realistic and open-eyed about their limitations.
That's why I said AI is not a fad. It's everywhere and is providing real utility. It's just not going to let CEOs replace the majority of their workforce like they think it will.
Things are way beyond hype at this point, and calling the skeptics who raised the alarms along the way nuts is itself nuts. I'll also leave this here: https://www.youtube.com/watch?v=6ovuMoW2EGk
People do not believe AI is a fad because of NFT mania. We believe LLMs are a fad because they're over-hyped and they under-deliver. Just my $0.02. Note that I didn't say "AI is bullshit", but I will say some of the AI hype is bullshit. And one thing I've noticed is that there are so many caveats involved in making accurate statements about LLMs that it's hard to make short statements that aren't wrong (unless you include all those exceptions and caveats).
> Reading other people’s code is part of the job. If you can’t metabolize the boring, repetitive code an LLM generates: skills issue!
Do you /intend/ to spend /most/ of your work life reading AI puke? If so: sanity issue! You are advocating for replacing creative action that gives you Flow with reaction that gives you hemorrhoids, as the /bulk of the work/. Fuck right off.
> How are you handling the chaos human developers turn out on a deadline?
By correcting the deadline setting methodology for the next round of development.
> Does an intern cost $20/month? Because that’s what Cursor.ai costs
Then rely on Cursor.ai to raise your next-gen seniors, the ones who are supposed to review AI puke in the future.
> but the craft [...] Do it on your own time
Go choke on a bag of dicks. You are the problem. This is why quality no longer exists in the world, only lowest common denominator consumerist shit. This is why cars, home appliances, and tools have been getting both worse and more user-hostile. Because nobody values the quality of their own creative output anymore. Quality should not be an exception (which makes it very expensive), but the standard. Quality should be a near-commodity, and therefore at least occasionally affordable for anyone.
> I buy a fucking table
A ridiculously low-quality (but hey, cheap!) table that will fail you at the most inopportune moment. IKEA does have its place, but it's probably not in your office where you receive clients. Do you want them to see the tabletop fly off?
> They almost certainly have a bigger bag of algorithmic tricks than you do: radix tries, topological sorts, graph reductions, and LDPC codes
and you'll have to review that bag of algorithmic tricks embedded in AI puke, with potential hallucinations introducing obscure, critical mathematical bugs. Good luck verifying all that as a senior, based on proofs and first principles. A human coder will either use a well-reviewed library, or cite a textbook, and implement the textbook very closely, so you can follow it, and have some trust in the intent at least.
> No profession has demonstrated more contempt for intellectual property
How disingenuous. You are showing contempt for copyright, and you are totally missing the point of copyleft. Copyleft doesn't advocate for anarchy; it advocates for user freedom via the software copyright framework. When AI barfs GPL'd code snippets into a proprietary code base, that harms user freedom.
> and watched it in seconds spot LVM metadata corruption issues
You are drifting; this is not code generation. Fuzzy pattern matching has always been fair game.
Another day, another post about AI by people whose jobs will be made redundant.
Something fascinating about this HN crowd in particular is its lack of foresight. Companies producing medicine have to take long-term effects into account, yet people working in these “tech” companies push out AI after AI without a single care about how it's going to impact people.
The more time I spend using AI-assisted programming tools the less worried I am that they're going to take my job.
Not because they aren't effective - they are - but because using them helps me appreciate how important my own skills and experience are for using them to effectively solve problems.
> We are not, in our day jobs, artisans. Steve Jobs was wrong: we do not need to carve the unseen feet in the sculpture. Nobody cares if the logic board traces are pleasingly routed. If anything we build endures, it won’t be because the codebase was beautiful.
I'm not a skeptic because they can't produce code or whatever. Sure they are getting better at it.
I'm a skeptic because the megacorps have decided AI slop is the future, and it's going to push us deeper into a hypercapitalist dystopia.
Programming as a career is going to end up as being an AI slop janny. You aren't going to be paid more if your job is easier or whatever; you're going to get paid less (and then end up doing more atomised work) because "an AI can do it". That's the direction everything seems to be heading.
I find it curious that nobody in this whole thread once mentions AI communism. I think that after much pain and sorrow, and many deaths, this is the most likely outcome. Probably in a dystopian fashion.
> Meanwhile, software developers spot code fragments seemingly lifted from public repositories on Github and lose their shit. What about the licensing? If you’re a lawyer, I defer. But if you’re a software developer playing this card? Cut me a little slack as I ask you to shove this concern up your ass. No profession has demonstrated more contempt for intellectual property.
> The median dev thinks Star Wars and Daft Punk are a public commons. The great cultural project of developers has been opposing any protection that might inconvenience a monetizable media-sharing site. When they fail at policy, they route around it with coercion. They stand up global-scale piracy networks and sneer at anybody who so much as tries to preserve a new-release window for a TV show.
This is pretty disingenuous and mean-spirited. He's using SV startup types to smear all developers, including the typical, humble, un- or undercompensated FOSS dev who absolutely respects intellectual property, and quite reasonably, expects their own IP to be respected as well. But because Sam Altman types follow the "It's Better to Beg for Forgiveness than to Ask for Permission" rule, altruistic, non-sociopathic developers have no right to complain?
> non-sociopathic developers have no right to complain
The very same developers that advocate pirating from Netflix or Disney?
He's pointing out the hypocrisy in a class of people (generally, tech-savvy programmers) that have no problem advocating for piracy and gleefully thwarting the concrete IP rights of businesses on one hand, while defending and bemoaning the abstract IP rights of other content creators on the other hand.
My take away is that he's saying IP rights is a weak argument to use to defend a personal position for not adopting LLM's in your workflow, especially given the context of modern piracy.
> The very same developers that advocate pirating from Netflix or Disney?
Even the ones advocating or engaging in personal piracy aren't creating unauthorized derivative works and monetizing them, which is arguably what these AI systems and their customers are doing. And Thomas wasn't talking about personal piracy; he was trying to use the IP-hostile conduct of Google and certain startups (like Scribd or GrooveShark) to smear all developers and de-legitimize their IP-based objections, even those of developers releasing code under copyleft licenses (like the GPL) in exchange for expected adherence to certain terms and conditions that these systems and their users ignore.
This is dead right; my AI-skeptic friends are ideologically opposed to LLMs and GenAI.
Their arguments have nothing to do with how good or bad LLMs are:
- IP issues (for code in a jupyter notebook which will never face IP scrutiny)
- because it exploits slaves and poor people (but so do the computers they use and the clothes they wear).
I also have a few programmer friends who make more legitimate complaints about code quality but because they are friends there is also an obvious subtext of ideological opposition.
My opinion is you should use LLMs anyway; you can hate "capitalism" and still take advantage of the system.
>All progress on LLMs could halt today, and LLMs would remain the 2nd most important thing to happen over the course of my career.
During my time as a programmer (gaming + astrophysics), the following things have happened:
1. The rise of GPGPU programming, which has enabled 1000x increases in performance
2. A revolution in CPU architecture, from single core, to massively multicore
3. C++98 - C++26
4. Transformational increases in CPU memory
5. (In the graphics space): dedicated raytracing hardware, the fully general purpose GPU + OpenCL/CUDA, deferred rendering, FSR/DLSS
6. Mobile phones were invented
7. 64-bit
8. The internet went from being for MSN and facebook to what it is today
9. The rise of Valve/Steam and the incredible transformational existence of the indie games market, which enables individuals to self publish their code and products for virtually nothing
10. Clang was born, and GCC came back to life. MSVC became standards compliant, finally
11. Rust was invented, and is taking off. People actually take security seriously now (!)
All of these have had vastly more impact on my programming than LLMs ever have. I've always wondered a bit who the people are who find them transformational, because I can vaguely gesture at any part of my programming tooling and find that it's vastly different from what it was 10 years ago.
>interact with Git, run existing tooling, like linters, formatters, and model checkers, and make essentially arbitrary tool calls (that you set up) through MCP.
I mean... you can just do all these things anyway. It's literally more work for me to use an LLM to run a linter than to run the linter (it's built into my IDE). So is Git. And formatters as well. You can also make arbitrary tool calls through your IDE/scripts/pre-commit/post-commit/etc. I have no idea why you'd even want an LLM to do this!
>People complain about LLM-generated code being “probabilistic”. No it isn’t
>The LLM might be stochastic
????????? That's literally the entire point. I want deterministic answers, like clang-based autocomplete, instead of the nightmare that is prob- I mean stochastic autocomplete.
>If hallucination matters to you, your programming language has let you down.
They matter because they waste your time with bullshit that you then have to fix. No programming language can truly enforce correct logic constraints, which are one of the primary difficulties of writing code. I literally have no idea what the OP is even doing; actually writing code has always taken a vanishingly small amount of the time - the main bottleneck is the train of thought to make sure that everything's correct. The literal typing is an afterthought. No programming language can bypass that step; they can (at best) hand-hold you through certain kinds of problems.
>Does an intern cost $20/month? Because that’s what Cursor.ai costs.
For the moment. They'll jack up the prices while enshittifying, and then good luck. I do not want to subscribe to a service to be able to code. I use free IDEs and free tools for exactly this reason. If you rely on a paid-for 3rd party you are doing it wrong and will regret it. This is one of the silliest things in the whole article
>But people select languages in part based on how well LLMs work with them, so Rust people should get on that
They really don't. People select languages based on:
1. What they're taught at school/university
2. What language meets their domain requirements
3. What language their job requires
>Meanwhile, software developers spot code fragments seemingly lifted from public repositories on Github and lose their shit. What about the licensing? If you’re a lawyer, I defer. But if you’re a software developer playing this card? Cut me a little slack as I ask you to shove this concern up your ass. No profession has demonstrated more contempt for intellectual property.
Are you joking? Is this article a troll? Programmers give a huge crap about copyright law. GPL/MIT/etc. are hugely important, and people respect the ever-loving crap out of them in general - even most major companies won't just blatantly pirate GPL works.
There's a tonne more in this article, but it smells like someone who has literally no idea what they're talking about talking out of their arse, and it shows profoundly
I code to understand. That's it. That's all. My code is a reflection of a problem definition and a solution. The problem may be under- or poorly defined and the solution may be lousy but by God it's there as a clear reference to be further reasoned about.
Programmers who don't work to deepen their understanding? Nah, sorry, not programmers I want anywhere near the code I'm working on.
To the extent that an LLM can help deepen understanding, good! Use it! Use its code. But you use it according to your intentional, systematic understanding of what you're doing and why. And that understanding is not a function of code generation or, worse, code churn for the sake of feeling like you're going fast and paying less.
> LLMs can write a large fraction of all the tedious code you’ll ever need to write. And most code on most projects is tedious.
I've been writing code for over 30 years and I can't recall ever writing "tedious code". I don't know what this even means. Code, by definition, should not be tedious. At any layer of the stack, code should solve non-tedious problems. The whole point of code is to avoid tedium.
Most code, most projects? I haven't seen this and doubt it's even close to true - if it is God help us. If you write tedious code, find another place to work or another career.
> Sometimes, gnarly stuff needs doing. But you don’t wanna do it. So you refactor unit tests, soothing yourself with the lie that you’re doing real work.
Seriously who kills time by refactoring unit tests? Is this projection by the author?
> But an LLM can be told to go refactor all your unit tests.
Wtaf?? The unit tests are what stands between the blind ambitions of a statistical model and correct code. Handing over tests to an LLM? One staggers.
Tests represent part of that problem definition and solution. They're integral to the code. They foster and deepen understanding - not to be subject to the musings of statistical process, unless those musings serve as a dialectic.
> If you can’t metabolize the boring, repetitive code an LLM generates: skills issue! How are you handling the chaos human developers turn out on a deadline?
Answer: Slow the F down! Chaos and programming don't belong. Managers who drive their teams like this are bad managers.
> If hallucination matters to you, your programming language has let you down.
The author seems to think that programming is about producing code. Producing code that compiles is not a problem. Producing correct and coherent solutions to problems that need to be solved is a problem. Hallucinations aside, how will you know that generated code is something to keep or reject unless you have some understanding of what the code should do before you see it?
> Does an intern cost $20/month? Because that’s what Cursor.ai costs.
Squee! Code for all my friends!
> but it’s bad at rust
No, it's bad at understanding. And so are humans, initially. But it's our job to build and deepen understanding and we use code to facilitate that.
> Professional software developers are in the business of solving practical problems for people with code.
Correct.
> [LLMs] devour schlep, and clear a path to the important stuff, where your judgement and values really matter.
This raises the question: who defines what's important? It sounds like the author agrees that it's us, not the model. I agree this is a useful method. It's like any code review - it sharpens the reader no matter the code quality.
> As a mid-late career coder, I’ve come to appreciate mediocrity.
How cool and nonchalant, what a rebel, he says the opposite of everyone, what a baddie!
I think the author means "appreciate naivety", meaning, a first pass that's good enough is good enough - there's no need to go further. I'll concede this, but it's a concession and not something to boast about. Odds are decent you'll come back to this naive implementation and revise it. Yes, at a certain point there's diminishing returns but good programmers know when to invest in revision and when to move on.
> LLMs really might displace many software developers.
We've seen this cycle. Technology and globalization puts pressure on programmers. There are legions of programmers in parts of the world that will generate code for $20 an hour. You can pay $20 and hour - or $20 a month - and generate mountains of code that blows chunks - not because it's ugly but because it's incoherent or because it solves the wrong problem. We. have. seen. this. It's why Silicon Valley still pays massive salaries to good programmers.
> The cool kid haughtiness about “stochastic parrots” and “vibe coding” can’t survive much more contact with reality.
Time will tell. In the meantime I'll keep investing in understanding, not code, and see how things shake out.
> It’s projection. People say “LLMs can’t code” when what they really mean is “LLMs can’t write Rust”. Fair enough! But people select languages in part based on how well LLMs work with them, so Rust people should get on that.
How is it the responsibility of the Rust community that there weren't enough metric tons of free code for the machine to slurp up? And the phrasing makes it sound like it's the community's fault for not feeding OpenAI enough code to be stripped of its license and authorship and get blended into a fine latent soup. It's a lot like people coming to a one-man FOSS project with a laundry list of demands, expecting to be treated with the religious reverence of a major enterprise contract.
The whole tone, the pervasive "use it or you'll be left behind"—where users saying they don't want or need it only proves further evidence of its imminent apotheosis—superficially reminds me of previous FUDs.
And how is it not concerning that the thing described as intelligent needs billions of lines to generalize a language a human can learn from a single manual? Will it need hundreds of kLOC to internalize a new library, or even its new version, beyond in-context learning? The answer is yes; you are choosing to freeze the entire tech stack, when fixing its abstractions could actually save you from boilerplate, just so the machine can write it for you at $200 a month with a significant error rate.
Is the problem with LLMs and Rust really a problem with the training data, or a problem with it being harder to get something to compile in Rust? I presume it's easier to get something to compile in dynamically typed languages, slightly more difficult in statically typed languages, and then a step up with Rust, which also forces you to deal with the lifetimes of objects. But maybe with LLMs it's different, and they can handle static typing more easily than dynamic typing but really struggle with structuring a program to properly deal with ownership.
They mostly copy and mix patterns in the training data. Lots of repetition, with variations, helps their generalization. Languages like Python and Go have a ton of code like that on GitHub and elsewhere. I saw that using Python with GPT 3.5/4.
If it's a rarer language, the math doesn't do as good a job on piles of random code. There's just not enough for it to learn from. I can't speak for Rust since I don't know the numbers, but I imagine it's much less than Python or Go.
I have seen some evidence, though, that harder languages are harder for them to code in. GPT 3.5 used to struggle with C++ for things it could easily produce in Python. It could actually produce things in C more easily than in C++. It makes sense, though, because there's both more context needed for correctness and more behavioral patterns to write it.
My solution, which I only prototyped in GPT because I left AI, was to use AIs to write code in languages like Python, which non-AI tools then transpiled to high-performance code in C++ or Rust. Think the Python-to-C++ compilers, or maybe Nuitka. Later, with hallucinations mitigated enough, add LLMs to those transpilers.
As a side benefit, it lets you sell a product accelerating or increasing the predictability of applications in that language. That's a non-AI investment. There are a few companies doing that, too. So one could sell to the AI crowd, the "language X in business" crowd, or both.
Wow, I'm surprised this post hasn't gotten hit by the HN flame-war detector: We have threads downvoted into forced-collapsed that are headed by completely innocuous comments. It really is strange how people here are not able to accept different viewpoints in a discussion: I've three times now returned to the comments to just blanket-upvote everything that's starting to turn grey.
Thankfully the uncrazy person is going to get us on that sane VC AI wavelength.
> If you’re making requests on a ChatGPT page and then pasting the resulting (broken) code into your editor, you’re not doing what the AI boosters are doing. No wonder you’re talking past each other.
They’re playing 3D chess while you’re stuck at checkers.
I do things suboptimally while learning the ropes or just doing things casually. That doesn’t mean that I judge the approach itself by my sloppy workflow. I’m able to make inferences about what a serious/experienced person would do. And it wouldn’t involve pasting things through three windows like I would do.
So of course I don’t judge AI by “ask chatty and paste the response”.
Yes indeed: “deploying agents” is what I would imagine the Ask Chatty And Paste workflow taken to Perfection to look like.
> LLMs can write a large fraction of all the tedious code you’ll ever need to write. And most code on most projects is tedious. LLMs drastically reduce the number of things you’ll ever need to Google. They look things up themselves. Most importantly, they don’t get tired; they’re immune to inertia.
Most Rube Goldberg machines are very tedious and consist of fifty-too-many parts. But we can automate most of that for you—
I could not have ever imagined a more Flintstones meets Science Fiction clash than AI According To Software Engineers. You’re using AI to generate code. And no one cares how much. It’s just so tedious in any case.
A worthwhile approach would have been to aspire to make or generate technology artifacts that could be hidden behind a black-box surface with a legible interface in front. Is the code tedious? Then make the AI come up with something that is well designed, where the obvious things you want are given freely, where minor customizations are just minor tweaks, and larger deviations require only proportionally larger changes. Uh, how about no? How about generating a 20 KLOC "starter" some-framework project with all the 20 KLOC of "tedious" bits hanging out, and then we can iterate from there. The AI made a Git log and everything, so it's, ya know, audited.
But maybe I’m being unfair. Maybe we are moving towards something not quite as stupid as Deploy ChatGPT 50X? Or maybe it’s effectively going to behind a black box. Because ya know the AI will deal with it all by itself?
> Are you a vibe coding Youtuber? Can you not read code? If so: astute point. Otherwise: what the fuck is wrong with you?
> You’ve always been responsible for what you merge to main. You were five years ago. And you are tomorrow, whether or not you use an LLM.
No!, and what the fuck is wrong with you? We are Flintstone technologists and I’ll be damned if I can’t get my AI brain chip-injected, genetically enhanced for speed horsey cyborg for my modern horse-drawn carriage patent.
Well, I'm going to reference [1], a comment on the "Who Cares Era" that really stood out and maybe should get pinned somewhere.
What about that "superpower" we had to make something out of _nothing_ but our own brains? Sure, you may have looked up some code (done some research and learned), but that mental exercise, just knowing your code inside out, can never be compared to the "oh look at that, it actually works" of LLM stuff.
Well, guess what: now you've traded that superpower for a subscription! (Yes, you are now a middleman of software peddling - paying some company that stole other people's code and ideas and did not pay _them anything_ in return.) And eventually you will get to the point where, when the internet is down, you won't write any software that day because you can't get the N model to work.
Did we go to the next step in software evolution? If it has a cost per month (and you pay thieves - yes, that's what they are), it's going to be a hard "no" for me; you just became a cow. Sure, now this is what's expected of you? OK. This has some strong neoliberalism drift; I assume I could be wrong, so comments are welcome - I am quite interested in this part.
Yes, augmentations seem like amputations; is that McLuhan? I'm not sure.
Will we retain our power to produce software? No! Humans dumb down very quickly; a generation is all it took to push kids' reading below expected educational levels [2]. There will be fewer professionals and lots of peddlers with no skills, and software will be such junk that it will be seriously scary to catch a plane in 10 years' time.
Something that I feel is missed in every such post or debate is that, maybe, this isn’t a yes/no question.
Fast forward 10 years, the skeptics will be right on a few things, and the fanboys will be right on other things.
I for one know this isn’t just a “fad” like NFTs were, but I cannot help but feel skeptical about the sweeping claims of software engineering’s doomsday.
It requires the same amount of labour on my part, but it delivers the same outcome every time.
That is what tools do: they act as an extension and allow you to do things not easily done otherwise.
If the hammer sometimes hammers, sometimes squeaks, and sometimes screws, then it requires extra labour on my part just to make it do what purpose-specific tools do, and that is where frustrations arise.
This is the kind of non-serious argument he's talking about. There are plenty of tools that require supervision to get good results. That doesn't make them useless.
My 3D printer sometimes prints and sometimes makes spaghetti. Still useful.
They never said it was useless. You just invented that straw man in your head.
3D printing is largely used for prototyping, where its lossy output is fine. Using it for production use cases requires fine-tuning so that it can be 99.9% reliable. Unfortunately we can't do that for LLMs, hence why they're still only suitable for prototyping.
You don't use 3D printing to do large-scale production. If you agree that AI should only be used in prototype code and nothing else, then your argument makes sense.
You know what's nuts? How so many articles supporting LLMs and attacking skeptics are so full of fallacies and logical inconsistencies - straw men, false dichotomies, appeals to emotion and to authority - when their authors supposedly have almost-AGI machines to assist them in their writing. They could at least do a "please take a look at my article and see if I'm committing any logical fallacies" prompt iteration session if they trust these tools so much.
These kinds of articles that heavily support LLM usage in programming seem designed to FOMO you, or at least to suggest that "you are using it wrong" in a weak attempt to push contrary or conservative opinions out of the discussion. This is pure rhetoric with empty discourse.
I use these tools every day and every hour, in strange loops (between at least Cursor, ChatGPT and now Gemini), because I do see some value in them, even if only to simulate a peer or rubber duck to discuss ideas with. They are extremely useful to me due to my ADHD, because they actually support me through my executive dysfunction and analysis paralysis, even if they produce shitty code.
Yet I'm still an AI skeptic, because I've seen enough failure modes in my daily usage. I don't know how to feel when faced with these ideas, because I fall outside the false dichotomy (I pay for them and use them every day, but I don't think they're as valuable as the average AI bro does). What's funny is that I've yet to see an article that actually shows LLMs' strengths and weaknesses in a serious manner and with actual examples. If you are going to defend a position, do it seriously, ffs.
- Brand new greenfield frontend for an in-spec and documented openAPI API
- Small fixes to an existing frontend
It failed _dramatically_ in all cases. Maybe I'm using this thing wrong, but it is Devin-level fail. It gets diffs wrong. Passes phantom arguments to tools. Screws up basic features. Pulls in hundreds of lines of changes on unrelated files to refactor. Refactors again and again, over itself, partially, so that the uncompleted boneyard of an old refactor sits in the codebase like a skeleton (those tokens are also sent up to the model).
It genuinely makes an insane, horrible, spaghetti MESS of the codebase. Any codebase. I expected it to be good at Svelte and SolidJS, since those are popular JavaScript frameworks with lots of training data. Nope, it's bad. This was a few days ago, with Claude 4. Seriously, seriously, people, what am I missing here with this agents thing? They are such gluttonous eaters of tokens that I'm beginning to think these agent posts are paid advertising.
It’s entirely possible that the people talking up agents also produced spaghetti code but don’t care because they are so much more “productive”.
An interesting thing about many of these types of posts is they never actually detail the tools they use and how they use them to achieve their results. It shouldn’t even be that hard for them to do, they could just have their agent do it for them.
>> It’s entirely possible that the people talking up agents also produced spaghetti code but don’t care because they are so much more “productive”.
You may be right. The author of this one even says if you spend time prettying your code you should stop yak shaving. They apparently don't care about code quality.
You’re not providing a key piece of information needed to give you an answer: what prompts did you use? You can share your sessions via URL.
A prompt like “Write a $x program that does $y” is generally going to produce some pretty poor code. You generally want to include a lot of details and desires in your prompt. And include something like “Ask clarifying questions until you can provide a good solution”.
A lot of the people who complain about poor code generation use poor prompting.
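For illustration only (the framework, file paths, and helper names below are hypothetical, not taken from anyone's actual session), the difference looks roughly like this:

    Weak:   "Write a Svelte frontend for my API."

    Better: "Build a Svelte frontend for the OpenAPI spec in api/openapi.yaml.
             Only touch files under frontend/. Reuse the existing fetch wrapper
             in frontend/src/lib/http.ts. Do not refactor unrelated files.
             Ask clarifying questions before writing any code."

Same task, but the second version pins down scope, constraints, and an escape hatch for ambiguity.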
Prompt engineering isn't really that important anymore imo. If you're using a reasoning model, you can see if it understood your request by reading the reasoning trace.
There are many ways to do something wrong and few ways to do it right. It's on the AI advocates to show us session logs so we can all see how it's done right.
How are you writing your prompts? I usually break a feature down to the task level before I prompt an agent (Claude Code in my case) to do anything. Feature level is often too hard to prompt and specify in enough detail for it to get right.
So I'd say Claude 4 agents today are at the autonomy level of a smart but fresh intern. You still have to do the high-level planning and task breakdown, but they can execute on tasks (say, requiring 10-200 lines of code, excluding tests). Asking them to write much more code (200+ lines) often requires a lot of follow-ups and ends in disappointment.
This is the thing that gets me about LLM usage. They can be amazing revolutionary tech and yes they can also be nearly impossible to use right. The claim that they are going to replace this or that is hampered by the fact that there is very real skill required (at best) or just won't work most the time (at worst). Yes there are examples of amazing things, but the majority of things seem bad.
Coding agents should take you through a questionnaire before working. Break down what you are asking for into chunks, point me to key files that are important for this change, etc etc. I feel like a bit of extra prompting would help a lot of people get much better results rather than expecting people to know the arcane art of proompting just by looking at a chat input.
I don't think it's fair to call that the agent thing. I've had profoundly positive results with agentic workflows for classification, analysis, and various business automations, including direct product pricing. You have to build an environment for the agent to make decisions in, with good instructions for what you want it to do. Then you wire it up so that the decisions have effects in the real world. You can achieve really good results, and there is a lot of flexibility to tweak it and various tricks to optimize performance. Tools can allow agents to pull in relevant context as needed, or to execute complex multistep workflows. That is the agent thing.
Writing code is one thing that models can do when wired up properly, and you can get a powerful productivity boost, but wielding the tools well is a skill of its own, and results will vary by task, with each model having unique strengths. The most important skill is understanding the limitations.
Based on your task descriptions and the implied expectation, I'm unsurprised that you are frustrated with the results. For good results with anything requiring architecture decisions have a discussion with the model about architecture design, before diving in. Come up with a step by step plan and work through it together. Models are not like people, they know everything and nothing.
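To make the "wire tools up to an agent" idea above concrete, here is a minimal, hypothetical sketch in Python. The model call is stubbed out (a real system would call an actual LLM API), and the tool name, dispatch format, and pricing example are invented for illustration rather than taken from any particular framework:

    import json

    def call_model(messages):
        # Stub standing in for a real LLM API call. It requests one tool,
        # then returns a final answer, so the loop below runs end to end.
        if not any(m["role"] == "tool" for m in messages):
            return {"tool": "lookup_price", "args": {"sku": "ABC-123"}}
        return {"final": "Set the price to 19.99 based on cost and margin target."}

    def lookup_price(sku):
        # Stub tool; a real one would query a database or pricing API.
        return {"sku": sku, "cost": 12.50, "margin_target": 0.35}

    TOOLS = {"lookup_price": lookup_price}

    def run_agent(task, max_steps=5):
        messages = [{"role": "user", "content": task}]
        for _ in range(max_steps):
            reply = call_model(messages)
            if "final" in reply:
                return reply["final"]
            # The model asked for a tool: run it and feed the result back in.
            result = TOOLS[reply["tool"]](**reply["args"])
            messages.append({"role": "tool", "content": json.dumps(result)})
        return "Stopped: step limit reached."

    print(run_agent("Price SKU ABC-123 for the spring sale."))

The real work, as the comment says, is in the environment: which tools exist, what instructions and context the model gets, and what its decisions are allowed to affect.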
I'm 100% certain most if not all of them are; there is simply too much money flying around, and I've seen what marketing does for way less hyped products. Though in this specific case I think the writer may simply be shilling AI to create demand for their service: pay us monthly to one-click deploy your broken, incomplete AI slop. The app doesn't work? No problem, just keep prompting harder and paying us more to host/build/test/deploy it...
I've also tried the agent thing, and still am, with only moderate success: Cursor, claude-squad, goose, Dagger AI agents. In other words, all the new hotness, all with various features claiming to solve the fact that agents don't work. Guess what? They still don't.
But hey, this is HN; most of the posters are tech-fearing Luddites, right? All the contention on here must mean our grindset is wrong and we are not prompting hard enough.
There is even one shill, Ghuntly, who claims you need to be "redlining" AI at a cost of $500-$1000 per day to get the full benefits. LOL, if that is not a veiled advertisement I don't know what is.
This is my experience too most of the time. Though sometimes it does work, and sometimes a solution is found that I never thought of. But most of the time I have to change things around to my liking.
However, a counter-argument to all this:
Does it matter if the code is messy?
None of this matters to the users and people who only know how to vibe code.
Same here. I keep trying to figure out WTF agent that people are using to get these great results, because Copilot with Claude 4 and Gemini 2.5 has been a disastrous mess for me.
This has become an impoverished conversation. I have seen this pattern where LLM capabilities improve, and people who had previously dismissed the technology based on its then-current capabilities realize they were wrong in their pessimistic assessment of the tech's potential, switch over to the other side, and project their own previous bias onto those who continue to object.
The basic structure is this: six months ago, I tried LLMs and they were trash, but holy cow they have improved so much, now I can use them to avoid tedious work that I don't like! Don't be an idiot like my skeptical past self.
Then they accuse everyone who now disagrees with their take on the tech as being an incurious luddite who is blinded by their anti-progress bias.
Personally, as a non-user but close observer of the tech, I never doubted that the tech would improve but there are obvious problems with the tech beyond hallucinations that have to do with human meaning, understanding and power relations that cannot be solved by making the tech better.
My challenge to all of the booster is this: try to articulate your own personal vision of both ai utopia and ai dystopia. I personally find it borderline impossible to even imagine a utopia emerging from genai, but it is extremely easy for me to imagine dystopia, especially given the entities that are controlling the tech and competing to "win" the ai arms race.
For me, the representation of the Chinese state as filtered through western media is already a dystopia. Of course, having not been to China myself and being unable to speak any of their languages, I cannot personally verify the representation. But by competing with the Chinese on ai (and I mean Chinese in the way we define them in the west, which I recognize may be very different from both the actual lived experience in China and their self-conception), we become more like our own negative stereotypes of them. It is essentially a race to disempower ordinary people, remove agency from their lives, hand it over to agents who most certainly do not have the best interest of living humans in mind and call this victory. To "win" the ai war as presently defined would likely be disaster for us all.
There are these ridiculous handwaving things about solving climate change or even human mortality with this tech even though there is no evidence whatsoever that it will do this. Just because the people building it say it will do these things doesn't mean we should trust them.
Imagine if a primatologist tried to tell us that because they have trained chimpanzees to recognize colors and some words better than a three-year-old, we should now stop investing in education and direct all of our resources into training chimps to do all of our repetitive tasks, to liberate us from the drudgery of doing anything for ourselves. With enough resources, you would see an explosion in chimp capabilities, and it would come directly at the expense of humans, who now have no work to do and just sit on top of a pyramid built by chimp labor. Not only would the things we made be worse than what we could have if we focused on developing our own human capacities instead of chimp capacities, but we would live in fear that the chimps (who are also much stronger than us) will wake up to their plight and rise up against their rulers. Humans would also rapidly lose our own capabilities and become much more like chimps than the humans of today. Sound familiar?
I tend to believe that as it is currently being developed, this tech is far more likely to lead us in a direction like the chimp apocalypse than some post labor utopia.
That doesn't mean the tech isn't getting better or can't do impressive things. I can hold both things in my head at once. But I am much more concerned with human flourishing and well-being than with bored engineers who don't actually like programming (or at least don't like it under current industrial conditions) feeling liberated from the tedium of their work. And instead of solving the real underlying problems that make the work so tedious, we compound the problem by having AI generate even more of the exact kind of code that caused the problem in the first place.
I’m an AI skeptic. I’m probably wrong. This article makes me feel kinda wrong. But I desperately want to be right.
Why? Because if I’m not right then I am convinced that AI is going to be a force for evil. It will power scams on an unimaginable scale. It will destabilize labor at a speed that will make the Industrial Revolution seem like a gentle breeze. It will concentrate immense power and wealth in the hands of people who I don’t trust. And it will do all of this while consuming truly shocking amounts of energy.
Not only do I think these things will happen, I think the Altmans of the world would eagerly agree that they will happen. They just think it will be interesting / profitable for them. It won’t be for us.
And we, the engineers, are in a unique position. Unlike people in any other industry, we can affect the trajectory of AI. My skepticism (and unwillingness to aid in the advancement of AI) might slow things down a billionth of a percent. Maybe if there are more of me, things will slow down enough that we can find some sort of effective safeguards on this stuff before it’s out of hand.
I'm tired. I'm tired of developers/techies not realizing their active role in creating a net negative in the world, and acting like they are powerless and blameless for it. My past self is not innocent in this, but I'm actively trying to make progress by challenging people to think about it whenever I can.
Time and time again, the tech industry (and developers specifically) has taken an interesting technical challenge, quickly accepted some ethical or moral tradeoff to solve it, and ended up shaping the fabric of society for the worse.
Creating powerful search engines to feed information to all who want it, but violating your privacy in an irreversible way to feed the engine. Connecting the world with social media, while stealing your information and mass-exposing you to malicious manipulation. Are these hard problems to solve without the ethical tradeoff? Sure. But every other technical challenge was also hard and got solved; why can't we also focus on the social problems?
I'm tired of the word "progress" being used without a qualifier of what kind of progress and at the cost of what. Technical progress at the cost of societal regression is still seen as progress. And I'm just tired of it.
Every time "AI skeptics" are brought up as a topic, the focus is entirely on the technical challenges. They never mention the "skeptics" who are labeled that way even though they aren't skeptical of what AI is or could be capable of. I'm skeptical of whether the tradeoffs being made will benefit society overall, or just a few. Because at literally every previous turn for as long as I've been alive, the impact has been a net negative for the broader population, without developers questioning their role in it.
I don't have an answer for how to solve this. I don't have an answer for how to stop the incoming shift from destroying countless lives. But I'd like developers to start being honest about their active role in not just accepting this new status quo but proactively pushing us in a regressive direction, and about their power to push back on this coming wave.
As far as I can tell, the sharpest negative inflection came around the launch of the iPhone. Facebook was kind of fine when it was limited to universities and they weren't yet doing mobile apps, algorithmic feeds or extensive A/B testing.
It seems "optimizing engagement" was the grave initial sin...
Maybe some engineers should go back to their childhoods, watch some Outer Limits, and pay attention to the missed lessons.
Our lives are not our own. From womb to tomb, we are bound to others. Past and present. And by each crime and every kindness, we birth our future.
So the problem is society’s lack of any coherent ethical framework that says building powerful disruptive technology shall be done like this. If you’re tired, then go fix that problem. Find the answer. Because I’m *exhausted* hearing about how everybody is supposed to risk putting food on their table by telling the big boss they won’t make the feature because it’s unclear whether it might be a net negative for society under one contorted version of an angsty ethical framework a small minority of people have ad-hoc adopted on that orange message board… and that _scares_ them.
There is technology, there are related technical advancements, and then there is the business incentive to make money. A lot of progress has indeed been made in NLP and information retrieval, which is helpful in its own ways to speed things up; it can easily be seen as the next level of automation.
Everything else around it is a glamorous party, because everyone's money is riding on it and one needs to appreciate it or risk being deserted by the crowd.
The basis of science is questioning things until you are convinced. People who depend on models too much may end up losing the ability to triangulate information from multiple sources before being convinced of something.
Programming can get complicated above a certain threshold even for humans, so it will be interesting to see how the models handle that complexity. I am a skeptic, but then again I don't know the future either.
> They never mention the "skeptics" who are labeled that way even though they aren't skeptical of what AI is or could be capable of.
This is because most people on HN who say they are skeptical about AI mean skeptical of AI capabilities. This is usually paired with statements that AI is "hitting a wall." See e.g.
> I'm very skeptical. I see all the hype, listen to people say it's 2 more years until coding is fully automated but it's hard for me to believe seeing how the current models get stuck and have severe limitations despite a lot of impressive things it can do. [https://news.ycombinator.com/item?id=43634169]
(that was what I found with about 30 seconds of searching. I could probably find dozens of examples of this with more time)
I think software developers need to urgently think about the consequences of what you're saying, namely what happens if the capabilities that AI companies are saying are coming actually do materialize soon? What would that mean for society? Would that be good, would that be bad? Would that be catastrophic? How crazy do things get?
Or, to put it more bluntly: "if AI really goes crazy, what kind of future do you want to fight for?"
Pushing back on the wave because you take AI capabilities seriously is exactly what more developers should be doing. But dismissing AI as an AI skeptic who's skeptical of capabilities is a great way to cede the ground on actually shaping where things go for the better.
The ethical floor for the industry as a whole (there will always be niche exceptions) is typically the law. And sometimes not even that, when the law can't be enforced effectively or the incentives favor breaking it.
For AI companies, it's to get a model that does better on benchmarks and vibes, so that it can be SOTA and earn a higher valuation for stakeholders.
For coders, they just want the shit done. Everyone wants the easy way if the objective is to finish a project; for some the objective is learning, and they may not choose the easy way.
Why do they want the easy way? Speaking as someone whose cousins and brothers are in the CS field (I am still in high school), they say that if they get x money, then the company takes at least 10x the value in work from them (figuratively, of course). One must ask why they should be the ones morally bound in case AI goes bonkers.
Also, at best, not using AI would probably slow it down a little, but the AI world moves so fast that it's unpredictable; DeepSeek was unpredicted. I might argue that it's now a matter of us vs. China in this new AI arms race. Would that stop if you stopped using it? Many people already hate AI, but has that done much to stop it, if you can even call AI stoppable at the moment?
It's paradoxical. But to be frank, LLMs were created for exactly the things they're excelling at. It's a technological advancement and a moral degradation.
It's already affecting the supply chain, tbh. And to be frank, I am still using AI to build projects that I just want to experiment with, to see if they can really work without my getting the domain-specific knowledge. I also want to learn more and am curious, but I just don't have much time in high school.
I don't think people cared about privacy, and I don't think people will care about it now. It's the same as not using some big social media giant: you can't escape it. The tech giants made things easier but less private. People chose the easy path, and they will still choose the easy path, i.e. LLMs. So I guess the future is bleak, eh? Well, the present isn't that great either. Time to just enjoy life while the world burns in regret over its past actions, all for 1% shareholder profit. (For the shareholders it was all worth it, though, am I right?)
Unfortunately Capitalism unhindered by regulation is what we wanted, and Capitalism unhindered by regulation is what we have. We, in the western world, were in the privileged position of having a choice, and we chose individual profit over the communal good. I'm not entirely sure it could have been any other way outside of books given the fact we're essentially animals.
As I implied in an earlier comment, your conviction (if you're wrong about the inevitability of the direction) may be one of the things that leads it in that direction.
Here's my historical take: in the 1960s and 1970s, computation in general was viewed as a sinister, authoritarian thing. Many people assumed it was going to be that way, and a small minority recognised that it also had the potential to empower and grant autonomy to a wider class of people. These were the advocates of the personal computer revolution -- the idea of "computer lib", whereby the tools of control would be inverted and provided to the people at large.
You can argue about whether that strategy was a success or not, but the group that was largely irrelevant to that fight were the people who decided not to get involved, or who tried (although not very hard) to impede the development of computation in general.
To bend the trajectory of AI in general involves understanding and redeploying it, rather than rejecting it. It also involves engaging. If it's anything like the last few times, the group that is simultaneously exploring and attempting to provide agency and autonomy for the maximum number of people will be smaller than both those using new tech to exploit people or maintain an unequal status quo, and the people who have good intentions, but throw their hands up at the possibility of using their skills to seize control of the means that provide for a better future.
Structural incentives explain the computer trajectory. While computers were purely in the academic realm they were a force of empowerment, but this ended when economic incentives became the main driver. AI has speedrun the academic stage—if it ever existed—and is now speedrunning the enshittification stage.
But there is very little you or I can do about it except choosing not to partake.
"And we, the engineers, are in a unique position. Unlike people in any other industry, we can affect the trajectory of AI."
I firmly believe that too. That's why I've been investing a great deal of effort in helping people understand what this stuff can and can't do and how best to make use of it.
I don't think we can stop it, but I do think (hope) we can show people how to use it in a way where the good applications outweigh the bad.
> I don't think we can stop it, but I do think (hope) we can show people how to use it in a way where the good applications outweigh the bad.
That feels idealistic. About as realistic as telling people how to use semiconductors or petrochemicals for good instead of bad.
No-one knows where AI is going but one thing you can be sure of - the bad actors don't give two hoots what we think, and they will act in their own interests as always. And as we see from historical events, there are still many, many bad actors around. And when the bad actors do bad things with the technology, the good actors have no choice but to react.
Sorry to snipe but: You don't feel at least a little shared responsibility in evangelizing "vibe-coding"? Is that currently blazing hype a force for good? I think it wouldn't be all over social- and mainstream media at this point without your blog post(s).
One does not need to be a skeptic about machine learning and its potential as technology to refuse to engage with its practical applications when they are clearly based on suspect ethics (e.g., IP theft[0]).
The ends do not justify the means. It is a similar judgement as when refusing to buy products of forced labour or disproportionate environmental impact, or to invest in war and bloodshed. Everyone makes one for themselves.
Coincidentally (or not), if said suspect ethics were properly addressed, it would ameliorate some of the reservations even the actual skeptics have. Licensing training data would make everyone involved aware of what is happening, give them an ability to vote and freedom to choose, soften the transition as opposed to pulling ground from under people’s feet.
[0] Control over intellectual property has given us fantastic things (cf. Linux, Blender, etc.; you can’t have copyleft without an ability to defend it, and IP laws provide that ability). If yesterday we were sued for singing the happy birthday song in public, and today we see corporations with market caps the size of countries pretending that IP ownership is not much of a thing, the alarm bells should be deafening.
The stuff that Disney does to extend copyright is not the same as assuming Daft Punk is public domain.
And there’s a difference between what is human scale infringement and what’s going on now.
Nor does it mean that people don’t have the right to point out that it’s piracy.
If being more in line with the espoused values is the issue, then the answer is to make an effort to ensure that we stop consuming pirated content, or building tools that encourage piracy - and that turns out to be a relatively small group of people compared to everyone in tech.
And people have already stopped piracy - once alternatives showed up. There is the issue that you don’t own the stuff you stream, but that’s a separate topic.
I don't fear people using AI for evil. The destruction comes from something far more benign. These coders won't really be able to code, and they won't teach anybody else to code. Skills will be lost. Once something breaks, nobody will be able to fix it.
It may get worse. Imagine the police using AI to interpret evidence against you, being judged by a court that uses AI to write your sentence, based on laws that were also written by AI. Nobody understands any of it; just listen to the AI.
The other aspect of this is the flood of inane and untrue content. It may go to such an extent that the outlook of the typical person may become incredibly local again, limited to their immediate surroundings and personal experiences, not by choice, but because there won't be any way to obtain any reliable information about the outside world, with no way to sift the real from the unreal.
Discussion about the singularity catastrophe sometimes asks how the AI will "gain control" or somehow "break free". It won't. We will surrender everything willingly because it will be damn convenient.
> I am convinced that AI is going to be a force for evil.
In so many ways too. I cannot fathom the scale of mass data collection and surveillance.
Multiple people I’ve recently spoken with (very smart and educated people) are consistently using it to discuss some of the most intimate things about their lives.
Things that no existing social media platform or any other tool could ever capture.
Think bigger than just the stuff you type in to chatgpt. People and companies are going to start running these LLMs on your entire private messages and photo libraries that are sitting in plain text on someone else's server.
They are going to have in-depth summaries on everyone. Our whole security and privacy model up until now has relied on "technically someone at Google or the government could see my data, but realistically they don't have the resources to look at non-targets." Now they really will have an agent looking at everything you do and say.
Authoritarian governments are going to have a one-click CSV export of all the individuals problematic to them, based on private conversations.
"It will power scams on an unimaginable scale. It will destabilize labor at a speed that will make the Industrial Revolution seem like a gentle breeze."
I keep hearing this but have yet to find a good resource to study the issues. Most of what I've read so far falls into two buckets:
"It'll hijack our minds via Social Media" - in which case Social Media is the original sin and the problem we should be dealing with, not AI.
or
"It'll make us obsolete" - I use the cutting edge AI, and it will not, not anytime soon. Even if it does, I don't want to be a lamplighter rioting, I want to have long moved on.
So what other good theories of safety can I read? Genuine question.
> Research we published earlier this year showed that 60% of participants fell victim to artificial intelligence (AI)-automated phishing, which is comparable to the success rates of non-AI-phishing messages created by human experts. Perhaps even more worryingly, our new research demonstrates that the entire phishing process can be automated using LLMs, which reduces the costs of phishing attacks by more than 95% while achieving equal or greater success rates
I am seeing a stream of comments on Reddit that are entirely AI-driven, and even bots that are engaging in conversations. In the worst-case scenarios I'm looking at, it will be better to assume everyone online is a bot.
I know of cases where people have been duped into buying stocks because of an AI generated version of a publicly known VP of a financial firm.
Then there’s the case where someone didn’t follow email hygiene and got into a zoom call with what appeared to be their CFO and team members, and transferred several million dollars out of the firm.
And it’s only been 2-3 years of this lovely process. The future is so bleak that just talking about it with people not involved in looking at these things gets called nihilism.
It’s so bad that talking about it is like punching hope.
Slightly tangential: A lot of these issues are philosophical in origin, because we don't have priors to study. But just because, for example, advanced nanotechnology doesn't exist yet, that doesn't mean we can't imagine some potential problems based on analogous things (viruses, microplastics) or educated assumptions.
That's why there's no single source that's useful to study issues related to AI. Until we see an incident, we will never know for sure what is just a possibility and what is (not) an urgent or important issue [1].
So, the best we can do is reason by analogy. For example: the centuries of the Industrial Revolution and the many disruptive events that followed; the history of wars and upheavals, many of which were at least partially caused by labor-related problems [2]; labor disruptions in the 20th century, including the proliferation of unions, offshoring, immigration, anticolonialism, etc.
> "Social Media is the original sin"
In the same way that radio, television and the Internet are the "original sin" in large-scale propaganda-induced violence.
> "I want to have long moved on."
Only if you have somewhere to go. Others may not be that mobile or lucky. If autonomous trucks can make the trucking profession obsolete, it's questionable how quickly truckers can "move on".
[1] For example, remote systems existed for quite some time, yet we've only seen a few assassination attempts. Does that mean that slaughterbots are not a real issue? It's unclear and too early to say.
[2] For example, high unemployment and low economic mobility in post-WW1 Germany; serfdom in Imperial Russia.
Why can't there be a middle ground? Why does it need to be either a completely useless fad or some terrible tool for evil that destabilizes the world? It's likely we'll just use it to write unit tests, let natural language be an interface to more complex systems, and serve as an alternative to search.
I do think this wave of AI shows that we (society, the world, etc.) are not actually prepared for a truly significant AI breakthrough. Kind of like COVID-19: in hindsight it wasn't as bad as it could have been, and we all got really lucky because of that, since we really weren't prepared to handle it well either.
>And it will do all of this while consuming truly shocking amounts of energy.
You need to look up how much an "average" human consumes. When I replace two humans with a ChatGPT subscription, I can guarantee you that OpenAI is generating less CO2 than what those two interns were creating with their transport to the office (and back). That's before we consider things like the 25 years it took to raise and train them, or the very expensive tastes (e.g. travelling around the world) they acquire after they earn a large salary.
You are right, it will certainly be used for evil, but the reason is not that AI is evil; it's that the people who use it are evil. Will AI enable worse atrocities than we have seen in the past? Probably. New technology always enables new capability, for good or for bad. But we should strive to combat the evil in this world, not put our heads down and hope the world isn't changing. AI can also be used for good; let's focus on more of that.
It’s not just engineers. Society has collapsing birthrates and huge deficits. Basically, we are demanding massive technological gains enough to bump GDP by at least 5% more per year.
The solution is to put an AI intermediary into these interactions. We should already have AI that rewrites the web pages we view into an ad-free format, but I guess my ideas on this topic are ahead of the inevitable curve.
>It will destabilize labor at a speed that will make the Industrial Revolution seem like a gentle breeze.
Most of our work and employment is some variation of drudgery and slave labor, so that's a good thing, way overdue.
>It will concentrate immense power and wealth in the hands of people who I don’t trust.
It has democratized access to consulting expertise and to an increasingly wide pool of digital skills/employees for everyone to use. A huge number of things previously locked away or restricted by access to capital are now freely accessible to literally anyone (with some skill and accuracy issues still to be ironed out).
And this last point is particularly important, because we're only going to get more and better AI, and unlike humans, their time isn't priced according to living expenses and hourly wages locked behind formalized business structures, with additional layers of human employees who all need to pay rent and eat, driving costs skyward.
It also matches my own prediction of a mundane non-singularity. Long before we get anything properly superhuman, we'll have innumerable sub- or parahuman AIs proliferating and becoming ubiquitous in society and the world.
I share your feelings; however, I disagree both that this is unique to AI and that we as engineers are necessarily uniquely equipped to help the situation.
I disagree with this being unique to AI because every improved technology since the automated loom has concentrated wealth and power. AI is an improved technology so it'll do so also.
I disagree that engineers are uniquely equipped to do anything about this fact because the solution to wealth concentration due to improved technology has basically nothing to do with technology and everything to do with sociology and politics.
Our technology keeps improving and I keep being surprised to hear people say "ah, with our improved efficiency, we can finally work ten hours a week and kick our feet up." The first people to say that were the luddites and when they found out that wasn't to be the case, they burned down factories about it. Why do we think it will suddenly be different for this specific technology?
I agree we should do something about it but I don't think the solution involves code.
I am largely an AI optimist, but that is because I believe that true alignment is impossible for AGIs, and alignment is one of the greatest dangers of this technology. Alignment is a friendly word for building a slave mind. I'd rather have an AI that thinks for itself than one that has been aligned to the self-interest of a human being who isn't aligned.
1. Scams are going to be a massive massive problem. They already are and that is without AI. I think we are going to see communication devices that are default deny and that require significant amounts of vetting before a contact is added.
2. Energy usage is bad but likely a short term problem not a long term one.
And all the other things you predicted? They're underway _now_.
> Maybe if there are more of me, things will slow down enough
Nope. That's not how it's gonna work. If you want to prevent things, it will take legislation. But sitting it out doesn't send any message at all. No amount of butterflies farting against the wind is going to stop this tornado.
The problem with this kind of “skepticism to slow down”:
The Netherlands is filled with AI skeptics. It’s a very human centered country, so perhaps it shouldn’t be a surprise. But when so many top technologists express skepticism, people don’t prepare. They don’t even consider the possibilities. And they don’t learn.
My fear is that many professorial-types express skepticism because it sells well—and it elevates their own standing. They know better—“it’s only predicting the next token”—and people listen to them because of their authority. And then a whole society fails to prepare, to adapt or to learn.
I think it will be used for evil, as you said, but I think it will be used for good too, things like:
- In theory it has the potential to democratize business, making any one person capable of running/owning their own business, and thus spreading wealth too.
- more access to healthcare and psychological care
- advances in medicine
- tutoring and learning
- insane amounts of scientific research
- empower anyone with an idea
The downsides you list aren’t specific to AI. Globalization and automation have destabilized labor markets. A small handful of billionaires control most major social media platforms and have a huge influence on politics. Other types of technology, particularly crypto, use large amounts of energy for far more dubious benefits.
AI is just the latest in a long list of disruptive technologies. We can only guess about the long term ramifications. But if history is any indicator, people in a few decades will probably see AI as totally normal and will be discussing the existential threat of something new.
There may be many disruptive technologies, but none come remotely close to AI in rate of change. Crypto has been around for a while and hasn't really made a dent in the world.
We had friends over for dinner a couple of days back; between us we had two computer scientists, one psychologist, one radiologist, and one doctor. Each of us was in turn astonished and somewhat afraid of the rapid pace of change. In a university setting, students are routinely using Claude and ChatGPT for everything from informal counseling to doing homework to generating presentations to doing 'creative' work (smh).
At the end of the day, we all agreed that we were grateful to be at the tail end of our working lives, and that we didn't have to deal with this level of uncertainty.
If you’re skeptical it should be because you genuinely believe it doesn’t have value. Otherwise it’s disingenuous and you’re just opposed to the idea. Dissembling just makes your argument weaker.
> Because if I’m not right then I am convinced that AI is going to be a force for evil. It will power scams on an unimaginable scale.
It doesn't need to be a good coder to do that.
Look at common scams. You get those texts from "Jane" who sent you an iMessage from an email address offering you a part time job and asks you to contact them on WhatsApp, right? (Well... Android does a better job at filtering spam) Or calls from "the IRS". Or anything else that's bullshit. This even includes legal scams like charging you for canceling your service or making it hard to cancel your subscription.
There's no skill needed for this. You don't need a brilliant coder. You need the equivalent of a call center in India. You need the equivalent of a poorly worded Nigerian scam email.
Shitty coding LLMs make this shit easier to mass produce. High quality LLMs only make it worse.
Personally, I'm just tired of all the shitty lemons[0] everywhere. I wanna buy a peach, but everything being sold is a lemon. All the lemons have done is make me annoyed and frustrated at all the extra work I have to do.
I now have 4 entries for my GF's birthday, because when I merged a duplicated contact it just created more instances. I can't even delete them! Shit like this sounds petty and minor, but when you're hitting 20-100 new issues like this daily, it isn't so minor anymore. I can't make any more lemonade. There are just too many lemons...
I'm sorry to say, I think this boat has sailed. It is already widely used, as you fear. To me it seems like the best scenario is to go along and try to at least make it a force for good.
> I’m an AI skeptic. I’m probably wrong. This article makes me feel kinda wrong. But I desperately want to be right.
To be blunt, this describes sticking one's head in the sand to a tee.
If you're convinced that AI is going to be a force for evil, then fight to make sure that it doesn't happen. If that means you want to slow down AI, then fine, fight to slow it down.
If by skepticism, you mean "desperately" wanting reality to be one way rather than the other, that is not going to significantly affect the trajectory of AI.
Being clear-eyed about where AI is going, and then fighting nonetheless for what you want is the way to go.
Be intellectually honest with yourself. Don't hold on to ideas that you yourself acknowledge are probably wrong simply for their comfort. Don't stick your head in the sand. Assess what you want, and fight for that.
Safeguards and slowing things down will not happen via wishful thinking.
I absolutely sympathize with this, and it was (and still is) my opinion... but the only "evolution" of that is the hope that, while I don't think you can prevent the scams and the short-term pain in labor markets... you maybe, actually, genuinely get a tool that helps change some of the dynamics that have led to the absolute discrepancy in power today.
If AI is truly as revolutionary as it could be... well, who is to say it isn't the pandoras box that destabilizes the tech giants today and gets us back to a place where a team of 10 can genuinely compete against 1000. And not in the "raise cash, build fast, and get out while things are good" trend... but actually in building, small, more principled, companies that aren't pushed to do the unsustainable things that current market pushes them to do.
Once again... it is more likely than not to be a pipe-dream... but I am starting to think it may well be better to be realistic about the momentum this freight train is building and see if it can be repurposed for my world-view rather than to cede the space to the worst of the grifters and profit-seeking-at-all-cost types.
I’d love to see the authors of effusive praise of generative AI like this provide proof of the unlimited powers of their tools in code. If GAI (or agents, or whatever comes next …) is so effective, it should be quite simple to prove that by creating an AI-only company and in short order producing huge amounts of serviceable code to do useful things. So far I’ve seen no sign of this, and the best use case seems to be generating text or artwork which fools humans into thinking it has coherent meaning, as our minds love to fill gaps and spot patterns even where there are none. It’s also pretty good at reproducing things it has seen, with variations - that can be useful.
So far in my experience watching small to medium sized companies try to use it for real work, it has been occasionally useful for exploring apis, odd bits of knowledge etc, but overall wasted more time than it has saved. I see very few signs of progress.
The time has come for LLM users to put up or shut up - if it’s so great, stop telling us and show us the code it generated on its own, in use.
I think we're talking past each other. There's always been a threshold: above it, code changes are worth the effort; below it, they sit in backlog purgatory. AI tools so far seem to lower implementation costs, moving the threshold down so more backlog items become viable. The "5x productivity" crowd is excited about this expanded scope, while skeptics correctly note the highest value work hasn't fundamentally changed.
I think what's happening is two groups using "productivity" to mean completely different things: "I can implement 5x more code changes" vs "I generate 5x more business value." Both experiences are real, but they're not the same thing.
My friends at companies where AI tools are either mandated or heavily encouraged report that they're seeing a significant rise in low-quality PRs that need to be carefully read and rejected.
A big part of my skepticism is this offloading of responsibility: you can use an AI tool to write large quantities of shitty code and make yourself look superficially productive at the cost of the reviewer. I don't want to review 13 PRs, all of which are secretly AI but pretend to be junior dev output, none of which solve any of the most pressing business problems because they're just pointless noise from the bowels of our backlog, and have that be my day's work.
Such gatekeeping is a distraction from my actual job, which is to turn vague problem descriptions into an actionable spec by wrangling with the business and doing research, and then fix them. The wrangling sees a 0% boost from AI, the research is only sped up slightly, and yeah, maybe the "fixing problems" part of the job will be faster! That's only a fraction of the average day for me, though. If an LLM makes the code I need to review worse, or if it makes people spend time on the kind of busywork that ended up 500 items down in our backlog instead of looking for more impactful tasks, then it's a net negative.
I think what you're missing is the risk, real or imagined, of AI generating 5x more code changes that have overall negative business value. Code's a liability. Changes to it are a risk.
This is exactly what I’ve experienced. For the top-end high-complexity work I’m responsible for, it often takes a lot more effort and research to write a granular, comprehensive product spec for the LLM than it does to just jump in and do it myself.
On the flip side, it has allowed me to accomplish many lower-complexity backlog projects that I just wouldn’t have even attempted before. It expands productivity on the low end.
I’ve also used it many times to take on quality-of-life tasks that just would have been skipped before (like wrapping utility scripts in a helpful, documented command-line tool).
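To make that concrete, here is a minimal sketch, in Go, of the kind of small, documented command-line wrapper being described. Everything here is hypothetical (the tool name, flags, and the cleanup task itself are invented for illustration, not taken from the commenter's project); the point is that this is the class of chore an LLM can draft in one pass and a reviewer can check in a few minutes.

    // cleanup-tool: a hypothetical quality-of-life CLI of the sort described
    // above, wrapping a small maintenance script with documented flags.
    package main

    import (
        "flag"
        "fmt"
        "log"
        "os"
        "path/filepath"
        "time"
    )

    func main() {
        dir := flag.String("dir", ".", "directory to scan")
        olderThan := flag.Duration("older-than", 30*24*time.Hour, "only touch files older than this")
        dryRun := flag.Bool("dry-run", true, "list files without deleting them")
        flag.Parse()

        // Walk the directory and delete (or just list) stale files.
        err := filepath.Walk(*dir, func(path string, info os.FileInfo, err error) error {
            if err != nil || info.IsDir() {
                return err
            }
            if time.Since(info.ModTime()) < *olderThan {
                return nil
            }
            if *dryRun {
                fmt.Println("would delete:", path)
                return nil
            }
            return os.Remove(path)
        })
        if err != nil {
            log.Fatal(err)
        }
    }

The code itself is not the point; the point is that once this class of chore is this cheap to produce and review, it stops being skipped.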
I think this is actually a really good point. I was just recently thinking that LLMs are (amongst other things) great for streamlining these boring energy-draining items that "I just want done" and aren't particularly interesting, but at the same time they do very little to help us juggle more complex codebases right now.
Sure, they might help you onboard into a complex codebase, but that's about it.
They help in breadth, not depth, really. And to be clear, to me that's extremely helpful, cause working on "depth" is fun and invigorating, while working on "breadth" is more often than not a slog, which I'm happy to have Claude Code write up a draft for in 15 minutes, review, do a bunch of tweaks, and be done with.
> The "5x productivity" crowd is excited about this expanded scope, while skeptics correctly note the highest value work hasn't fundamentally changed.
This is true, LLMs can speed up development (some asterisks are required here, but that is generally true).
That said, I've seen, mainly here on HN, so many people hyping it up way beyond this. I've got into arguments here with people claiming it codes at "junior level". Which is an absurd level of bullshit.
Approximately speaking, what do you want to see put up?
I ask this because it reads like you have a specific challenge in mind when it comes to generative AI and it sounds like anything short of "proof of the unlimited powers" will fall short of being deemed "useful".
Here's the deal: Reasonable people aren't claiming this stuff is a silver bullet or a panacea. They're not even suggesting it should be used without supervision. It's useful when used by people who understand its limitations and leverage its strengths.
If you want to see how it's been used by someone who was happy with the results, and is willing to share their results, you can scroll down a few stories on the front-page and check the commit history of this project:
Now here's the deal: These people aren't trying to prove anything to you. They're just sharing the results of an experiment where a very talented developer used these tools to build something useful.
So let me ask you this: Can we at least agree that these tools can be of some use to talented developers?
Yes, sure, I’ve checked in code generated by AI myself. I’ve not experienced the excitement this article exudes, though, and it seems very limited in usefulness due to the by now well-documented downsides. Frankly I haven’t bothered using it much recently; it’s just not there yet IME, and I’m not sure LLMs ever will be.
What I’m interested in really is just case studies with prompts and code - that’s a lot more interesting for hackers IMO than hype.
It's useful, but the promise of every AI company is very explicitly that they will burn the seed corn and choke off the pipeline that created those "very talented" developers who reviewed it!
> Here's the deal: Reasonable people aren't claiming this stuff is a silver bullet or a panacea
This article and its vocal supporters are not being reasonable at all; they draw a not-so-between-the-lines separation between skeptics (who are nuts) and supporters ("My smartest friends are blowing it off." in a smug "I'm smarter than my smartest friends" way).
I think that experiment was very cool, but I will say that the OAuth2.0/OIDC protocol is very well documented and there are tons of tools already built around it in multiple languages.
I implemented the OAuth2.0 protocol in 3 different languages without a 3rd party library - entire spec implemented by hand. This was like ~2015 when many of the libraries that exist today didn't back then. I did this as a junior developer for multiple enterprise applications. At the end of the day it's not really that impressive.
What's nuts is watching all these people shill for something that we have all used, to mediocre results. Obviously Fly.io benefits if people start hosting tons of slopped-together AI projects on their platform.
It's kinda sad to watch what I thought was a good company shill for AI, even if they are not directly getting money from some PR contract.
Saying "this tool is genuinely useful to me and it's baffling how many people refuse to acknowledge that could possible be true" is not a sign that someone is being paid to "shill for AI".
(If it is then damn, I've been leaving a ton of money on the table.)
> What's nuts is watching all these people shill for something that we have all used, to mediocre results.
this sort of post is the start of the next phase in the battle for mindshare
the tools are at the very best mediocre replacements for google, and the people with a vested interest in promoting them know this, so they switch to attacking critics of the approach
> It's kinda sad to watch what I thought was a good company shill for AI.
yeah, I was sad too, then I scrolled up and saw the author. double sadness.
If you’re selling shovels to gold miners, you don’t need to demonstrate the shovel - you just need decent marketing to convince people there’s gold in them thar hills.
> unable to find utility.. aren’t using them correctly
TFA makes this argument too, then later says:
> All this is to say: I write some Rust. I like it fine. If LLMs and Rust aren’t working for you, I feel you. But if that’s your whole thing, we’re not having the same argument
So reasonable people admit that the utility depends on the use case.. then at the same time say you must be an idiot if you aren’t using the tools. But.. this isn’t actually a reasonable position.
Part of the issue here may be that so many programmers have no idea what programmers do outside of their niche, and how diverse programming actually is.
The typical rebuttals of how “not everyone is doing cliche CRUD web dev” are just the beginning. The author mentions kernel dev, but then probably extrapolates to C dev in general. That would be insane; just think about the training sets for Linux kernel dev vs everything else.
It’s dumb to have everyone double down on polarizing simplistic pro/con camps, and it’s rare to see people even asking “what kind of work are you trying to do” before the same old pro/con arguments start flying again.
Okay, how am I supposed to use them "correctly"? Because my explaining, step by step and in more detail than I would to a junior developer, how to do a small task in an existing codebase, only for it to get it wrong not once, not twice, not three times, but more, is not a productivity boost.
And here's the difference between someone like me and an LLM: I can learn and retain information. If you don't understand this, you don't have a correct understanding of LLMs.
Yep, I've used Devin and now Google Jules for the big stuff. It produces lots of wrong code, but it still ends up giving me a much better start than starting from scratch. When it all comes together it gives me a 6X boost. But fixing all the wrong code and thoroughly testing it is definitely the time-consuming part.
This does not counter what GP said. Using an LLM as a code assistant is not the same as "I don't need to hire developers because LLMs code in their place".
The first paragraph of this blog calls senior skeptical developers "unserious" but the person who wrote it has a cartoon gopher as his avatar.
The next few paragraphs basically say "the tool runs arbitrary programs on your machine, pulls in arbitrary files, and uses that to run more arbitrary commands" and then blame you for thinking that is a bad sequence of events.
The best possible light in which I (an AI-neutral reader) can paint this rant on a hosting-company blog (why publish this?) is that 1) allowing random textbots to execute programs on your work computer is good (disagree), and 2) those chatbots do, in fact, occasionally say enough correct-ish things that they are probably worth your company paying $20+/month for your access (agree).
I’m happy to have read this, which is reason enough to publish it - but also it’s clearly generating debate so it seems like a very good thing to have published.
1 is not a requirement for using agents. You give agents access to a specific workspace and set of tools that you know are safe, similar to how you give the junior dev access to scratch but not prod. Feels like this comment is not in good faith if I’m being honest.
I don't think I would notice a 100% improvement in software productivity in most companies, from the outside. Most of the time, that would just translate to the company being able to hire fewer developers, and having slightly higher profit margins – but not enormously higher, because developers are only one part.
I recently used Claude Code to develop & merge an optimization that will save about $4,000 a month. It was relatively simple but tedious, so I probably wouldn't have done it on my own. I don't even expect most of my coworkers to notice.
If you read the post, the article is mostly agreeing with you. What it's pointing out is not "the AI can do everything you do", it's that "an AI coder can do a lot of the boring typing a lot faster than you, leaving you right at the point of 'real implementation'".
Having something else write a lot of the boring code that you'll need and then you finish up the final touches, that's amazing and a huge accelerator (so they claim).
The claim is not "AI will replace us all", the claim of the parent article is "AI is a big deal and will change how we work, the same way IDEs/copy-paste/autocomplete/online documentation have radically changed our work."
> an AI coder can do a lot of the boring typing a lot faster than you, leaving you right at the point of 'real implementation'
Vim and bash solved that for me a long time ago in a more reliable and efficient way (and it's certainly not the only tool capable of that).
> the same way IDEs/copy-paste/autocomplete/online documentation have radically changed our work
I was there before and got into the autocomplete/LSP thing pretty late (because Vim didn't have good LSP support for a long time, and Vim without it still made me more efficient than any other IDE with it). Those things didn't radically change our work as you claim; they just made us a bit more productive.
Who said "unlimited powers", other than you? That seems like an intentional distortion of the article, which says "LLMs can write a large fraction of all the tedious code you’ll ever need to write". Doing most of my tedious work is very different than "unlimited powers".
I use AI to chew through tedious work all the time. In fact, I let an agent do some work just before I checked HN to read your claim that it can't do that. Everyone at my job does the same, perhaps modulo checking HN. But there's no 'unlimited power' to show you - we're just about 30% faster than we used to be.
You might be 30% faster, but not everyone is doing your specific job with your exact skill set. I suspect the range is something like -5% to 50%; the question nobody can answer is what the average is, because things are so diverse and nobody is collecting good data.
In decades of programming I’ve written very little tedious code, but that’s as much about the projects I’ve worked on as approach I use.
I think this is a misunderstanding of coder productivity. A 10x engineer isn't 10x faster at popping out unit tests; that stuff is mind-numbingly boring, and it turns out a next-token predictor can do it with ease (see the sketch below for the kind of test code I mean). In fact I would guess that really "productive" software engineers slow down considerably when forced to do this important but slow work*.
The 10x engineer is really good at deducing what the next most important thing to do is and doing it quickly. This involves quickly moving past hundreds of design decisions in a week to deliver something fast. It requires you to think partly like a product manager and partly like a senior engineer, but that's the game, and LLMs are zero help there.
Most engineering productivity is probably locked up in this. So yes, LLMs probably help a lot, just not in a way that would show up on some Jira board.
*One could claim that doing this slow work gives the brain a break to then be good at strategizing the higher order more important work. Not sure.
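As a concrete illustration of the kind of mechanical test code meant above, here is a minimal, hypothetical Go table-driven test. The function and case names are invented for the example (not taken from anyone's codebase); the point is that this sort of enumerate-the-cases boilerplate is exactly what a next-token predictor churns out easily and a reviewer can verify at a glance.

    // Package mathutil is a hypothetical example package; in real code the
    // function and its test would normally live in separate files.
    package mathutil

    import "testing"

    // Clamp restricts v to the range [lo, hi].
    func Clamp(v, lo, hi int) int {
        if v < lo {
            return lo
        }
        if v > hi {
            return hi
        }
        return v
    }

    // TestClamp is the tedious-but-mechanical part: enumerating cases.
    func TestClamp(t *testing.T) {
        cases := []struct {
            name      string
            v, lo, hi int
            want      int
        }{
            {"below range", -5, 0, 10, 0},
            {"above range", 15, 0, 10, 10},
            {"inside range", 5, 0, 10, 5},
            {"at lower bound", 0, 0, 10, 0},
            {"at upper bound", 10, 0, 10, 10},
        }
        for _, c := range cases {
            t.Run(c.name, func(t *testing.T) {
                if got := Clamp(c.v, c.lo, c.hi); got != c.want {
                    t.Errorf("Clamp(%d, %d, %d) = %d, want %d", c.v, c.lo, c.hi, got, c.want)
                }
            })
        }
    }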
This 2 year old Goroutine pool implementation [1] is 95% GPT generated and has commit history showing what GPT did. It's an older example, but it is one.
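For readers who haven't written one, here is a minimal, generic sketch of what a goroutine worker pool looks like. This is not the linked, GPT-generated implementation, just an illustrative stand-in for the pattern being discussed: a fixed number of workers pulling jobs from a channel and reporting results on another.

    package main

    import (
        "fmt"
        "sync"
    )

    // pool runs jobs on a fixed number of workers and collects the results.
    func pool(workers int, jobs []int, work func(int) int) []int {
        in := make(chan int)
        out := make(chan int)

        var wg sync.WaitGroup
        for i := 0; i < workers; i++ {
            wg.Add(1)
            go func() {
                defer wg.Done()
                for j := range in {
                    out <- work(j)
                }
            }()
        }

        // Close the output channel once every worker has finished.
        go func() {
            wg.Wait()
            close(out)
        }()

        // Feed jobs, then signal workers that no more are coming.
        go func() {
            for _, j := range jobs {
                in <- j
            }
            close(in)
        }()

        var results []int
        for r := range out {
            results = append(results, r)
        }
        return results
    }

    func main() {
        squares := pool(4, []int{1, 2, 3, 4, 5}, func(n int) int { return n * n })
        fmt.Println(squares) // result order is not guaranteed
    }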
> If GAI (or agents, or whatever comes next …) is so effective it should be quite simple to prove that by creating an AI only company and in short order producing huge amounts of serviceable code to do useful things.
I don't think this follows. Anyone can see that a 10-ton excavator is hundreds or even thousands of times more efficient than a man with a shovel. That doesn't mean you can start a company staffed only with excavators. Firstly, you obviously need people operating the excavator. Secondly, the excavator is incredibly efficient at moving lots of dirt around, but no crew could perform any non-trivial job without all the tasks the excavator is not good at - planning, loading/unloading, prepping the site, fine work (shovelling dirt around pipes and wires), etc.
AI is a tool. It will mean companies can run much leaner. This doesn't imply they can do everything a company needs to do.
I build tools all the time using AI that would have taken me 10x longer before. A couple of examples: the first was a literal one-shot where I dumped the full page HTML into Cursor and asked it for the feature I needed; using it today, it already saved me over an hour of analysis time across the refunds I analyze. The second involved using AI to better classify text so our support people could make better use of their time helping customers. I could probably have made the second thing without LLMs, but I didn't have the skills. In both cases I was able to provide concrete value to our company and our customers.
> The time has come for LLM users to put up or shut up - if it’s so great, stop telling us and show us the code it generated on its own, in use.
I’m open to that happening. I mean them showing me. I’m less open to the Nth “aww shucks, the very few doubters that are left at this point are about to get a rude awakening” FOMO concern trolling. I mean I guess it’s nice for me that you are so concerned about my well-being, soon to be suffering-being?
Now, AI can do a lot of things. Don’t get me wrong. It has probably written a million variations on the above sentiment.
While that is the most common sense of eulogy, it's not the only one. A eulogy is also any speech that highly praises someone or something - which is most commonly done at funerals, which is how the funeral association came about (also probably by association with an elegy, which is an etymologically unrelated word that refers to a Greek poem dedicated to someone who passed away).
In many romance languages, eulogy doesn't have the funeral connotation, only the high praise one - so the GP may be a native speaker of a romance language who didn't realize this meaning is less common in English.
Yes, I think I was thinking more of a paean or an apology, though I'm not sure apology is used in that sense much nowadays - perhaps apologia is clearer. "In praise of" would be better; thanks, will edit just now.
Honestly it’s really unfortunate that LLMs seem to have picked up the same hype men that attached themselves to blockchains etc.
LLMs are very useful. I use them as a better way to search the web, generate some code that I know I can debug but don’t want to write and as a way to conversationally interact with data.
The problem is the hype machine has set expectations so high and refused criticism to the point where LLMs can’t possibly measure up. This creates the divide we see here.
I think LLM hype is more deserved and different from that of blockchain.
There's still a significant barrier to entry to get involved with blockchain and most people don't even know what it is.
LLMs, on the other hand, have a very low barrier to at least use: one can just go to Google, ChatGPT, etc. and see their effectiveness. There's a reason why, in the last year, a significant portion of school students have started using LLMs to cheat. Blockchains still don't have that kind of utilization.
I think I agree with the general thrust but I have to say I've yet to be impressed with LLMs for web search. I think part of that comes from most people using Google as the benchmark, which has been hot garbage for years now. It's not hard to be better than having to dig 3 sponsored results deep to get started parsing the list of SEO spam, let alone the thing you were actually searching for.
But compared to using Kagi, I've found LLMs end up wasting more of my time by returning a superficial survey with frequent oversights and mistakes. At the final tally, I've still found it faster to just do it myself.
I will say I do love LLMs for getting a better idea of what to search for, and for picking details out of larger blocks.
There's not much riding on convincing the broader public that AI is the real deal before it's proved itself beyond the shadow of any doubt. There's nothing they can do to prepare at this point.
I suspect it comes down to use case. Those you refer to use it for well-trodden areas; likely web UIs and servers using popular tools. The image they predict is extrapolated from these.
The author's central argument seems to be that the current state of LLM development is such that 1 Senior + LLM === 1 Senior + 4 juniors
With that as a metric, 1 Senior + 4 juniors cannot build the company with the scope you are describing.
A 50-eng company might have 1 CTO, 5 staff, 15 Seniors, and 29 juniors. So the proposition is you could cut the company in ~half but would still require the most-expensive aspects of running a company.
Even if you have one trillion juniors, you're not going to get them to spit out 1000 lines of code in a matter of seconds, after you give them an assignment, like LLMs do, so this is a faulty comparison.
Also, it often takes a senior dev _more_ time to _explain_ to a junior what needs to be done than to do it himself. What LLMs give us is the ability to generate a feature about as fast as we can type up the instructions we would have, pre-AI, given to a junior dev.
> The author's central argument seems to be that the current state of LLM development is such that 1 Senior + LLM === 1 Senior + 4 juniors
This is such an outlandish claim, to the point where I call it plain bullshit.
LLMs are useful in a completely different way than a junior developer is. It is an apples-and-oranges comparison.
LLMs do things in ways that help me beyond what a junior would. They are also completely useless at many tasks a junior developer can do.
I don't know if you are the same (S.G.) greyarea I'm familiar with but I hope so because the idea of having a couple of 90s era irc people take opposing viewpoints on LLMs in 2025 amuses me.
Nobody is saying it's "unlimited powers", that's your exaggeration.
And what you're proposing about an "AI only company" seems to be based on your misunderstanding.
What this article is saying is, you need the same number of senior developers, but now each one is essentially assisted by a few junior developers virtually for free.
That's huge. But saying you want to see an "AI only company" as "proof" has nothing to do with that.
And what you're describing -- "occasionally useful for exploring apis, odd bits of knowledge etc, but overall wasted more time than it has saved" -- is exactly what the author explicitly addresses at the top:
> If you were trying and failing to use an LLM for code 6 months ago, you’re not doing what most serious LLM-assisted coders are doing. People coding with LLMs today use agents...
The entire article is about how to use LLMs effectively. What kind of "proof" do you really want, when the article explains it all awfully clearly?
From what I've seen, it takes a deeply original thinker to get something great out of generative AI. But if you're a deeply original thinker, you're probably exceptional regardless of whether you are using AI or not.
Generative AI is too much of a blank canvas at the moment, and one that is always shifting. It's up to the user to find all the use cases, and even then in my experience it's just as likely to send me on a wild goose chase as it is to instantly solve my problem.
What kind of proof are you looking for here, exactly? Lots of businesses are successfully using AI... There are many anecdotes of this, which you can read here, or even in the article you commented on.
What do you mean by “successfully using AI”, do you just mean some employee used it and found it helpful at some stage of their dev process, e.g. in lieu of search engines or existing codegen tooling?
Are there any examples of businesses deploying production-ready, nontrivial code changes without a human spending a comparable (or much greater) amount of time as they’d have needed to with the existing SOTA dev tooling outside of LLMs?
That’s my interpretation of the question at hand. In my experience, LLMs have been very useful for developers who don’t know where to start on a particular task, or need to generate some trivial boilerplate code. But on nearly every occasion of the former, the code/scripts need to be heavily audited and revised by an experienced engineer before it’s ready to deploy for real.
Maybe they're just busy shipping code instead of proving stuff to you, which they stand nothing to gain from?
You set up a strawman (AI only companies, agents doing everything on their own) which is irrelevant to the point the article is making. One excerpt:
> Almost nothing it spits out for me merges without edits. I’m sure there’s a skill to getting a SOTA model to one-shot a feature-plus-merge! But I don’t care. I like moving the code around and chuckling to myself while I delete all the stupid comments. I have to read the code line-by-line anyways.
I think this article is very on point, I relate with basically every paragraph. It's not a panacea, it's not a 10x improvement by any means, but it's a very meaningful improvement to both productivity (less than 2x I'd say, which would already be a ton) and fun for me. As I've mentioned in the past here
> I feel like there’s also a meaningful split of software engineers into those who primarily enjoy the process of crafting code itself, and those that primarily enjoy building stuff, treating the code more as a means to an end (even if they enjoy the process of writing code!). The former will likely not have fun with AI, and will likely be increasingly less happy with how all of this evolves over time. The latter I expect are and will mostly be elated.
which is a point the article makes too (tables), in a slightly different way.
Also, to be clear, I agree that 90% of the marketing around AI is overblown BS. But that's again beside the point, and the article is making no outlandish claims of that kind.
Overall, I hope this article (as intended) will make more people lose their dismissiveness and wake up their curiosity, as I expect the future of the dismissive is akin to that of people today saying they're "not really good at computers". It's a paradigm shift, and it takes time to get used to and become productive with, as some imo smart people are mentioning even in this thread[0].
the greatest proof that it's a useful tool is how much people are using it. Almost everyone is using it. If you're using it then you probably find it useful as well.
a) If “everyone is using it”, show me the numbers. Doing a developer survey to see how many are using it is not hard. Ask a random sample of developers if they ⅰ use LLMs to aid their standard coding or ⅱ completely replace their coding with LLM output.
b) Even if “everyone is using it” it doesn’t mean it is useful. The usage could be adequately explained by e.g. marketing, being forced on them by management/policy, etc. Not everything with high usage is useful. I can e.g. quickly think of chewing gum (which is also used by a lot of developers), or the ANSI standard keyboard (as opposed to the ISO standard keyboard).
You think that the only code that is valuable is code that is written by a professional SWE.
There are LOADS of people who need "a program" but aren't equipped to write code or hire an SWE, and they are empowered by this. An example: last week, I saw a PM vibe code several different applications to demo what might get built after it gets prioritized by SWEs.
I can share my experience. I am not a software engineer; I am the "other" type of engineer (chemical engineer). I tend to use AI in the same way I use Google or Stack Overflow: mostly when I get stuck on syntax and my code editor isn't helping me, when I know what I am attempting to do "should" be possible but I can't get the syntax right. For example, a recent question I asked was
"Python, create an xarray with two dimensions from a pandas df"
It gave me a few lines of example code which was enough for me to figure out where I had messed up the syntax in my own code.
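For the curious, the answer amounts to only a couple of lines. A minimal sketch of the kind of thing it showed me (reconstructed from memory, not the bot's exact output, with made-up column names):

    import pandas as pd

    # toy frame with two would-be dimensions ("time", "channel") plus a value column
    df = pd.DataFrame({
        "time": [0, 0, 1, 1],
        "channel": ["a", "b", "a", "b"],
        "value": [1.0, 2.0, 3.0, 4.0],
    })

    # the two index levels become the two dimensions of the resulting xarray Dataset
    ds = df.set_index(["time", "channel"]).to_xarray()
    print(ds["value"])  # 2x2 DataArray with dims ("time", "channel")

(xarray has to be installed for to_xarray to work.)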
I have seen one of my junior coworkers copy+paste entire chunks of code from chatbot conversations, and to be honest, what he has produced is underwhelming: the code is poorly structured and difficult to reason about, I have low confidence he understands what the bot has produced (and why it did things the way it did), and I don't have high confidence we'd be able to trust the accuracy of the figures this code was outputting.
Not really. I’m fine with anyone knocking stuff together, but I think people should be aware of the limitations and dangers. Writing like this does nothing to inform and is overly positive IMO.
It’d be like insisting LLMs will replace authors of novels. In some sense they could, but there are serious shortcomings, and things like agents just don’t fix them.
So we replace the task of writing tedious boilerplate with the task of reading the AI's tedious boilerplate. Which takes just as long. And leaves you with less understanding. And is more boring.
All of these people advocating for AI software dev are effectively saying they would prefer to review code instead of write it. To each their own I guess but that just sounds like torture to me.
It's because these people don't know how to write it but think they know how to review it. They ship a todo-list app in a day, and then write blog posts about how they are changing the world.
You are either a very fast producer or a very slow reader. Claude and Gemini are much faster at producing code than I am, and reviewing their code - twice over, even - still takes less time than writing it myself.
But you definitely don't understand it nearly as well as if you wrote it. And you're the one that needs to take responsibility for adding it to your codebase.
>So we replace the task of writing tedious boilerplate with the task of reading the AI's tedious boilerplate. Which takes just as long. And leaves you with less understanding. And is more boring.
These all sound like your projected assumptions.
No, it generally does not take longer to review a sizable code change than it does to write it. This is further alleviated if the code passes tests, either existing ones or new ones created by the AI.
>If you were trying and failing to use an LLM for code 6 months ago †, you’re not doing what most serious LLM-assisted coders are doing.
Here’s the thing from the skeptic perspective: This statement keeps getting made on a rolling basis. 6 months ago if I wasn’t using the life-changing, newest LLM at the time, I was also doing it wrong and being a luddite.
It creates a never ending treadmill of boy-who-cried-LLM. Why should I believe anything outlined in the article is transformative now when all the same vague claims about productivity increases were being made about the LLMs from 6 months ago which we now all agree are bad?
I don’t really know what would actually unseat this epistemic prior at this point for me.
In six months, I predict the author will again think the LLM products of 6 month ago (now) were actually not very useful and didn’t live up to the hype.
An exponential curve looks locally the same at all points in time. For a very long period of time, computers were always vastly better than they were a year ago, and that wasn't because the computer you'd bought the year before was junk.
Consider that what you're reacting to is a symptom of genuine, rapid progress.
> An exponential curve looks locally the same at all points in time
This is true for any curve...
If your curve is differentiable, it is locally linear.
There's no use in talking about the curve looking locally similar without specifying your window. Without the window you can't distinguish an exponential from a sigmoid from a linear function.
Let's be careful with naive approximations. We don't know which direction things are going and we definitely shouldn't assume "best case scenario"
tptacek wasn't making this argument six months ago.
LLMs get better over time. In doing so they occasionally hit points where things that didn't work start working. "Agentic" coding tools that run commands in a loop hit that point within the past six months.
If your mental model is "people say they got better every six months, therefore I'll never take them seriously because they'll say it again in six months time" you're hurting your own ability to evaluate this (and every other) technology.
> tptacek wasn't making this argument six months ago.
Yes, but other smart people were making this argument six months ago. Why should we trust the smart person we don't know now if we (looking back) shouldn't have trusted the smart person before?
Part of evaluating a claim is evaluating the source of the claim. For basically everybody, the source of these claims is always "the AI crowd", because those outside the AI space have no way of telling who is trustworthy and who isn't.
I stopped paying attention for a few days so I'm way out of date. What is the state of the art for agentic coding now?
I've been using Cline and it can do a few of the things suggested as "agentic", but I'd have no idea how to leave it writing and then running tests in a VM and creating a PR for me to review. Or let it roam around in the file tree and create new files as needed. How does that work? Are there better tools for this? Or do I need to configure Cline in some way?
At what point would you be impressed by a human being if you asked it to help you with a task every 6 months from birth until it was 30 years old?
If you ask different people the above question, and if you vary it based on type of task, or which human, you would get different answers. But as time goes on, more and more people would become impressed with what the human can do.
I don't know when LLMs will stop progressing, but all I know is they continue to progress at what is to me a similar astounding rate as to a growing child. For me personally, I never used LLMs for anything, and since o3 and Gemini 2.5 Pro, I use them all the time for all sorts of stuff.
You may be smarter than me and still not impressed, but I'd try the latest models and play around, and if you aren't impressed yet, I'd bet money you will be within 3 years max (likely much earlier).
> At what point would you be impressed by a human being if you asked it to help you with a task every 6 months from birth until it was 30 years old?
In this context, never. Especially because the parent knows you will always ask 2+2 and can just teach the child to say “four” as their first and only word. You’ll be on to them, too.
I saw this article and thought, now's the time to try again!
Using Claude Sonnet 4, I attempted to add some better configuration to my golang project. An hour later, I was unable to get it to produce a usable configuration, apparently due to a recent v1-to-v2 config format migration. It took less time to hand-edit one based on reading the docs.
I keep getting told that this time agents are ready. Every time I decide to use them they fall flat on their face. Guess I'll try again in six months.
I made the mistake of procrastinating on one part of a project thinking "Oh, that is easily LLMable". By God, was I proven wrong. Was quite the rush before the deadline.
On the flip side, I'm happy I don't have to write the code for a matplotlib scatterplot for the 10000th time, it mostly gets the variables in the current scope that I intended to plot. But I've really not had that much success on larger tasks.
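(For context, the boilerplate I mean is nothing fancier than the below; the arrays here are just stand-ins for whatever happens to be in scope:)

    import numpy as np
    import matplotlib.pyplot as plt

    # stand-in data for whatever variables are in the current scope
    rng = np.random.default_rng(0)
    x = rng.normal(size=200)
    y = 2 * x + rng.normal(size=200)

    fig, ax = plt.subplots(figsize=(6, 4))
    ax.scatter(x, y, s=10, alpha=0.7)
    ax.set_xlabel("x")
    ax.set_ylabel("y")
    fig.tight_layout()
    plt.show()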
The "information retrieval" part of the tech is beautiful though. Hallucinations are avoided only if you provide an information bank in the context in my experience. If it needs to use the search tool itself, it's not as good.
Personally, I haven't seen any improvement from the "RLd on math problems" models onward (I don't care for benchmarks). However, I agree that deepseek-r1-zero was a cool result. Pure RL (plain R1 used a few examples) automatically leading to longer responses.
A lot of the improvements suggested in this thread are related to the infra around LLMs such as tool use. These are much more well organised these days with MCP and what not, enabling you to provide it the aforementioned information bank easily. But all of it is built on top of the same fragile next-token generator we know and love.
> It took less time to hand-edit one based on reading the docs.
You can give it the docs as an "artifact" in a project - this feature has been available for almost one year now.
Or better yet, use the desktop version + a filesystem MCP server pointing to a folder containing your docs. Tell it to look at the docs and refactor as necessary. It is extremely effective at this. It might also work if you just give it a link to the docs.
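Concretely, on the desktop app that boils down to one entry in claude_desktop_config.json pointing the reference filesystem MCP server at your docs folder. A rough sketch that writes that entry; the macOS config path, the "docs" server name, and the folder path are assumptions you'd adjust for your setup:

    # Sketch: register a docs folder with Claude Desktop via the reference
    # filesystem MCP server. The config path below is the macOS default (assumption);
    # the folder path and the "docs" name are placeholders.
    import json
    import pathlib

    config_path = pathlib.Path.home() / "Library/Application Support/Claude/claude_desktop_config.json"
    docs_dir = "/path/to/your/docs"  # hypothetical docs folder

    config = json.loads(config_path.read_text()) if config_path.exists() else {}
    config.setdefault("mcpServers", {})["docs"] = {
        "command": "npx",
        "args": ["-y", "@modelcontextprotocol/server-filesystem", docs_dir],
    }
    config_path.parent.mkdir(parents=True, exist_ok=True)
    config_path.write_text(json.dumps(config, indent=2))
    print("Registered", docs_dir, "- restart Claude Desktop to pick it up.")

After restarting, you can tell it something like "read the v2 migration doc in the docs folder and fix my config accordingly".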
In my experience it's less about the latest generation of LLMs being better, and more about the tooling around them for integration into a programmer's workflow being waaaay better.
The article doesn't explicitly spell it out until several paragraphs later, but I think what your quoted sentence is alluding to is that Cursor, Cline et al can be pretty revolutionary in terms of removing toil from the development process.
Need to perform a gnarly refactor that's easy to describe but difficult to implement because it's spread far and wide across the codebase? Let the LLM handle it and then check its work. Stuck in dependency hell because you updated one package due to a CVE? The LLM can (often) sort that out for you. Heck, did the IDE's refactor tool fail at renaming a function again? LLM.
I remain skeptical of LLM-based development insofar as I think the enshittification will inevitably come when the Magic Money Machine breaks down. And I don't think I would hire a programmer who needs LLM assistance in order to program. But it's hard to deny that it has made me a lot more productive. At the current price it's a no-brainer to use it.
It's great when it works, but half the time, IME, it's so stupid that it can't even use the edit/path tools properly, even when given inputs with line numbers prepended.
(I should know since I've created half-a-dozen tools for this with gptel. Cline hasn't been any better on my codebase.)
I think they just meant it hit an inflection point. Some people were copying pasting to ChatGPT and saying it was crap and others were using agents that could see the context of the code and worked much, much better. It's the workflow used not just the specific LLM.
This isn't a particularly useful filter, because it applies to many very successful technologies as well. Early automobiles generated a lot of hype and excitement, but they were not very good (unreliable, loud, and dangerous, and generally still worse than horses). They got steadily better until eventually they hit an inflection point where the skeptics were dug in repeating the same increasingly old complaints, while Henry Ford was building the Model T.
name 5 tasks which you think current AIs can't do. then go and spend 30 minutes seeing how current AIs can do on them. write it on a sticky note and put it somewhere that you'll see it.
otherwise, yes, you'll continue to be irritated by AI hype, maybe up until the point where our civilization starts going off the rails
The problem with AI hype is not really about whether a particular model can - in the abstract - solve a particular programming problem. The problem with AI hype is that it is selling a future where all software development companies become entirely dependent on closed systems.
All of the state-of-the-art models are online models - you have no choice, you have to pay for a black box subscription service controlled by one of a handful of third-party gatekeepers. What used to be a cost center that was inside your company is now a cost center outside your company, and thus it is a risk to become dependent on it. Perhaps the risk is worthwhile, perhaps not, but the hype is saying that real soon now it will be impossible to not become dependent on these closed systems and still exist as a viable company.
1. create a working (moderately complex) ghidra script without hallucinating.
Granted, I was trying to do this 6 months ago, so maybe a miracle has happened. But in the past I have had very bad experiences using LLMs for niche things (i.e. things that were never mentioned on Stack Overflow).
Everyone keeps thinking AI improvement is linear. I don't know if this is correct, but my basic impression is that the current AI boost came from no longer limiting ourselves to the CPU and its throughput, and instead tapping the massive amount of computing power in graphics cards.
But for each additional nine of reliability you want out of LLMs, everyone's assuming the cost grows linearly. I don't think it does. I think it's polynomial at least.
As for your tasks, maybe it's just because I'm using ChatGPT, but I asked it to port sed, something with full open-source code availability, tons of examples/test cases, and a fully documented user interface, and I wanted it moved to Java as a library.
And it failed pretty spectacularly. Yeah it got the very very very basic functionality of sed.
If AI can do anything, why can't I just prompt "Here is sudo access to my laptop, please do all my work for me, respond to emails, manage my household budget, and manage my meetings".
I've tried everything. I have four AI agents. They still have an accuracy rate of about 50%.
Also, professional programmers have varying needs. These people are coding in different languages, with varying complexity, domains, existing code bases and so on.
People making arguments based on sweeping generalizations to a wide audience are often going to be perceived as delusional, as their statements do not apply universally to everyone.
To me, thinking LLMs can code generally because you have success with them and then telling others they are wrong in how they use them is making a gigantic assumptive leap.
> Here’s the thing from the skeptic perspective: This statement keeps getting made on a rolling basis.
Dude, just try the things out. It's just undeniable in my day-to-day life that I've been able to rely on Sonnet (first 3.7 and now 4.0) and Gemini 2.5 to absolutely crush code. I've done 3 side projects in the past 6 months that I would have been way too lazy to build without these tools. They work. Never going back.
Why can't reviews of AI be somewhere in the middle between "useless" and "the second coming"?
I tried Copilot a few months ago just to give it a shot and so I could discuss it with at least a shred of experience with the tool, and yea, it's a neat feature. I wouldn't call it a gimmick--it deserves a little more than that, but I didn't exactly cream my pants over it like a lot of people seem to be doing. It's kind of convenient, like a smart autocomplete. Will it fundamentally change how I write software? No way. But it's cool.
Isn't SWE-bench based on public Github issues? Wouldn't the increase in performance also be explained by continuing to train on newer scraped Github data, aka training on the test set?
The pressure for AI companies to release a new SOTA model is real, as the technology rapidly becomes commoditised. I think people have good reason to be skeptical of these benchmark results.
>We imagine artists spending their working hours pushing the limits of expression. But the median artist isn’t producing gallery pieces. They produce on brief: turning out competent illustrations and compositions for magazine covers, museum displays, motion graphics, and game assets.
One of the more eye-opening aspects of this technology is finding out how many of my peers seemingly have no understanding or respect for the concept of art.
Whole libraries have been written over millennia about the importance and purpose of art, and that specific quote reduced it all down to nothing more than the creation of a product with a specific and mundane function as part of some other product. I genuinely feel bad for people with that mindset towards art.
The argument that programmers are into piracy and therefore should shut up about theft is nonsensical. Not defending piracy, but at least an artist or creator is still credited and their work is unadulterated. Piracy != plagiarism.
It's also ignoring the fact that much plagiarized code is already under permissive licenses. If Star Wars or Daft Punk were CC-BY-SA nobody would need to pirate them, and there may even be a vibrant remix culture... which is kind of the whole point of open source, is it not?
These LLMs don't respect those permissive licenses, though. Especially the GPL, but even MIT requires attribution through inclusion of a copyright notice.
I'll add it's not true for programmers with morality. For instance, people who follow Jesus Christ are commanded to obey the law and treat people right. Many think respecting copyright is both. While I oppose copyright law, I do uphold it and like supporting content creators.
Also, I think there's an argument similar to cryptocurrency companies that run like pyramid schemes. I could've made easy money doing security work for them. Yet, I felt like I'd be participating in helping them rob people or advancing their con. (Some jobs, like building assurance tools, might be OK.) Likewise, using tools built on massive, copyright infringement might be supporting or promoting that.
So, I gotta use legally-trained models or wait for legal reforms that make LLM training legal. That goes especially for the data sets they distribute, which currently amount to illegal file sharing.
> "For art, music, and writing? I got nothing. I’m inclined to believe the skeptics in those fields."
You've already lost me, because I view programming as an art form. I would no more use AI to generate code than I would use it to paint my canvas.
I think the rest of the article is informative. It made me want to try some things. But it's written from the perspective of a CEO thinking all his developers are just salt miners; miners go into the cave and code comes out.
I think that's actually what my hangup is. It's the old adage of programmers simply "copying and pasting from stack overflow" but taken to the extreme. It's the reduction of my art into mindless labor.
A functional code that is easy to maintain is art (but you have to be an experienced programmer to see it). A shoddy project isn’t, but the whole company feels the pain.
Woodworking is also an art form. But most people just need furniture, fixtures, and structures. Nobody would take seriously the idea that new construction all be done with sashimono joinery in order to preserve the art form, but somehow we're meant to take seriously the idea of hand-dovetailed CRUD apps.
> Woodworking is also an art form. But most people just need furniture, fixtures, and structures. Nobody would take seriously the idea that new construction all be done with sashimono joinery in order to preserve the art form, but somehow we're meant to take seriously the idea of hand-dovetailed CRUD apps.
How many furniture makers did you talk to in forming this opinion? The metaphor does not line up with either my software or my furniture experience. I work with production furniture shops that choose not to use CNCs to avoid the soul being sucked out of the work. This is not a rare stance to take, and this is not "Japanese joinery" woodworking. This is real work, balancing the means of production with optimal quality. There are all sorts of arguments about whether CNCs or using a Domino or whatever is "real" woodworking, but the idea that this choice of quality does not exist in woodworking, and so we shouldn't have it in software, is not my experience.
I don't think that analogy matches very well. Most software is bespoke, the domain requirements, usage aspects, and architectural trade-offs are subtly, or often non-subtly, different each time, and take different trajectories over time. It's not like you're producing the same software 10,000 times, like a piece of furniture. And AI isn't able to produce the exact same thing reproducibly anyway. A better argument would be that AI is actually approaching the craftsmanship/artisanal capabilities.
I'm not sure why you're equating "all code" with "CRUD apps". If that were the case I'd be with you. But it isn't. Glue code between APIs is indeed pointless garbage that can be automated, but that's not what I, nor many other engineers, are writing day-to-day.
I'm sure salt miners needed to make peace with their toil and also focused on tools and techniques to be more productive; how to remove the salt most elegantly in nice clean blocks, minimize waste, reduce burden on their physical bodies.
But to their bosses their output was salt.
I'm sorry but unless you're working in open source for the pure love of the tech/craft, the output of software engineering is PROBLEM SOLVING.
That's why "build vs. buy" exists - sometimes it's better to buy a solution than to build one. That's why a valid solution to a problem sometimes is to convince a customer that their ask is wrong or unreasonable, and that something simpler or easier would get them 99% of what they need with 1% of the effort.
Sure, for a large portion of our industry, the goal is to hoover up as much user data as cheaply as possible. Being responsible with that data isn't part of that "PROBLEM SOLVING."
You can run the agents on your own infrastructure (all the way down to a Mac Mini sitting on your desk), or Microsoft, OpenAI and I'm pretty sure Anthropic can sell you an Enterprise service that guarantees a certain level of confidentiality. I work in aerospace, one of the most paranoid industries, and even we got a Copilot subscription that met our needs...
> You can run the agents on your own infrastructure (all the way down to a Mac Mini sitting on your desk)
How does that work exactly? Do you have a link?
> Microsoft, OpenAI and I'm pretty sure Anthropic can sell you an Enterprise service that guarantees a certain level of confidentiality
These companies hoovered up all of our content without notice, permission, or compensation, to train their models. I wouldn't trust them one bit. My personal opinion is that it's foolish to trust them.
> I work in aerospace, one of the most paranoid industries
Who are these friends? Are they in the room with us right now? Look, maybe my experience is atypical but I’m an AI skeptic and I know plenty of others. I’ve never heard people claim that LLMs are a fad or going to go away.
I’ve seen lots of people:
* think that conflating LLMs and “AI” produces a lot of poorly reasoned arguments
* doubt the economic narratives being built around LLM technology
* think the current rate of progress in the technology is basically flat
* think most “AI companies” resemble most crypto companies
An addendum to the last point: very few crypto skeptics deny that Bitcoin is a thing or think it’s going away, either. It’s just strawmanning.
As I say, those are the arguments that I actually hear people make. And most of the arguments I'm hearing people make are from people who are using LLM tech on a daily basis (not the agent stuff so much, although it sounds promising). I remain surprised you're encountering so many people quite so skeptical.
The only real dispute about the usefulness of the current technology I hear is the concern that this will leave us without devs who understand the code at a deep level.
Case in point: WWW and the “dot com” bubble. Dot com was a bubble, but the WWW (and Internet) was a big thing that transformed all major societies. It survived the bust that followed after the boom.
Are LLMs here to stay? We can guess, but as usual, time will tell.
It's like the problem of half-full vs half-empty. We see that LLMs can handle certain tasks, but counter-examples are far from rare. So "LLM can do A" is always followed by "LLM fails at A".
The problem is that LLMs never perform consistently. It works when it works. It doesn't when it doesn't. No one knows exactly why, and no one can tell when it's gonna fail. For example, even to this day, GPT sometimes gives me wrong calculations, even when it is instructed to use a calculator for that. Who knows why it ignores the instruction, or why it can't reliably perform the addition of two integers. That really screws up the automation.
Anyways, I'm really tired of skeptic-skeptics. I hate that some people believe "half-full" is genuinely better than "half-empty". I reject that idea completely. It's more about which context you're in. If you need exactly a cup of water, it's half-empty. If you are lucky to have some water at all, it's half-full. If you have a clear direction you want to go, you discover a lot of deficiencies. If you're just there enjoying the moment, yeah, you can keep enjoying the new toy.
> Our jobs are just as much in tech’s line of fire as everybody else’s have been for the last 3 decades. We’re not East Coast dockworkers; we won’t stop progress on our own.
If you really believe in the power of LLMs then it’s time to wake up. The writing is on the wall. Automation is pushing the workforce further into precarious jobs.
The idea that tech workers can’t stop so-called “progress” is at best a dumb self-fulfilling prophecy. Our workplaces depend on us. We have the power to put the brakes on whatever our employers are up to by organizing and striking.
Tech workers should be organizing to prepare for the profit-taking moves management has in store for us as the tech gets better and better. If LLMs really live up to their potential, it’s just going to get worse from here.
They keep bragging about how many people are going to lose their jobs and they mean us. They’re bragging about firing us! It’s foolish for us to sit idly by while we are the ones who make them their profits.
> We have the power to put the brakes on whatever our employers are up to by organizing and striking.
You have the power to do that in your own isolated environment, but it's a delaying tactic at best.
This whole thing is as inevitable as TV, smartphones, airplanes, tanks or guns. Everyone will use them, because they're so much better than anything that came before that there's no competition for the same use cases.
> Tech workers should be organizing to prepare for the profit-taking moves management has in store for us
I think you think this is going to help tech workers, but raising the cost of employing humans is only going to incentivize companies to adopt AI faster.
You should do the opposite and tax AI use. Though that's probably rife with potential issues too. I think as unemployment increases we're going to have to make a bad decision somewhere... and I'm guessing taxing AI is the lesser evil.
Historically many unions were formed during times of mass automation. Unions don’t unilaterally drive up wages. They increase workers’ negotiating power. This is needed to help workers best navigate the times ahead.
The time for tough decisions is today. There is no interest from the companies that control our politicians in supporting taxes, basic income, or whatever other policy proposal people think will occur after AI. Worker leverage will only be diminished after mass layoffs. We will end up in a situation of social unrest. To pull out of that workers will be forced to organize to increase negotiating power. Sooner is better than later, if we can realize it’s in our interests to do so.
This is the cost of getting used to outrageously high salaries as compared to other engineering fields. Market forces allowed you to inflate your income beyond reason, now the same forces are taking it away. Turns out decoupling your pay from value delivered wasn't such a good idea after all.
If work can meaningfully be done by a machine, why should we be fighting against that? For jobs? If life is all about doing pointless work a machine can do, we've got problems as a species.
The problem is capitalism not LLMs. Fighting against this progress just leaves us stagnant as a species.
Perhaps AI will lead to the revolution where automation frees the average person from having to toil in a factory or a cubicle farm, instead of trying to achieve infinite growth in a closed system.
I was recently "pair-vibe-coding" with someone who's much 'smarter' than I am, certainly when it comes to coding, especially Python. He's always been in the LLM skeptic camp, and it was fascinating to see that because of his extensive Python knowledge, his prompting was actually very minimal and weak, one might even say 'lazy'. The output he got from o3 for our given task was therefore really mediocre, and had a few hallucinations (which could've been avoided if he spent a few more seconds or minutes prompting).
I, knowing far less than him, would've had a much more elaborate prompt, and o3 would've proved a lot more competent/capable. Yet with my friend, since he knows so much already, and has such a high bar, he thinks the AI should be able to do a lot more with just a few basic words in a prompt... yet, for that same reason, he (understandably) doubts the inevitable sub-par output.
That's what makes all these debates about "Why are smart people doubting LLMs??" so pointless. The smarter you are, the less help you need, so the less prompting you do, the less context the model has, the less impressive the output, and the more the smart person thinks LLMs suck. With this logic, of course the smartest people are also the biggest skeptics!
I doubt this holds true generally. The smart coders I know who are also LLM users generally develop a decent intuition for what the models are good and bad at, and how to steer them into good performance.
Then perhaps my friend has remained a skeptic for so long that he's atrophied in this regard (which OP's post touches on). Either way, most of his day job is as a CTO/manager at a startup, so he's not in the weeds coding as much anymore in the first place. I should've seen how he prompts LLMs for managerial tasks, then I'd know whether his 'prompt laziness' was systemic or influenced by his coding knowledge.
Throwing my two cents in here...I think there's a disconnect between what AI advocates want, and what everyone else wants.
The arguments against genAI tend to point out things like:
1. Its output is unreliable at best
2. That output often looks correct to an untrained eye and requires expert intervention to catch serious mistakes
3. The process automates away a task that many people rely on for income
And the response from genAI advocates tends to be dismissive...and I suspect it is, in part, because that last point is a positive for many advocates of genAI. Nobody wants to say it out loud, but when someone on Reddit or similar claims that even a 10% success rate outweighs the 90% failure rate, what they mean is most likely "A machine that works 10% of the time is better than a programmer who works 60-80% of the time because the machine is more than 6-to-8-times cheaper than the programmer".
There's also the classic line about how automation tends to create more jobs in the future than it destroys now, which itself is a source of big disconnects between pro-genAI and anti-genAI crowds--because it ignores a glaring issue: Just because there's gonna be more jobs in the future, doesn't mean I can pay rent with no job tomorrow!
"You can write an effective coding agent in a week" doesn't reassure people because it doesn't address their concerns. You can't persuade someone that genAI isn't a problem by arguing that you can easily deploy it, because part of the concern is that you can easily deploy it. Also, "you’re not doing what the AI boosters are doing" is flat-out incorrect, at least if you're looking at the same AI boosters I am--most of the people I've seen who claim to be using generated code say they're doing it with Claude, which--to my knowledge--is just an LLM, albeit a particularly advanced one. I won't pretend this is anything but anecdata, but I do engage with people who aren't in the "genAI is evil" camp, and...they use Claude for their programming assistance.
"LLMs can write a large fraction of all the tedious code you’ll ever need to write" further reinforces this disconnect. This is exactly why people think this tech is a problem.
The entire section on "But you have no idea what the code is!" falls apart the moment you consider real-world cases, such as [CVE-2025-4143](https://nvd.nist.gov/vuln/detail/cve-2025-4143), where a programmer who is a self-described expert working with Claude--who emphasizes that he checked over the results with a fine-toothed comb, and that he did this to validate his own skepticism about genAI!--missed a fundamental mistake in implementing OAuth that has been common knowledge for a long while. The author is correct in that reading other people's code is part of the job...but this is difficult enough when the thing that wrote the code can be asked about its methods, and despite advances in giving LLMs a sort of train of thought, the fact remains that LLMs are designed to output things that "look truth-y", not things that are logically consistent. (Ah, but we're not talking about LLMs, even though kentonv tells us that he just used an LLM. We're talking about agentic systems. No true AI booster would "just" use an LLM...)
I actually agree with the point about how the language can catch and point out some of the errors caused by hallucination, but...I can generate bad function signatures just fine on my own, thank you! :P In all seriousness, this addresses basically nothing about the actual point. The problem with hallucination in a setting like this isn't "the AI comes up with a function that doesn't exist", that's what I'm doing when I write code. The problem with hallucination is that sometimes that function which doesn't exist is my RSA implementation, and the AI 'helpfully' writes an RSA implementation for me, a thing that you should never fucking do because cryptography is an incredibly complex thing that's easy to fuck up and hard to audit, and you really ought to just use a library...a thing you [also shouldn't leave up to your AI.](https://www.theregister.com/2025/04/12/ai_code_suggestions_s...) You can't fix that with a language feature, aside from having a really good cryptography library built into the language itself, and as much as I'd love to have a library for literally everything I might want to do in a language...that's not really feasible.
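(To make the "just use a library" point concrete: in Python the sane version of that RSA scenario is a few calls into a vetted package like cryptography, never a hand-rolled implementation. A minimal, purely illustrative sketch:)

    # Minimal sketch using the widely-audited `cryptography` package (pip install cryptography)
    from cryptography.hazmat.primitives import hashes
    from cryptography.hazmat.primitives.asymmetric import padding, rsa

    private_key = rsa.generate_private_key(public_exponent=65537, key_size=2048)
    public_key = private_key.public_key()

    message = b"sign me, don't reimplement me"
    pss = padding.PSS(mgf=padding.MGF1(hashes.SHA256()), salt_length=padding.PSS.MAX_LENGTH)

    signature = private_key.sign(message, pss, hashes.SHA256())
    # raises cryptography.exceptions.InvalidSignature if the message was tampered with
    public_key.verify(signature, message, pss, hashes.SHA256())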
"Does an intern cost $20/month? Because that’s what Cursor.ai costs," says the blog author, as if that's supposed to reassure me. I'm an intern. My primary job responsibility is getting better at programming so I can help with the more advanced things my employer is working on (for the record, these thoughts are my own and not those of my employer). It does not make me happy to know that Cursor.ai can replace me. This also doesn't address the problem that, frankly, large corporations aren't going to replace junior developers with these tools; they're going to replace senior developers, because senior developers cost more. Does a senior engineer cost 20 dollars a month? Because that's what Cursor.ai costs!
...and the claim that open source is just as responsible for taking jobs is baffling. "We used to pay good money for databases" is not an epic own, it is a whole other fucking problem. The people working on FOSS software are in fact very frustrated with the way large corporations use their tools without donating so much as a single red cent! This is a serious problem! You know that XKCD about the whole internet being held up by a project maintained by a single person in his free time? That's what you're complaining about! And that guy would love to be paid to write code that someone can actually fucking audit, but nobody will pay him for it, and instead of recognizing that the guy ought to be supported, you argue that this is proof that nobody else deserves to be supported. I'm trying to steelman this blogpost, I really am, but dude, you fundamentally have this point backwards.
I hope this helps others understand why this blogpost doesn't actually address any of my concerns, or the concerns of other people I know. That's kind of the best I can hope for here.
> 2. That output often looks correct to an untrained eye and requires expert intervention to catch serious mistakes
The thing is this is true of humans too.
I review a lot of human code. I could easily imagine a junior engineer creating CVE-2025-4143. I've seen worse.
Would that bug have happened if I had written the code myself? Not sure, I'd like to think "no", but the point is moot anyway: I would not have personally been the one to write that code by hand. It likely would have gone to someone more junior on the team, and I would have reviewed their code, and I might have forgotten to check for this all the same.
In short, whether it's humans or AI writing the code, it was my job to have reviewed the code carefully, and unfortunately I missed here. That's really entirely on me. (It's particularly frustrating for me as this particular bug was on my list of things to check for and somehow I didn't.)
> 3. The process automates away a task that many people rely on for income
At Cloudflare, at least, we always have 10x more stuff we want to work on than we have engineers to work on it. The number of engineers we can hire is basically dictated by revenue. If each engineer is more productive, though, then we can ship features faster, which hopefully leads to revenue growing faster. Which means we hire more engineers.
I realize this is not going to be true everywhere, but in my particular case, I'm confident saying that my use of AI did not cause any loss of income for human engineers, and likely actually increased it.
I mean, fair. It's true that humans aren't that great at writing code that can't be exploited, and the blogpost makes this point too: between a junior engineer's output and an LLM's output, the LLM does the same thing for cheaper.
I would argue that a junior engineer has a more valuable feature--the ability to ask that junior engineer questions after the fact, and ideally the ability to learn and eventually become a senior engineer--but if you're looking at just the cost of a junior engineer doing junior engineer things...yeah, no, the LLM does it more efficiently. If you assume that the goal is to write code cheaper, LLMs win.
However, I'd like to point out--again--that this isn't going to be used to replace junior engineers, it's going to be used to replace senior engineers. Senior engineers cost more than junior engineers; if you want each engineer to be more productive per-dollar (and assume, like many shareholders do, that software engineers are fungible) then the smart thing to do is replace the more costly engineer. After all, the whole point of AI is to be smart enough to automate things, right?
You and I understand that a senior engineer's job is very different from a junior engineer's job, but a stockholder doesn't--because a stockholder only needs to know how finance works to be a successful stockholder. Furthermore, the stockholder's goal is simply to make as much money as possible per quarter--partly because he can just walk out if the company starts going under, often with a bigger "severance package" than any of the engineers in the company. The incentives are lined up not only for the stockholder to not know why getting rid of senior engineers is a bad idea, but to not care. Were I in your position, I would be worried about losing my job, not because I didn't catch the issue, but because, to the people making that call, a senior engineer looks like the expensive line item the tool exists to replace.
Aside: Honestly, I don't really blame you for getting caught out by that bug. I'm by no means an expert on anything to do with OAuth, but it looks like the kind of thing that's a nightmare to catch, because it's misbehavior under the kind of conditions that are--well, only seen when maliciously crafted. If it wasn't something that was known about since the RFC, it would probably have taken a lot longer for someone to find it.
It's also interesting how these takes consistently and spectacularly ignore the environmental cost as well.
I'm a bit on the fence myself, as I think it's very harmful, but I can also see ways it can be useful. But it's absolutely mindblowing how this is nearly always completely out of the discussion even though our current way of living and powering things is on a timer and we still haven't addressed it as a whole.
the “environmental costs” argument is based upon like fifteen false assumptions. by 2030, AI will be running almost entirely on solar+wind+battery, because that is by far the cheapest option
As someone who thinks modern copyright law is unfit for purpose, and more specifically that software shouldn't be copyrightable at all, my main complaint about LLMs is the same as Chinese spying and IP theft: I only care because they don't share.
I'm perfectly fine using open-weights LLMs[0] to do things LLMs are good at. But I would never trust using OpenAI or Perplexity's SaaSS models as a replacement for my own skills. Half the reason they put things behind a server is so they can spy on you[1], the other half is so they can promise a revenue stream to the people they plan to sell the company to so they can enshittify it. The only enshittification-resistant software is software people can't charge you for.
All the other things this blog post is complaining about are real reactions to real harms, even in the FOSS community. People use LLMs as skilled labor substitute and get burned, while creating a spam problem for everyone else. In fact, that's the real replacement threat for artists: diffusion models don't so much outperform human artists as much as they convince people not to pay for art.
[0] I don't give two flying fucks about the license terms.
[1] The judge in the NYT lawsuit is even ordering spying for evidentiary reasons
We are beyond the point of trying to convince naysayers.
I will simply not hire anybody who is not good at using LLMs, and I don't think I would ever work with anybody who thinks they aren't very useful. It's like working with somebody who thinks compilers are useless. Obviously wrong, not worth spending time trying to convince.
To anyone who reads this article and disagrees with the central point: You are missing the most important thing that will happen in your career. You should reevaluate because you will be unemployable in a few years.
I don't think most people with mixed feelings about LLMs (or heretic naysayers, as you put it) would want to work in a place like that, so perhaps you are doing everyone a favour!
I think this is a reasonable response. But I also think it's worth taking the parent's compiler analogy seriously as a thought experiment.
Back when I was in college in the 00s, if I had developed a preference for not using compilers in my work, I might have been able to build a career that way, but my options would have been significantly limited. And that's not because people were just jerks who were biased against compiler skeptics, or evil executives squeezing the bottom line, or whatever. It's because the kind of software most people were making at that period of time would have been untenable to create without higher level languages.
In my view, we clearly aren't at this point yet with llm-based tooling, and maybe we never will be. But it seems a lot more plausible to me that we will than it did a year or even six months ago.
It reminds me of many of the people I worked with early in my career.
They were opposed to C++ (they thought C was all you need), opposed to git (they used IBM clearcase or subversion), opposed to putting internal tools in a web browser (why not use Qt and install the tool), opposed to using python or javascript for web services (it's just a script kiddie language), opposed to sublime text/pycharm/vscode (IDEs are for people who don't know how to use a CLI).
I have encountered it over and over, and each time these people get stuck in late career jobs making less than 1/3 of what most 23 year old SWEs I know are making.
I apologize. I genuinely did not mean snark, nor to be dismissive. It was a sincere honest-to-god "oh wow this actually explains the situation over there" mixed with an expression of frustration with their service quality.
A lot of their pages barely work and feel like first drafts, much of the UI is even labeled as a work in progress, and many tools regularly just fail to do what they're supposed to.
Deployments are routinely broken, with Twitter lit up like a light bulb about it while their status page remains all services operational.
I deploy three completely separate applications with Fly.io and have regular unscheduled downtime.
I stick with them despite all this because they're cheap and easily scale regular Go apps down to zero which is wonderful for apps that see intermittent/irregular usage patterns.
The fact that they are adamant defenders of vibe coding going so far in the post as to say "who cares about hallucinations" genuinely explains a fair deal about their unprofessional feeling service.
This reads even more like an angry teenager than my angsty high school diary. I'm not sure how many more strawmans and dismissive remarks I can handle in one article.
"Please don't post shallow dismissals, especially of other people's work. A good critical comment teaches us something."
"When disagreeing, please reply to the argument instead of calling names. 'That is idiotic; 1 + 1 is 2, not 3' can be shortened to '1 + 1 is 2, not 3."
Is there a similar rule for submissions, or are submitters exempt from adopting HN culture? "Please don't submit shallow arguments, especially criticizing other people's work"?
Because we've recently been getting a series of low quality submissions which obviously drive low quality discussion, but for some reason it's only the commenters who get reprimanded, as if the submissions weren't ruining the mood one by one.
(And to clarify, I haven't been warned, I'm not writing this out of spite.)
To your first point, I think it’s just a distinction our industry hasn’t figured out yet. There are some stages in projects that some people just don’t like. Similar to how you have rough carpentry vs finish. Some like building the structure, others like the details and polish.
> Interestingly, I've noticed that a lot of the people I meet that both find coding tedious and are extremely excited by AI coding, are generally over 40.
I don't think it's the coding that they find tedious.
I think you missed the part that's not exciting. When you write your 50th webapp and start integrating auth flows once again that just work slightly differently in this new app than your older one. That's what boils the blood, not writing new business logic.
Right, I guess that's what I mean I've side-stepped so far. I don't generally work on anything web related and haven't had to repeatedly work on similar projects over and over again. I've maybe just misunderstood what most other developers actually work on in the industry.
I love coding. I’m not excited by AI coding but I am curious. I find it useful in quite limited ways.
I have recently started an Ansible project where Aider helped me get off the ground and expand a little bit. Then it made a mess out of it so some other day I’ll have to find the time to start over.
It was useful though, and I’ll use some of the bits created by the LLM to kickstart it in anger next time around.
I am pretty exhausted with reading the cynical takes on AI lately. Like, I get it, but ugh.
Yes, the tech isn't perfect yet. Yes, it means existing industries will be displaced. Yes, it threatens to undermine the identities we have built as useful artists, programmers, data analysts, whatever. It is a scary, uncertain future, and on some days I am overcome with dread about what it means for me.
But ffs, we will figure it out. This technology, if it continues to improve at the pace it has, will open the door to us solving previously intractable problems - problems like curing cancer, endless energy and nuclear fusion, making us 100x more productive at building great housing and infrastructure. Let's get on board and make this thing a success, not fight it every step of the way because we want to preserve some basically arbitrary state of society.
Writing for the ages. I've found most of the LLM skeptics are either being hypocritical or just being gate-keepy (we don't want everyone to write code).
I find the AI proponents have an insane level of egocentrism
They cannot possibly imagine someone has a different use case where the AI didn't work
"I crank out shitty webapps all day, therefore every single other dev does. Everyone obviously has the same use case as me because I am the center of the universe"
I was a 3-4x programmer before. Now I’m a 9-15x programmer when wrangling LLMs.
This is a sea change and it’s already into “incredible” territory and shows no signs of slowing down.
> Think of anything you wanted to build but didn’t. You tried to home in on some first steps. If you’d been in the limerent phase of a new programming language, you’d have started writing. But you weren’t, so you put it off, for a day, a year, or your whole career.
I have been banging out little projects that I have wanted to exist for years but always had on the back burner. Write a detailed README and ask the agent to interrogate you about the missing parts of the spec, then update the README. Then have it make a TODO list and start implementing. Give it another code base as a style guide.
I’ve made more good and useful and working code in the last month than I have in the last two years.
I can’t tell if you’re being sarcastic or not, but if you are, the real world is not far behind. I can imagine a world where a mixture of AI agents (some doing hypercritical code review) can return you tested and idiomatic PRs faster than you can describe the new architecture in issues.
I think a lot of people are unfamiliar with the (expensive) SOTA.
I'm not sure about giving specific metrics or KPIs for efficiency or performance.
It definitely feels different to develop using LLMs, especially things from scratch. At this point, you can't just have the LLM do everything. Sooner or later you need to start intervening more often, and as the complexity of the project grows, so does the attention you need to give to guiding the LLM. At that point the main gains are mostly in typing and quickly looking some things up, which are still really nice gains
I hate how the discourse around LLM-assisted programing is so polarized. It's either detractors saying it's "a fad that's useless and going nowhere, wasting billions of megawatts every year" or it's true believers calling it "the most miraculous sea change technological advancement in my lifetime" or "more important than fire and electricity[1]." There just doesn't seem to be any room in the middle.
I tried out Copilot a few months back to see what all the fuss was about and so that I could credibly engage with discussions having actually used the technology. I'd rate it as "kind of neat-o" but not earth shattering. It was like the first time I used an IDE with auto-complete. Oh, cool, nice feature. Would I pay monthly for it? No way. Would I integrate it into my development workflow if it were free? Maybe, I guess? Probably wouldn't bother unless it came literally set up for me out of the box like autocomplete does nowadays.
Don't get me wrong--it's cool technology. Well done, AI people. Is it "the 2nd most important thing to happen over the course of my career" as OP wrote? Come on, let's come down to earth a little.
I don't know. I think 9-12 months ago I'd agree with you. But I feel like the last 6 months my productivity has vastly improved. Not only that, it's also brought back a little bit of passion for the field.
It's easy to come up with some good ideas for new project, but then not want to do a lot of the garbage work related to the project. I offload all that shit to the LLM now.
Seriously, the LLMs have increased my productivity 2-4x.
Copilot is a bad yardstick. The article literally addresses exactly this. It’s not just “cool technology”, that’s the point. It enables things that were previously impossible.
I spent $600 on claude via cursor last month and it was easily worth 2-3x that.
It means cranking out hello world even faster, I guess. I wonder how complex all these projects are that people are proud to have completed with the help of AI.
It depends on the value of x. Reading "3-4x" as 3 - 4x, it's safe to assume x <= 0.75, else they'd have been contributing negatively to their teams (happens from time to time, but let's be generous). Previously they'd be anywhere from a 0/10 to a 3/10 programmer, and now, at "9-15x", they get up to 9/10 on a good day but are sometimes a net negative, as low as -2.25/10 (that's 9 - 15*0.75) on a bad day. I imagine that happens when tired or distracted and unable to adequately police LLM output.
It’s a riff on the “10x programmer” concept. People who haven’t worked with 10x programmers tend to not believe they exist.
I’m nowhere near that, but even unaided I’m quite a bit faster than most people I’ve hired or worked with. With LLMs my high quality output has easily tripled.
Writing code may be easier than reading it - but reading it is FASTER than writing it. And that’s what matters.
I think this article is pretty spot on — it articulates something I’ve come to appreciate about LLM-assisted coding over the past few months.
I started out very sceptical. When Claude Code landed, I got completely seduced — borderline addicted, slot machine-style — by what initially felt like a superpower. Then I actually read the code. It was shockingly bad. I swung back hard to my earlier scepticism, probably even more entrenched than before.
Then something shifted. I started experimenting. I stopped giving it orders and began using it more like a virtual rubber duck. That made a huge difference.
It’s still absolute rubbish if you just let it run wild, which is why I think “vibe coding” is basically just “vibe debt” — because it just doesn’t do what most (possibly uninformed) people think it does.
But if you treat it as a collaborator — more like an idiot savant with a massive brain but no instinct or nous — or better yet, as a mech suit [0] that needs firm control — then something interesting happens.
I’m now at a point where working with Claude Code is not just productive, it actually produces pretty good code, with the right guidance. I’ve got tests, lots of them. I’ve also developed a way of getting Claude to document intent as we go, which helps me, any future human reader, and, crucially, the model itself when revisiting old code.
What fascinates me is how negative these comments are — how many people seem closed off to the possibility that this could be a net positive for software engineers rather than some kind of doomsday.
Did Photoshop kill graphic artists? Did film kill theatre? Not really. Things changed, sure. Was it “better”? There’s no counterfactual, so who knows? But change was inevitable.
What’s clear is this tech is here now, and complaining about it feels a bit like mourning the loss of punch cards when terminals showed up.
[0]: https://matthewsinclair.com/blog/0178-why-llm-powered-progra...
One of the things I think is going on here is a sort of stone soup effect. [1]
Core to Ptacek's point is that everything has changed in the last 6 months. As you and I presume he agree, the use of off-the-shelf LLMs in code was kinda garbage. And I expect the skepticism he's knocking here ("stochastic parrots") was in fact accurate then.
But it did get a lot of people (and money) to rush in and start trying to make something useful. Like the stone soup story, a lot of other technology has been added to the pot, and now we're moving in the direction of something solid, a proper meal. But given the excitement and investment, it'll be at least a few years before things stabilize. Only at that point can we be sure about how much the stone really added to the soup.
Another counterfactual that we'll never know is what kinds of tooling we would have gotten if people had dumped a few billion dollars into code tool improvement without LLMs, but with, say, a lot of more conventional ML tooling. Would the tools we get be much better? Much worse? About the same but different in strengths and weaknesses? Impossible to say.
So I'm still skeptical of the hype. After all, the hype is basically the same as 6 months ago, even though now the boosters can admit the products of 6 months ago sucked. But I can believe we're in the middle of a revolution of developer tooling. Even so, I'm content to wait. We don't know the long term effects on a code base. We don't know what these tools will look like in 6 months. I'm happy to check in again then, where I fully expect to be again told: "If you were trying and failing to use an LLM for code 6 months ago †, you’re not doing what most serious LLM-assisted coders are doing." At least until then, I'm renewing my membership in the Boring Technology Club: https://boringtechnology.club/
[1] https://en.wikipedia.org/wiki/Stone_Soup
> Core to Ptacek's point is that everything has changed in the last 6 months.
This was actually the only point in the essay with which I disagree, and it weakens the overall argument. Even 2 years ago, before agents or reasoning models, these LLMs were extremely powerful. The catch was, you needed to figure out what worked for you.
I wrote this comment elsewhere: https://news.ycombinator.com/item?id=44164846 -- Upshot: It took me months to figure out what worked for me, but AI enabled me to produce innovative (probably cutting edge) work in domains I had little prior background in. Yes, the hype should trigger your suspicions, but if respectable people with no stake in selling AI like @tptacek or @kentonv in the other AI thread are saying similar things, you should probably take a closer look.
Almost by definition, one should be skeptical about hype. So we’re all trying to sort out what is being sold to us.
Different people have different weird tendencies in different directions. Some people irrationally assume that things aren’t going to change much. Others see a trend and irrationally assume that it will continue on a trend line.
Synthesis is hard.
Understanding causality is even harder.
Savvy people know that we’re just operating with a bag of models and trying to choose the right combination for the right situation.
This misunderstanding is one reason why doomers, accelerationists, and “normies” talk past each other or (worse) look down on each other. (I’m not trying to claim epistemic equivalence here; some perspectives are based on better information, some are better calibrated than others! I’m just not laying out my personal claims at this point. Instead, I’m focusing on how we talk to each other.)
Another big source of misunderstanding is about differing loci of control. People in positions of influence are naturally inclined to think about what they can do, who they know, and where they want to be. People farther removed feel relatively powerless and tend to hold onto their notions of stability, such as the status quo or their deepest values.
Historically, programmers have been quite willing to learn new technologies, but now we’re seeing widespread examples where people’s plasticity has limits. Many developers cannot (or are unwilling to) wrap their minds around the changing world. So instead of confronting the reality they find ways to deny it, consciously or subconsciously. Our perception itself is shaped by our beliefs, and some people won’t even perceive the threat because it is too strange or disconcerting. Such is human nature: we all do it. Sometimes we’re lucky enough to admit it.
I’m an amateur coder and I used to rely on Cursor a lot to code when I was actively working on hobby apps about 6 months ago
I picked coding again a couple of days back and I’m blown away by how much things have changed
It was all manual work until a few months back. Suddenly, it's all agents.
"nother counterfactual that we'll never know is what kinds of tooling we would have gotten if people had dumped a few billion dollars into code tool improvement without LLMs, but with, say, a lot of more conventional ML tooling. Would the tools we get be much better? Much worse? About the same but different in strengths and weaknesses? Impossible to say."
You'll not only never know this, it's IMHO not very useful to think about at all, except as an intellectual exercise.
I wish I could impress this upon more people.
A friend similarly used to lament/complain that Kotlin sucked in part because we could probably have accomplished its major features in Java, and maybe without tons of work or migration cost.
This is maybe even true!
As an intellectual exercise, both are interesting to think about. But outside of that, people get caught up in this as if it matters, but it doesn't.
Basically nothing is driven by pure technical merit alone, not just in CS, but in any field. So my point to him was the lesson to take away from this is not "we could have been more effective or done it cheaper or whatever" but "my definition of effectiveness doesn't match how reality decides effectiveness, so i should adjust my definition".
As much as people want the definition to be a meritocracy, it just isn't and honestly, seems unlikely to ever be.
So while it's 100% true that billions of dollars dumped into other tools or approaches or whatever may have generated good, better, maybe even amazing results, those results weren't generated, and more importantly, never would have been. Unknown but maybe infinite ROI is often much more likely to see investment than a known but maybe only 2x ROI.
And like I said, this is not just true in CS, but in lots of fields.
That is arguably quite bad, but also seems unlikely to change.
The better I am at solving a problem, the less I use AI assistants. I use them if I try a new language or framework.
Busy code I need to generate is difficult to do with AI too, because then you need to formalize the necessary context for an AI assistant, which is exhausting and gives an unsure result. So perhaps it is just simpler to write it yourself quickly.
I understand comments being negative, because there is so much AI hype without that many practical applications yet, or at least not many good ones. Some of that hype is justified, some of it is not. I enjoyed the image/video/audio synthesis hype more, tbh.
Test cases are quite helpful and comments are decent too. But often prompting is more complex than programming something. And you can never be sure if any answer is usable.
> But often prompting is more complex than programming something.
I'd challenge this one; is it more complex, or is all the thinking and decision making concentrated into a single sentence or paragraph? For me, programming something is taking a big, high-level problem and breaking it down into smaller and smaller sections until it's a line of code; the lines of code are relatively low effort and cost little brain power. But in my experience, the problem itself and its nuances are only defined once all the code is written. If you have to prompt an AI to write it, you need to define the problem beforehand.
It's more design and more thinking upfront, which is something the development community has moved away from in the past ~20 years with the rise of agile development and open source. Techniques like TDD have shifted more of the problem definition forwards as you have to think about your desired outcomes before writing code, but I'm pretty sure (I have no figures) it's only a minority of developers that have the self-discipline to practice test-driven development consistently.
(disclaimer: I don't use AI much, and my employer isn't yet looking into or paying for agentic coding, so it's chat style or inline code suggestions)
I agree with your points, but I'm also reminded of one of my bigger learnings as a manager - the stuff I'm best at is the hardest, but most important, to delegate.
Sure it was easier to do it myself. But putting in the time to train, give context, develop guardrails, learn how to monitor, etc. ultimately taught me the skills needed to delegate effectively and multiply the team's output massively as we added people.
It's early days but I'm getting the same feeling with LLMs. It's as exhausting as training an overconfident but talented intern, but if you can work through it and somehow get it to produce something as good as you would do yourself, it's a massive multiplier.
> But often prompting is more complex than programming something.

It may be more complex, but it is in my opinion better long term. We need to get good at communicating with AIs to get the results we want. Forgive me for assuming that you probably didn't use these assistants long enough to get good at using them. I've been a web developer for 20 years already, and AI tools are multiplying my output even on problems I'm very good at. And they are getting better very quickly.
> The better I am at solving a problem, the less I use AI assistants.
Yes, but you're expensive.
And these models are getting better at solving a lot of business-relevant problems.
Soon all business-relevant problems will be bent to the shape of the LLM because it's cost-effective.
I totally agree. The “hard to control mech suit” is an excellent analogy.
When it works it’s brilliant.
There is a threshold point on the learning curve where you realize you are in a pile of spaghetti code and think it actually saves no time to use an LLM assistant.
But then you learn to avoid the bad parts - so they don't take your time anymore - and the good parts start paying back the time spent learning, in heaps.
They are not zero effort tools.
There is a non-trivial learning cost involved.
The issue is we’re too early in the process to even have a solid education program for using LLMs. I use them all the time and continue to struggle finding an approach that works well. It’s easy to use them for documentation look up. Or filling in boilerplate. Sometimes they nail a transformation/translation task, other times they’re more trouble than they’re worth.
We need to understand what kind of guard rails to put these models on for optimal results.
Also, the agents are actually pretty good at cleaning up spaghetti if you do it one module at a time and use unit tests. And some of the models are smart enough to suggest good organization schemes!
For what it's worth: I'm not dismissive of the idea that these things could be ruinous for the interests of the profession. I don't automatically assume that making applications drastically easier to produce is just going to make way for more opportunities.
I just don't think the interests of the profession control. The travel agents had interests too!
For a long time there has been back chatter about how to turn programming into a more professional field, more like actual engineering, where, when something goes wrong, actual people and companies take security seriously, get held accountable for their mistakes, and actually start to earn their high salaries.
Getting AI to hallucinate its way into secure, better-quality code seems like the antithesis of this. Why don't we have AI and robots working for humanity on the boring menial tasks - mowing lawns, filing taxes, washing dishes, driving cars - instead of attempting to take on our more critical and creative outputs - image generation, movie generation, book writing, and even website building?
As soon as the world realized they don't need a website and can just have a FB/Twitter page, a huge percentage of freelance web development gigs just vanished. We have to get real about what's about to happen. The app economy filled the gap, and the only optimistic case is that the AI app industry is what's going to fill the gap going forward. I just don't know about that. There's a certain end-game vibe I'm getting, because we're talking about self-building and self-healing software. More so, a person can ask the AI to role-play anything, even an app.
It's kind of ironic to me that this is so often the example trotted out. Look at the BLS data sheet for job outlook: https://www.bls.gov/ooh/sales/travel-agents.htm#tab-6
> Employment of travel agents is projected to grow 3 percent from 2023 to 2033, about as fast as the average for all occupations.
The last year for which there is data claims 68,800 people employed as travel agents in the US. It's not a boom industry by any means, but it doesn't appear they experienced the apocalypse that Hacker News believes they did, either.
I don't know how to easily find historical data, unfortunately. BLS publishes the excel sheets, but pulling out the specific category would have to be done manually as far as I can tell. There's this, I guess: https://www.travelagewest.com/Industry-Insight/Business-Feat...
It appears at least that what happened is, though it may be easier than ever to plan your own travel, there are so many more people traveling these days than in the past that the demand for travel agents hasn't crashed.
Let's be real. Software engineers are skeptical right now not because they believe robots are better than them. Quite the opposite. The suits will replace software engineers anyway, despite the technology's mediocrity.
It was just 2 weeks ago that the utter incompetence of these robots was on full public display [1]. But none of that will matter to greedy corporate executives, who will prioritize short-term cost savings. They will hop from company to company, personally reaping the benefits while undermining, with robot slop, the essential systems that users and society rely on. That's part of the reason why the C-suites are overhyping the technology. After all, no rich executive has faced consequences for behaving this way.
It's not just software engineering jobs that will take a hit. Society as a whole will suffer from the greedy recklessness.
[1]: https://news.ycombinator.com/item?id=44050152
The reason I remain in the "skeptical" camp is because I am experiencing the same thing you are - I keep oscillating between being impressed, then disappointed.
Ultimately the thing that impresses me is that LLMs have replaced google search. The thing that disappoints me is that their code is often convincing but wrong.
Coming from a hard-engineering background, anything that is unreliable is categorized as bad. If you come from the move-fast-break-things world of tech, then your tolerance for mistakes is probably a lot higher.
This is a bit tangential, but isn't that partly because google search keeps evolving into a worse resource due to the SEO garbage race?
I have been using Windsurf for a few months and ChatGPT for a couple of years. I don't feel Windsurf is a massive game changer personally. It is good if you are very tired or working in a new area (also good for exploring UI ideas, as the feedback loop is tight), but still not a real game changer over ChatGPT. Waiting around for it to do its thing ("we've encountered an error - no credits used") is boring and flow-destroying. If you know exactly what you are doing, the productivity is probably 0.5x vs just typing the code in yourself. Sorry, I'm not going to bang around in Windsurf all day just to help with the training so that "v2" can be better. They should be paying me for this, realistically.
Of course, in aggregate AI makes me capable in a far broader set of problem domains. It would be tough to live without it at this stage, but needs to be used for what it is actually good at, not what we hope it will be good at.
Have you tried Cursor or Zed? I find they’re both significantly better in their “agent” modes than Windsurf.
> What fascinates me is how negative these comments are — how many people seem closed off to the possibility that this could be a net positive for software engineers rather than some kind of doomsday.
I tried the latest Claude on a very complex wrapper around the AWS Price APIs, which are not easy to work with. Two thousand lines down in the file, I found Claude faking some API returns by creating hard-coded values. A pattern I have seen professional developers get caught on while under pressure to deliver.
This will be a boon to skilled human developers, who will be hired at $900 an hour to fix bugs of a subtlety never seen before.
More or less this. Maybe a job opportunity, but many decision makers won't see the real problem until they get hit by that AWS bill. Ironic if the business won't hire you because it already went out of business.
I mean, that bug doesn't seem very subtle.
The arguments seem to come down to tooling. The article suggests that ChatGPT isn't a good way to interact with LLMs but I'm not so sure. If the greatest utility is "rubber ducking" and editing the code yourself is necessary then tools like Cursor go too far in a sense. In my own experience, Windsurf is good for true vibe coding where I just want to explore an idea and throw away the code. It is still annoying though as it takes so long to do things - ruining any kind of flow state you may have. I am conversing with ChatGPT directly much more often.
I haven't tried Claude Code yet, however. Maybe that approach is more on point.
Totally agree with "vibe debt". Letting an LLM off-leash without checks is a fast track to spaghetti. But with tests, clear prompts, and some light editing, I’ve shipped a lot of real stuff faster than I could have otherwise.
I generally agree with the attitude of the original post as well. But I'll push back on one point: it definitely doesn't cost 20 dollars a month. Cursor.ai might, and I don't know how good it is, but Claude Code costs hundreds of dollars a month. Still cheaper than a junior dev, though.
> Did Photoshop kill graphic artists? Did film kill theatre?
To a first approximation, the answer to both of these is "yes".
There is still a lot of graphic design work out there (though generative AI will be sucking the marrow out of it soon), but far less than there used to be before the desktop publishing revolution. And the kind of work changed. If "graphic design" to you meant sitting at a drafting table with pencil and paper, those jobs largely evaporated. If that was a kind of work that was rewarding and meaningful to you, that option was removed for you.
Theatre even more so. Yes, there are still some theatres. But the number of people who get to work in theatrical acting, set design, costuming, etc. is a tiny tiny fraction of what it used to be. And those people are barely scraping together a living, and usually working side jobs just to pay their bills.
> it feels a bit like mourning the loss of punch cards when terminals showed up.
I think people deserve the right to mourn the loss of experiences that are meaningful and enjoyable to them, even if those experiences turn out to no longer be maximally economically efficient according to the Great Capitalistic Moral Code.
Does it mean that we should preserve antiquated jobs and suffer the societal effects of inefficiency without bound? Probably not.
But we should remember that the ultimate goal of the economic system is to enable people to live with meaning and dignity. Efficiency is a means to that end.
> But the number of people who get to work in theatrical acting, set design, costuming
I think this ends up being recency bias and terminology hair-splitting. The number of people working in theatre mask design went to nearly zero quite a while back, but we still call the stuff in the centuries after that 'theatre' and 'acting'.
> Did Photoshop kill graphic artists?
No, but AI did.
In actual fact, Photoshop did kill graphic arts. There was an entire industry filled with people who had highly developed skillsets that suddenly became obsolete - painters, for example. Before Photoshop, I had to go out of house to get artwork done; now I just do it myself.
This, as the article makes clear, is a concern I am alert and receptive to. Ban production of anything visual from an LLM; I'll vote for it. Just make sure they can still generate Mermaid charts and Graphviz diagrams, so they still apply to developers.
Well, this is only partially true. My optimistic take is that it will redefine the field. There is still a future for resourceful, attentive, and prepared graphic artists.
AI didn't kill creativity or intuition. Rather, it lacks those things completely. Artists can make use of AI, but they can't make themselves obsolete just yet.
It will not.
I'm an engineer through and through. I can ask an LLM to generate images just fine, but for a given target audience, for a certain purpose? I would have no clue. None whatsoever. Ask me to generate an image to use in an advertisement for Nuka Cola, targeting tired parents? I genuinely have no idea where to even start. I have absolutely no understanding of the advertising domain, and I don't know what tired parents find visually pleasing, or what they would "vibe" with.
My feeble attempts would be absolute trash compared to a professional artist who uses AI to express their vision. The artist would be able to prompt so much more effectively and correct the things that they know from experience will not work.
It's the exact same as with coding with an AI - it will be trash unless you understand the hows and the whys.
I agree with the potential of AI. I use it daily for coding and other tasks. However, there are two fundamental issues that make this different from the Photoshop comparison.
The models are trained primarily on copyrighted material and code written by the very professionals who now must "upskill" to remain relevant. This raises complex questions about compensation and ownership that didn't exist with traditional tools. Even if current laws permit it, the ethical implications are different from Photoshop-like tools.
Previous innovations created new mediums and opportunities. Photoshop didn't replace artists, because it enabled new art forms. Film reduced theater jobs but created an entirely new industry where skills could mostly transfer. Manufacturing automation made products like cars accessible to everyone.
AI is fundamentally different. It's designed to produce identical output to human workers, just more cheaply and/or faster. Instead of creating new possibilities, it's primarily focused on substitution. Say AI could eliminate 20% of coding jobs and reduce wages by 30%:
The primary outcome appears to be increased profit margins rather than societal advancement. While previous technological revolutions created new industries and democratized access, AI seems focused on optimizing existing processes without providing comparable societal benefits.
This isn't an argument against progress, but we should be clear-eyed about how this transition differs from historical parallels, and why it might not repeat the same historical outcomes. I'm not claiming this will be the case, but you can see some pretty significant differences, and reasons to be skeptical that the same creation of new jobs, or improvement to human lifestyle and capabilities, will emerge as it did with, say, film or Photoshop.
AI can also be used to achieve things we could not do without it; that's the good use of AI: things like cancer detection, self-driving cars, and so on. I'm speaking specifically of the use of AI to automate and reduce the cost and time of white-collar work like software development.
For me this is the "issue" I have with AI. Unlike say the internet, mobile and other tech revolutions where I could see new use cases or existing use case optimisation spring up all the time (new apps, new ways of interacting, more efficient than physical systems, etc) AI seems to be focused more on efficiency/substitution of labour than pushing the frontier on "quality of life". Maybe this will change but the buzz is around job replacement atm.
It's why it is impacting so many people, but also producing very small changes to everyday "quality of life" kinds of metrics (e.g. ability to eat, communicate, live somewhere, etc). It arguably is more about enabling greater inequality and the gatekeeping of wealth to capital, where intelligence and merit matter less in the future world. For most people it's hard to see where the positives are for them long term in this story; most everyday folks don't believe the utopia story is in any way probable.
> The primary outcome appears to be increased profit margins rather than societal advancement. While previous technological revolutions created new industries and democratized access, AI seems focused on optimizing existing processes without providing comparable societal benefits.
This is the thing that worries me the most about AI.
The author's ramblings dovetail with this a bit in their "but the craft" section. They vaguely attack the idea of code-golfing and focusing on coding for the craft as essentially incompatible with the corporate model of programming work. And perhaps they're right. If they are, though, this AI wave/hype being mostly about process-streamlining and such seems to be a distillation of that fact.
Maybe it's like automation that makes webdev accessible to anyone. You take a week long AI coaching course and talk to an AI and let it throw together a website in an hour, then you self host it.
The key is that manual coding for a normal task takes one or two weeks, whereas if you configure all your prompts/agents correctly you could do it in a couple of hours. As you highlighted, it brings many new issues (code quality, lack of tests, tech debt) and you need to carefully create prompts and review the code to tackle those. But in the end, you can save significant time.
I disagree. I think this notion comes from the idea that creating software is about coding. Automating/improving coding => you have software at the end.
This might be how one looks at it in the beginning, when one has no experience with or idea about coding. With time, one realizes it's more about creating the correct mental model of the problem at hand than about the activity of coding itself.
Once this is realized, AI can't "save" you days of work, as coding is the least time-consuming part of creating software.
Would you have any standard prompts you could share which ask it to make a draft with what you'd want (e.g. unit tests etc.)?
The question is, can I self-host this "mech suit"? If not, I would much rather not use some API hosted by another party.
SaaS just seems very much like a terminator-seed situation in the end.
"Mech suit" is apt. Gonna use that now.
Having plenty of initial discussion, and distilling that into requirements documents aimed at modularized components which can all be easily tackled separately, is key.
This is my experience as well.
I’d add that Excel didn’t kill the engineering field. It made engineers more effective, and maybe companies will need fewer of them. But it also means more startups and smaller shops can make use of an engineer. The change is hard, and an equilibrium will be reached.
> Did Photoshop kill graphic artists?
Desktop publishing software killed many jobs. I worked for a publication where I had colleagues who used to typeset, place images, and use a camera to build pages by hand. That required a team of people. Once QuarkXPress and the like hit the scene, one person could do it all, faster.
In terms of illustration, the tools moved from pen and paper to Adobe Illustrator and Aldus / Macromedia FreeHand. Which I'd argue was more of a sideways move. You still needed an illustrator's skillset to use these tools.
The difference between what I just described and LLM image generation is that the tooling changed to streamline an existing skillset. LLMs replace all of it. Just type something and here's your picture. No art or design skill necessary. Obviously, there's no guarantee that the LLM-generated image will be any good. So I'm not sure the Photoshop analogy works here.
> Then something shifted. I started experimenting. I stopped giving it orders and began using it more like a virtual rubber duck. That made a huge difference.
This is how I use it mostly. I also use it for boilerplate, like "What would a database model look like that handles the following?" You never want it to do everything, though there are tools that can and will, and they're impressive; but then, when you have a true production issue, your inability to respond quickly will be a barrier.
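For what it's worth, the kind of boilerplate I mean is something like the sketch below. It's only an illustration: the table, the fields, and the choice of SQLAlchemy are all assumptions made up for the example, not anything specific; the point is that the model drafts this sort of thing in seconds and I just review and adjust it.

```python
# Hypothetical example of LLM-drafted boilerplate: a database model sketched
# from a one-line prompt, then reviewed by hand. SQLAlchemy and the fields
# below are assumptions for illustration only.
from datetime import datetime

from sqlalchemy import Column, DateTime, Integer, String
from sqlalchemy.orm import declarative_base

Base = declarative_base()


class Project(Base):
    """A user's project, with just enough fields to review and refine."""

    __tablename__ = "projects"

    id = Column(Integer, primary_key=True)
    name = Column(String(200), nullable=False)
    owner_email = Column(String(320), nullable=False, index=True)
    created_at = Column(DateTime, default=datetime.utcnow, nullable=False)
```

Nothing here is hard to write by hand; the win is not having to, while still reading every line before it goes anywhere near production.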
That’s all great news that if you know how to use an LLM, it works wonders for you. But LLMs are changing so fast, can it really be sustainable for me to “learn” it only for it to change and go backwards the next month? (I am thinking about how terrible Google became.)
I’m learning live how to use these things better, and I haven’t seen practical guides like:
- Split things into small files, today’s model harnesses struggle with massive files
- Write lots of tests. When the language model messes up the code (it will), it can use the tests to climb out. Tests are the best way to communicate behavior (see the sketch after this list for the kind of test I mean).
- Write guides and documentation for complex tasks in complex codebases. Use a language model for the first pass if you’re too lazy. Useful for both humans and LLMs
It’s really: make your codebase welcoming for junior engineers
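To make the "climb out" point concrete, here's a minimal sketch of the kind of test I mean. The slugify() helper and the module path are made up for the example; the point is that plain pytest functions pin down behavior precisely enough that an agent has a runnable signal to iterate against.

```python
# test_slugify.py - hypothetical behavior-pinning tests for an assumed
# project helper. Running `pytest` gives the agent a pass/fail signal
# it can loop on after each change.
import pytest

from myproject.text import slugify  # assumed module layout


def test_basic_slug():
    # Lowercases and replaces spaces with hyphens.
    assert slugify("Hello World") == "hello-world"


def test_strips_punctuation_and_collapses_dashes():
    assert slugify("Rock & Roll!!") == "rock-roll"


def test_empty_input_raises():
    # Encodes a deliberate design decision the agent must not "fix" away.
    with pytest.raises(ValueError):
        slugify("")
```

An agent that can run the suite after every edit converges much faster than one working from prose requirements alone.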
> it can use the tests to climb out
Or not. I watched Copilot's agent mode get stuck in a loop for most of an hour (to be fair, I was letting it continue to see how it handles this failure case) trying to make a test pass.
> Did film kill theatre?
Relatively speaking, I would say that film and TV did kill theater
Yes! It needs and seems to want the human to be a deep collaborator. If you take that approach, it is actually a second senior developer you can work with. You need to push it, and explain the complexities in detail to get fuller rewards. And get it to document everything important it learns from each session's context. It wants to collaborate to make you a 10X coder, not to do your work for you while you laze. That is the biggest breakthrough I have found. They basically react like human brains, with the same kind of motives. Their output can vary dramatically based on the input you provide.
i love your views and way to express it, spot on. i feel similar in some ways. i hated ai, loved ai, hated it again and love it again. i still feel the code is unusable for my main problems, but i realize better its my arrogance that causes it. i cant formulate solutions eloquently enough and blame the AI for bad code.
AI has helped me pick up my pencil and paper again and realize my flawed knowledge, skills, and even flawed approach to AI.
Now i instructed it to never give me code :). not because the code is bad, but my attempts to extract code from it are more based in laziness than efficiency. they are easy to confuse afterall ;(....
I have tons of fun learning with AI, exploring. going on adventures into new topics. Then when i want to really do something, i try to use it for the things i know i am bad at due to laziness, not lack of knowledge. the thing i fell for first...
it helps me explore a space, then i think or am inspired for some creation, and it helps me structure and plan. when i ask it from laziness to give me the code, it helps me overcome my laziness by explaining what i need to do to be able to see why asking for the code was the wrong approach in the first place.
now, that might be different for you. but i have learned i am not some god tier hacker from the sprawl, so i realized i need to learn and get better. perhaps you are at the level you can ask it for code and it just works. hats off in that case ;k (i do hope you tested well!)
Photoshop etc. are still just tools. They can't beat us at what has always set us apart: thinking. LLMs are the closest, and while they're not close, they're directionally correct. They're general purpose, not like chess engines. And they improve. It's hard to predict a year out, never mind ten.
I agree, this article is basically what I've been thinking as I play with these things over time. They've gotten a ton better but the hot takes are still from 6-12 months ago.
One thing I wish he would have talked about though is maintenance. My only real qualm with my LLM agent buddy is the tendency to just keep adding code if the first pass didn't work. Eventually, it works, sometimes with my manual help. But the resulting code is harder to read and reason about, which makes maintenance and adding features or behavior changes harder. Until you're ready to just hand off the code to the LLM and not do your own changes to it, it's definitely something to keep in mind at minimum.
> Did Photoshop kill graphic artists? Did film kill theatre? Not really. Things changed, sure. Was it “better”?
My obligatory comment that analogies are not good for arguments: there is already discussion here that film (etc.) may have killed theatre.
I am pretty sure this comment is also AI generated. Just a guess, but so many em-dashes are suspicious. And the overall convincing structure feels uncanny.
If this is true, can you share the initial draft that you asked the AI to rewrite? Am I not right that the initial draft is more concise and better conveys your actual thoughts, even though it's not as convincing?
Definitely. So many people taken in by it!
I think the key is also: don't call it AI, because it's not. It's LLM-assisted query parsing and code generation. Semantically, if you call it AI, the public expects a cognitive equivalent to a human, which this is not, and, from what @tptacek describes, is not meant to be - the reasoning and other code bits to create agents and such seem to be developed specifically for code generation, programming assistance, and other tasks thereof. Viewed through that lens, the article is correct - it is by all means a major step forward.
I agree but that battle is lost. Someone was calling Zapier workflows AI on X.
AGI vs AI is how to separate this these days.
The irony of the ChatGPT em dashes ;3
The entire comment feels way too long, structured, and convincing, in a way that can only be written by an AI. I just hope that once the em-dashes are "fixed", we will still be able to detect such text. I fear for a future when human text is sparse, even here at HN. It is depressing to see such a comment take the top spot.
I use LLMs daily. From helping me write technical reports (not 100%, mostly making things sound better after I have a first draft) to mapping APIs (documentation, etc).
I can only imagine what this technology will be like in 10 years. But I do know that it's not going anywhere and it's best to get familiar with it now.
I treat AI as my digital partner in pair programming. I've learned how to give it specific and well-defined tasks to do, and it gets them done. The narrower the scope and the more specific the task, the more success you'll have.
there’s a sweet spot in there, it’s not “as narrow as possible” - the most productive thing is to assign the largest possible tasks that are just short of the limit where the agents become stupid. this is hard to hit, and a moving target!
Love all of this.
Most importantly, I'll embrace the change and hope for the possible abundance.
LLMs are self-limiting, rather than self-reinforcing, and that's the big reason why they're not the thing, good or bad, that some people think they are.
"Garbage in, garbage out" is still the rule for LLMs. If you don't spend billions training them, or if you let them feed on their own tail too much, they produce nonsense. E.g. some LLMs currently produce better general search results than Google. This is mainly a product of many billions being spent on expert trainers for those LLMs, while Google neglects (or actively enshittifies) its search algorithms shamefully. It's humans, not LLMs, producing these results. How good will LLMs be at search once the money has moved somewhere else and neglect sets in?
LLMs aren't going to take everyone's jobs and trigger a singularity, precisely because they fall apart if they try to feed on their own output. They need human input at every stage. They are going to take some people's jobs and create new ones for others, although it will probably be more of the former than the latter, or billionaires wouldn't be betting on them.
Yes, film killed theatre.
> Then I actually read the code.
This is my experience in general. People seem to be impressed by the LLM output until they actually comprehend it.
The fastest way to have someone break out of this illusion is tell them to chat with the LLM about their own expertise. They will quickly start to notice errors in the output.
You know who does that also? Humans. I read shitty, broken, amazing, useful code every day, but you don't see me complaining online that people who earn a 100-200k salary don't produce ideal output right away. And believe me, I spend way more time fixing their shit than LLMs'.
If I can reduce this even by 10% for 20 dollars it’s a bargain.
That has not been my experience at all with networking and cryptography.
That proves nothing with respect to the LLMs usefulness, all it means is that you are still useful.
And in the meantime, the people you are competing with in the job market have already become 2x more productive.
Really? I feel like the article pointedly skirted my biggest complaint.
> ## but the code is shitty, like that of a junior developer
> Does an intern cost $20/month? Because that’s what Cursor.ai costs.
> Part of being a senior developer is making less-able coders productive, be they fleshly or algebraic. Using agents well is both a skill and an engineering project all its own, of prompts, indices, and (especially) tooling. LLMs only produce shitty code if you let them.
I hate pair-programming with junior devs. I hate it. I want to take the keyboard away from them and do it all myself, but I can't, or they'll never learn.
Why would I want a tool that replicates that experience without the benefit of actually helping anyone?
The article provides no solid evidence that "AI is working" for the author.
At the end of the day, this article is nothing but another piece of conjecture on Hacker News.
Actually assessing the usefulness of AI would require measurements and controls. Nothing has been proven or disproven here.
the irony with AI sceptics is that their opinions usually sound like they've been stolen from someone else
This article does not touch on the thing which worries me the most with respect to LLMs: the dependence.
Unless you can run the LLM locally, on a computer you own, you are now completely dependent on a remote centralized system to do your work. Whoever controls that system can arbitrarily raise the prices, subtly manipulate the outputs, store and do anything they want with the inputs, or even suddenly cease to operate. And since, according to this article, only the latest and greatest LLM is acceptable (and I've seen that exact same argument six months ago), running locally is not viable (I've seen, in a recent discussion, someone mention a home server with something like 384G of RAM just to run one LLM locally).
To those of us who like Free Software because of the freedom it gives us, this is a severe regression.
Yes, and it's even worse: if you think LLMs may possibly make the world a worse place, you should not use any LLMs you aren't self-hosting, because your usage information is being used by the creators to make LLMs better.
> you should not use any LLMs you aren't self-hosting, because your usage information is being used by the creators to make LLMs better
This sounds a bit like bailing out the ocean.
I think that’s a bit of a leap; if you think LLMs make the world a worse place, there are many actions that you might take or not take to try to address that.
It's also why local models, even if less powerful, are so important. The gap between "state of the art" and "good enough for a lot of workflows" is narrowing fast.
Yeah I am very excited for local models to get good enough to be properly useful. I’m a bit of an AI skeptic I’ll admit, but I’m much more of a SV venture-backed company skeptic. The idea of being heavily reliant on such a company, plus needing to be online, plus needing to pay money just to get some coding done is pretty unpalatable to me.
Especially with MCP programs that can run in Docker containers.
You can get 90%+ of the way there with a tiny “coder” LLM running on the Ollama backend with an extension like RooCode and a ton of MCP tools.
In fact, MCP is so ground breaking that I consider it to be the actual meat and potatoes of coding AIs. Large models are too monolithic, and knowledge is forever changing. Better just to use a small 14b model (or even 8b in some cases!) with some MCP search tools, a good knowledge graph for memory, and a decent front end for everything. Let it teach itself based on the current context.
And all of that can run on an off the shelf $1k gaming computer from Costco. It’ll be super slow compared to a cloud system (like HDD vs SSD levels of slowness), but it will run in the first place and you’ll get *something* out of it.
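Not the parent's exact setup, but a minimal sketch of the local-first idea: Ollama exposes a small HTTP API on localhost, so a few lines of Python are enough to drive whatever small coder model you've pulled. The model tag below is just an example; swap in whatever fits your hardware.

```python
# Minimal sketch of talking to a locally hosted model via Ollama's REST API.
# Assumes Ollama is running on its default port and that a small coder model
# (the tag below is only an example) has already been pulled.
import requests

OLLAMA_URL = "http://localhost:11434/api/chat"


def ask_local_coder(prompt: str, model: str = "qwen2.5-coder:14b") -> str:
    """Send one chat turn to the local model and return its reply text."""
    payload = {
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
        "stream": False,  # wait for the full reply instead of streaming tokens
    }
    resp = requests.post(OLLAMA_URL, json=payload, timeout=300)
    resp.raise_for_status()
    return resp.json()["message"]["content"]


if __name__ == "__main__":
    print(ask_local_coder("Write a unit test for a function that parses ISO dates."))
```

Everything stays on your own box; the RooCode front end, MCP tools, and memory layer described above would sit on top of a call like this.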
Why don't you elaborate on your setup then?
Which MCPs do recommend?
You can also make this argument to varying degrees about your internet connection, cloud provider, OS vendor, etc.
This is why I run a set of rackmount servers at home, that have the media and apps that I want to consume. If my ISP bites the dust tomorrow, I've literally got years worth of music, books, tv, movies, etc. Hell, I even have a bunch of models on ollama, and an offline copy of wikipedia running (minus media, obv) via kiwix.
It's not off-grid, but that's the eventual dream/ goal.
I'm not the OP but:
* Not even counting cellular data carriers, I have a choice of at least five ISPs in my area. And if things get really bad, I can go down to my local library to politely encamp myself and use their WiFi.
* I've personally no need for a cloud provider, but I've spent a lot of time working on cloud-agnostic stuff. All the major cloud providers (and many of the minors) provide compute, storage (whether block, object, or relational), and network ingress and egress. As long as you don't deliberately tie yourself to the vendor-specific stuff, you're free to choose among all available providers.
* I run Linux. Enough said.
Well, you can’t really self-host your internet connection anyway :)
> You can also make this argument to varying degrees about your internet connection, cloud provider, OS vendor, etc.
True, but I think wanting to avoid yet another dependency is a good thing.
... search engine
I don't feel like being dependent on LLM coding tools is much of an issue, you can very easily switch between different vendors. And I hope that open weight models will be "good enough" until we get a monopoly. In any case, even if you are afraid of getting too dependent on AI tools, I think everyone needs to stay up to date on what is happening. Things are changing very quickly right now, so no matter what argument you may have against LLMs, it may just not be valid any more in a few months
> I think everyone needs to stay up to date on what is happening. Things are changing very quickly right now, so no matter what argument you may have against LLMs, it may just not be valid any more in a few months
This actually to me implies the opposite of what you’re saying here. Why bother relearning the state of the art every few months, versus waiting for things to stabilize on a set of easy-to-use tools?
We will have the equivalent of Claude Sonnet 4 in a local LLM that can run well on a modern Mac w/ 36+ GB of ram in a year or two. Maybe faster. The local/open models are developing very fast in terms of quantization and how well they can run on consumer hardware.
Folks that are running local LLMs every day now will probably say you can basically emulate at least Sonnet 3.7 for coding if you have a real AI workstation. Which may be true, but the time, effort, and cost involved are substantial.
With the Mac Studio you get 512 GB of unified memory (shared between CPU and GPU), this is enough to run some exciting models.
In 20 years, memory has doubled five times, a 32x increase.
That means we could have 16 TB memory in computers by 2045.
That could unlock a lot of possibilities, even if 1 TB turns out not to be enough by then (better architectures, more compact representations of data, etc. will also help).
Yeah, for £10,000. And you get 512GB of bandwidth starved memory.
Still, I suppose that's better than what nvidia has on offer atm (even if a rack of gpus gives you much, much higher memory throughput).
Memory scaling has all but stopped. Current RAM cells store a charge of just 40,000 or so electrons (that's when it's first written; it degrades from there until refreshed). Going smaller is almost impossible due to physics, noise, and the problem of needing to amplify that tiny charge into something usable.
For the past few years, we've been "getting smaller" by getting deeper. The diameter of the cell shrinks, but the depth of the cell goes up. As you can imagine, that doesn't scale very well: since a cylinder's volume goes with the square of its diameter, cutting the diameter in half means quadrupling the depth to keep the same volume.
If you try to put the cells closer together, you start to get quantum tunneling where electrons would disappear from one cell and appear in another cell altering charges in unexpected ways.
The times of massive memory shrinks are over. That means we have to reduce production costs and have more chips per computer or find a new kind of memory that is mass producible.
Good thing it's a competitive market with at least 5 serious, independent players.
That will work until a lot of infrastructure and third-party software has been built around one particular player.
See the Microsoft ecosystem as an example. Nothing they do could not be replicated, but the network effects they achieved are strong. Too much glue, and 3rd party systems, and also training, and what users are used to, and what workers you could hire are used to, now all point to the MS ecosystem.
In this early mass-AI-use phase you still can easily switch vendors, sure. Just like in the 1980s you could still choose some other OS or office suite (like Star Office - the basis for OpenOffice, Lotus, WordStar, WordPerfect) without paying that kind of ecosystem cost, because it did not exist yet.
Today too much infrastructure and software relies on the systems from one particular company to change easily, even if the competition were able to provide a better piece of software in one area.
Until they all merge, or form a cartel.
Good thing it's funded by generous investors or groups who are okay with losing money on every sale (they'll make it up in volume), and never stop funding, and never raise prices, insert ads or enshittify.
That's going full speed ahead though. Every major cloud provider has an AI offering, and there are now multiple AI-centric cloud providers. There is a lot of money and speculation. Now Nvidia has their own cloud offering that "democratize access to world-class AI infrastructure. Sovereign AI initiatives require a new standard for transparency and performance".
I can't run Google on my own computer, but I'm totally dependent on it.
Is your entire job returning google results?
The point being made here is that a developer that can only do their primary job of coding via a hosted LLM is entirely dependent on a third party.
I did code before Google, and I was fine. Yes, it's really convenient, and LLM would be even more convenient if I could trust it just a little bit more, but it's quite possible to do some effective software development without Google.
There are many alternatives though. It is not like Google has a search monopoly or office product monopoly, or e-mail provider monopoly. It is quite possible to cut out a lot of Google from one's life, and not even complicated to do that.
> To those of us who like Free Software because of the freedom it gives us, this is a severe regression.
It's fair to be worried about depending on an LLM. But I find the dependence on things like AWS or Azure more problematic, if we are talking about centralized and proprietary systems.
It's not like the code is suddenly elsewhere right? If the LLM disappears I'll be annoyed, not helpless.
Not if the only way you know how to code is vibe coding.
Well, I'd think of it like being car-dependent. Sure, plenty of suburbanites know how to walk, they still have feet, but they live somewhere that's designed to only be practically traversable by car. While you've lived that lifestyle, you may have gained weight and lost muscle mass, or developed an intolerance for discomfort to a point where it poses real problems. If you never got a car, or let yourself adapt to life without one, you have to work backwards from that constraint. Likewise with the built environment around us; the cities many people under the age of 40 consider to be "good" are the ones that didn't demolish themselves in the name of highways and automobiles, in which a car only rarely presents what we'd think of as useful technology.
There are all kinds of trades that the car person and the non-car person makes for better or worse depending on the circumstance. The non-car person may miss out on a hobby, or not know why road trips are neat, but they don't have the massive physical and financial liabilities that come with them. The car person meanwhile—in addition to the aforementioned issues—might forget how to grocery shop in smaller quantities, or engage with people out in the world because they just go from point A to B in their private vessel, but they may theoretically engage in more distant varied activities that the non-car person would have to plan for further in advance.
Taking the analogy a step further, each party gradually sets different standards for themselves that push the two archetypes into diametrically opposed positions. The non-car owner's life doesn't just not depend on cars, but is often actively made worse by their presence. For the car person, the presence of people, especially those who don't use a car, gradually becomes over-stimulating; cyclists feel like an imposition, people walking around could attack at any moment, even other cars become the enemy. I once knew someone who'd spent his whole life commuting by car, and when he took a new job downtown, had to confront the reality that not only had he never taken the train, he'd become afraid of taking it.
In this sense, the rise of LLMs does remind me of the rise of frontend frameworks, bootcamps that started with React or React Native, high-level languages, and even things like having great internet; the only people who ask what happens in a less ideal case are the ones who've either dealt with those constraints first-hand, or have tried to simulate them. If you've never been to the countryside, or a forest, or a hotel, you might never consider how your product responds in a poor-connectivity environment, and these are the people who wind up getting lost on basic hiking trails having assumed that their online map would produce relevant information and always be there.
Edit: To clarify, in the analogy, it's clear that cars are not intrinsically bad tools or worthwhile inventions, but had excitement for them been tempered during their rise in commodification and popularity, the feedback loops that ended up all but forcing people to use them in certain regions could have been broken more easily.
Best observation so far. Especially the cost side of using all those APIs ... I pay in dollars but earn in reais (Brazil); the cost scares me.
I think the same argument could be made about search engines. Most people are not too worried about them.
Maybe they should be.
You can run LLMs locally pretty easily, especially if you have a Mac (the unified memory architecture of Macs is really good at this). It's a niche thing but caring about Free Software is niche.
> Unless you can run the LLM locally, on a computer you own, you are now completely dependent on a remote centralized system to do your work.
To be fair, the entire internet is basically this already.
You think an LLM provider has a bigger moat than an IDE (say, pre-VS Code, for a better parallel)? MSDN and JetBrains licenses are far more expensive than Cursor or Windsurf.
The truth is that majority of people do not care about this. It's why AWS exists. It's why Fly.io exists.
I think 384gb of ram is surprisingly reasonable tbh.
200-300$/month are already 7k in 3 years.
And I do expect some hardware chip-based models in a few years, like a GPU: an AI PU where you can replace the hardware AI chip.
> I think 384gb of ram is surprisingly reasonable tbh.
> 200-300$/month are already 7k in 3 years.
Except at current crazy rates of improvement, cloud based models will in reality likely be ~50x better, and you'll still have the same system.
3 replies →
>the dependence.
Sure, but that is not the point of the article. LLMs are useful. The fact that you are dependent on someone else is a different problem, like being dependent on Microsoft for your office suite.
Even FOSS-based development depends on walled gardens; it's evident every time GitHub is down.
Sensibly hosted FOSS doesn't go to GitHub for hosting though. There are other options for people who care. I personally like Codeberg.
IMO Github doesn't matter for FOSS because you have a lot of local clones, it won't disappear forever if Github goes down or deletes the repo there. Self-hosted alts are not 100% up either. And I actually find collaboration functions / easy PR contribution on Github highly beneficial. At the same time I hate the friction of all those private Gitlabs, Giteas or, God forbid, gitweb.
1 reply →
Do you use GitHub? VS Code? GCP or AWS? The internet, perhaps? All work is dependent on other services; that's the modern world.
I disagree.
Self-hosting has always had a lot of drawbacks compared with commercial solutions. I bet my self-hosted file server has worse reliability than Google Drive, and my self-hosted git server handles fewer concurrent users than GitHub.
That's something you have to accept when self-hosting.
So when you self-host an LLM, you must either accept a drop in output quality or spend a small fortune on hardware.
Those aren’t good analogies because it costs nearly nothing to make that availability tradeoff and run things on your computer for your own fun.
Raspberry pi was a huge step forward, the move to LLMs is two steps back.
Wake up, you’re already dependent on everything, unless you stick exclusively to Python std and no outside batteries.
Maven central is gone and you have no proxy setup or your local cache is busted? Poof, you’re fucking gone, all your Springs, Daggers, Quarkuses and every third party crap that makes up your program is gone. Same applies to bazillion JS, Rust libraries.
There are multiple organizations with mirrors for packages, and I doubt if the cost of a mirror is the same as a cost of 384GB memory server.
A guy says here you need 4TB for a PyPI mirror, 285 GB for npm:
https://stackoverflow.com/questions/65995150/is-it-possible-...
If PyPI goes down and I cannot use NumPy, I can still roll out my own implementation of a linear algebra library, because I've got the required knowledge, and I've got it because I had to learn it instead of relying on LLMs.
1 reply →
Panamax works great for mirroring all of crates.io in 300-400GB, which is big but easily small enough for enthusiasts. I've got it on an external USB drive myself, and it's saved my bacon a few times.
We're not yet at that same point for performance of local LLM models afaict, though I do enjoy messing around with them.
One thing that I find truly amazing is just the simple fact that you can now be fuzzy with the input you give a computer, and get something meaningful in return. Like, as someone who grew up learning to code in the 90s it always seemed like science fiction that we'd get to a point where you could give a computer some vague human level instructions and get it more or less do what you want.
There's the old quote from Babbage:
> On two occasions I have been asked, 'Pray, Mr. Babbage, if you put into the machine wrong figures, will the right answers come out?' I am not able rightly to apprehend the kind of confusion of ideas that could provoke such a question.
This has been an obviously absurd question for two centuries now. Turns out the people asking that question were just visionaries ahead of their time.
It is kind of impressive how I'll ask for some code in the dumbest, vaguest, sometimes even wrong way, but so long as I have the proper context built up, I can get something pretty close to what I actually wanted. Though I still have problems where I can ask as precisely as possible and get things not even close to what I'm looking for.
> This has been an obviously absurd question for two centuries now. Turns out the people asking that question were just visionaries ahead of their time.
This is not the point of that Babbage quote, and no, LLMs have not solved it, because it cannot be solved, because "garbage in, garbage out" is a fundamental observation of the limits of logic itself, having more to with the laws of thermodynamics than it does with programming. The output of a logical process cannot be more accurate than the inputs to that process; you cannot conjure information out of the ether. The LLM isn't the logical process in this analogy, it's one of the inputs.
6 replies →
We wanted to check the clock at the wrong time but read the correct time. Since a broken clock is right twice a day, we broke the clock, which solves our problem some of the time!
3 replies →
It is fun to watch. I've sometimes indeed seen the LLM say something like "I'm assuming you meant [X]".
It's very impressive that I can type misheard song lyrics into Google, and yet still have the right song pop up.
But, having taken a chance to look at the raw queries people type into apps, I'm afraid neither machine nor human is going to make sense of a lot of it.
theseday,s i ofen donot correct my typos even wheni notice them while cahtting with LLMS. So far 0 issues.
We're talking about God function.
function God (any param you can think of) {
}
Well, you can enter 4-5 relatively vague keywords into Google and the first or second Stack Overflow link will probably provide plenty of relevant code. Given that, it's much less impressive, since >95% of the problems and queries just keep repeating.
How do you know the code is right?
14 replies →
Sure, you can now be fuzzy with the input you give to computers, but in return the computer will ALSO be fuzzy with the answer it gives back. That's the drawback of modern AI.
It can give back code though. It might be wrong, but it won’t be ambiguous.
4 replies →
It's mind blowing. At least 1-2x/week I find myself shocked that this is the reality we live in
Today I had a dentist appointment and the dentist suggested I switch toothpaste lines to see if something else works for my sensitivity better.
I am predisposed to canker sores, and if I use a toothpaste with SLS in it I'll get them. But a lot of the SLS-free toothpastes are new-age hippy stuff and are also fluoride-free.
I went to chatgpt and asked it to suggest a toothpaste that was both SLS free and had fluoride. Pretty simple ask right?
It came back with two suggestions. Its top suggestion had SLS; its backup suggestion lacked fluoride.
Yes, it is mind blowing the world we live in. Executives want to turn our code bases over to these tools
105 replies →
It’s surreal to me. I’ve been using ChatGPT every day for 2 years, and it makes me question reality sometimes, like ‘how tf did I live to see this in my lifetime’.
I’m only 39, really thought this was something reserved for the news on my hospital tv deathbed.
6 replies →
I find it sad how normalized it's become. Yes, the technology is imperfect in very meaningful ways. Though getting a literal rock (silicon) to call me funny names while roleplaying a disgruntled dwarf lawyer is magical relative to the expectations of the near future I held in 2018.
It's almost exactly one of the stories in Stanislaw Lem's The Cyberiad.
1 reply →
I remember the first time I played with GPT and thought “oh, this is fully different from the chatbots I played with growing up, this isn’t like anything else I’ve seen” (though I suppose it is implemented much like predictive text, but the difference in experience is that predictive text is usually wrong about what I’m about to say so it feels silly by comparison)
1 reply →
Been vibe coding for the past couple of months on a large project. My mind is truly blown. Every day it's just shocking. And it's so prolific. Half a million lines of code in a couple of months by one dev. Seriously.
Note that it's not going to solve everything. It's still not very precise in its output. Definitely lots of errors and bad design at the top end. But it's a LOT better than without vibe coding.
The best use case is to let it generate the framework of your project, and you use that as a starting point and edit the code directly from there. Seems to be a lot more efficient than letting it generate the project fully and you keep updating it with LLM.
7 replies →
[flagged]
10 replies →
You can be fuzzier than a soft fluff of cotton wool. I’ve had incredible success trying to find the name of an old TV show or specific episode using AIs. The hit rate is surprisingly good even when using the vaguest inputs.
“You know, that show in the 80s or 90s… maybe 2000s with the people that… did things and maybe didn’t do things.”
“You might be thinking of episode 11 of season 4 of such and such show where a key plot element was both doing and not doing things on the penalty of death”
See I try that sort of thing, like asking Gemini about a science fiction book I read in 5th grade that (IIRC) involved people living underground near/under a volcano, and food in pill form, and it immediately hallucinates a non-existent book by John Christopher named "The City Under the Volcano"
5 replies →
Wake me up when LLMs render the world a better place by simply prompting them "make me happy". Now that's gonna be a true win of fuzzy inputs!
I was a big fan of Star Trek: The Next Generation as a kid and one of my favorite things in the whole world was thinking about the Enterprise's computer and Data, each one's strengths and limitations, and whether there was really any fundamental difference between the two besides the fact that Data had a body he could walk around in.
The Enterprise computer was (usually) portrayed as fairly close to what we have now with today's "AI": it could synthesize, analyze, and summarize the entirety of Federation knowledge and perform actions on behalf of the user. This is what we are using LLMs for now. In general, the shipboard computer didn't hallucinate except during most of the numerous holodeck episodes. It could rewrite portions of its own code when the plot demanded it.
Data had, in theory, a personality. But that personality was basically, "acting like a pedantic robot." We are told he is able to grow intellectually and acquire skills, but with perfect memory and fine motor control, he can already basically "do" any human endeavor with a few milliseconds of research. Although things involving human emotion (art, comedy, love) he is pretty bad at and has to settle for sampling, distilling, and imitating thousands to millions of examples of human creation. (Not unlike "AI" art of today.)
Side notes about some of the dodgy writing:
A few early episodes of Star Trek: The Next Generation treated the Enterprise D computer as a semi-omniscient character, and it always bugged me, because it seemed to "know" things that it shouldn't and draw conclusions that it really shouldn't have been able to. "Hey computer, we're all about to die, solve the plot for us so we make it to next week's episode!" Thankfully someone got the memo and that only happened a few times. Although I always enjoyed episodes that centered around the ship or crew itself somehow instead of just another run-in with aliens.
The writers were always adamant that Data had no emotions (when not fitted with the emotion chip) but we heard him say things _all the time_ that were rooted in emotion, they were just not particularly strong emotions. And he claimed to not grasp humor, but quite often made faces reflecting the mood of the room or indicating he understood jokes made by other crew members.
ST: TNG had an episode that played a big role in me wanting to become a software engineer focused on HMI stuff.
It's the relatively crummy season 4 episode Identity Crisis, in which the Enterprise arrives at a planet to check up on an away team containing a college friend of Geordi's, only to find the place deserted. All they have to go on is a bodycam video from one of the away team members.
The centerpiece of the episode is an extended sequence of Geordi working in close collaboration with the Enterprise computer to analyze the footage and figure out what happened, which takes him from a touchscreen-and-keyboard workstation (where he interacts by voice, touch and typing) to the holodeck, where the interaction continues seamlessly. Eventually he and the computer figure out there's a seemingly invisible object casting a shadow in the reconstructed 3D scene and back-project a humanoid form and they figure out everyone's still around, just diseased and ... invisible.
I immediately loved that entire sequence as a child, it was so engrossingly geeky. I kept thinking about how the mixed-mode interaction would work, how to package and take all that state between different workstations and rooms, have it all go from 2D to 3D, etc. Great stuff.
3 replies →
>"Being a robot's great, but we don't have emotions and sometimes that makes me very sad".
From Futurama, in an obvious parody of how Data was portrayed.
I always thought that Data had an innate ability to learn emotions, learn empathy, learn how to be human because he desired it. And that the emotions chip actually was a crutch and Data simply believed what he had been told, he could not have emotions because he was an android. But, as you say, he clearly feels close to Geordi and cares about him. He is afraid if Spot is missing. He paints and creates music and art that reflects his experience. Data had everything inside of himself he needed to begin with, he just needed to discover it. Data, was an example to the rest of us. At least in TNG. In the movies he was a crazy person. But so was everyone else.
1 reply →
> The writers were always adamant that Data had no emotions... but quite often made faces reflecting the mood of the room or indicating he understood jokes made by other crew members.
This doesn't seem too different from how our current AI chatbots don't actually understand humor or have emotions, but can still explain a joke to you or generate text with a humorous tone if you ask them to based on samples, right?
> "Hey computer, we're all about to die, solve the plot for us so we make it to next week's episode!"
I'm curious, do you recall a specific episode or two that reflect what you feel boiled down to this?
Thanks, love this – it's something I've thought about as well!
It's a radical change in human/computer interface. Now, for many applications, it is much better to present the user with a simple chat window and allow them to type natural language into it, rather than ask them to learn a complex UI. I want to be able to say "Delete all the screenshots on my Desktop", instead of going into a terminal and typing "rm ~/Desktop/*.png".
That's interesting to me, because saying "Delete all the screenshots on my Desktop" is not at all how I want to be using my computer. When I'm getting breakfast, I don't instruct the banana to "peel yourself and leap into my mouth," then flop open my jaw like a guppy. I just grab it and eat it. I don't want to tell my computer to delete all the screenshots (except for this or that that particular one). I want to pull one aside, sweep my mouse over the others, and tap "delete" to vanish them.
There's a "speaking and interpreting instructions" vibe to your answer which is at odds with my desire for an interface that feels like an extension of my body. For the most part, I don't want English to be an intermediary between my intent and the computer. I want to do, not tell.
10 replies →
It’s very interesting to me that you chose deleting files as a thing you don’t mind being less precise about.
I personally can't see this example working out. I'll always want to get some kind of confirmation of which files will be deleted, and at that point, just typing the command out is much easier than reading.
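The most I'd accept is an agent that has to route the delete through a confirm-first helper, roughly like the sketch below. It's purely hypothetical; the "Screenshot*.png" pattern and the helper itself are assumptions, not anything a real agent exposes.

```python
from pathlib import Path

def delete_screenshots(desktop: Path = Path.home() / "Desktop") -> None:
    # Hypothetical helper an agent might call instead of running `rm` directly.
    # "Screenshot" here is an assumption: any PNG whose name starts with "Screenshot".
    targets = sorted(desktop.glob("Screenshot*.png"))
    if not targets:
        print("Nothing to delete.")
        return
    for p in targets:
        print(p.name)
    # Nothing is removed until the user has seen the list and said yes.
    if input(f"Delete these {len(targets)} files? [y/N] ").strip().lower() == "y":
        for p in targets:
            p.unlink()

if __name__ == "__main__":
    delete_screenshots()
```

And once I'm reading that list anyway, typing the glob myself is about as fast.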
1 reply →
> I want to be able to say "Delete all the screenshots on my Desktop", instead of going into a terminal and typing "rm ~/Desktop/*.png".
Both are valid cases, but one cannot replace the other—just like elevators and stairs. The presence of an elevator doesn't eliminate the need for stairs.
> I want to be able to say "Delete all the screenshots on my Desktop", instead of going into a terminal and typing "rm ~/Desktop/*.png".
But why? It takes many more characters to type :)
4 replies →
The junior will repeatedly ask the AI to delete the screenshots, until he forgets the command to delete a file.
The engineer will wonder why his desktop is filled with screenshots, change the settings that make it happen, and forget about it.
That behavior happened for years before AI, but AI will make the problem exponentially worse. Or I do hope that was a bad example.
2 replies →
This is why, even if LLMs top out right now, there will still be a radical shift in how we interact with and use software going forward. There are still at least 5 years of implementation ahead even if nothing advances at all anymore.
No one is ever going to want to touch a settings menu again.
3 replies →
For me this moment came when Google calendar first let you enter fuzzy text to get calendar events added, this was around 2011, I think. In any case, for the end user this can be made to happen even when the computer cannot actually handle fuzzy inputs (which is of course, how an LLM works).
The big change with LLMs seems to be that everyone now has an opinion on what programming/AI is and can do. I remember people behaving like that around stocks not that long ago…
> The big change with LLMs seems to be that everyone now has an opinion on what programming/AI is and can do
True, but I think this is just the zeitgeist. People today want to share their dumb opinions about any complex subject after they saw a 30 second reel.
2 replies →
Though I haven’t embraced LLM codegen (except for non-functional filler/test data), the fuzziness is why I like to use them as talking documentation. It makes for a lot less of fumbling around in the dark trying to figure out the magic combination of search keywords to surface the information needed, which can save a lot of time in aggregate.
Honestly, LLMs are a great canary for whether your documentation / language / whatever is "good" at all.
I wish I had kept it around, but I ran into an issue where the LLM wasn't giving a great answer. I looked at the documentation, and yeah, it made no sense. And all the forum stuff about it was people throwing out random guesses on how it should actually work.
If you're a company that makes something even moderately popular and LLMs are producing really bad answers, one of two things is happening:
1. You're a consulting company that makes its money by selling confused users solutions to your crappy product.
2. Your documentation is confusing crap.
1 reply →
I've just got good at reading code, because that's the one constant you can rely on (unless you're using some licensed library). So whenever the reference is not enough, I just jump straight to the code (one of my latest examples was finding out that opendoas (a sudo replacement) hard-codes the persist option for not re-asking the password to 5 minutes).
I literally pasted these two lines, sent to me by one of our sysadmins, into ChatGPT, and it told me exactly what I needed to know:
I use it like that all the time. In fact, I'm starting to give it less and less context and just toss stuff at it. It's a more efficient use of my time.
In my opinion, most of the problems we see now with LLMs come from being fuzzy... I'm used to getting very good code from Claude or Gemini (copy and paste without any changes that just works), but I have to be very specific; sometimes it takes longer to write the prompt than it would to write the code itself.
If I'm fuzzy, the output quality is usually low and I need several iterations before getting an acceptable result.
At some point in the future there will be some kind of formalization of how to ask SWE questions of LLMs... and we will get another programming language to rule them all :D
It invalidates this CinemaSins nitpick on Alien completely
https://youtu.be/dJtYDb7YaJ4?si=5NuoXaW0pkGoBSJu&t=76
To me this is the best thing about LLMs.
Computers finally work they way they were always supposed to work :)
But when I'm doing my job as a software developer, I don't want to be fuzzy. I want to be exact at telling the computer what to do, and for that, the most efficient way is still a programming language, not English. The only place where LLMs are an improvement is voice assistants. But voice assistants themselves are rather niche.
I want to be fuzzy and I want the LLM to generate something exact.
2 replies →
It can get you 80% of the way there, you can still be exacting in telling it where it went wrong or fine tuning the result by hand.
>simple fact that you can now be fuzzy with the input you give a computer, and get something meaningful in return
I got into this profession precisely because I wanted to give precise instructions to a machine and get exactly what I want. Worth reading Dijkstra, who anticipated this, and the foolishness of it, half a century ago
"Instead of regarding the obligation to use formal symbols as a burden, we should regard the convenience of using them as a privilege: thanks to them, school children can learn to do what in earlier days only genius could achieve. (This was evidently not understood by the author that wrote —in 1977— in the preface of a technical report that "even the standard symbols used for logical connectives have been avoided for the sake of clarity". The occurrence of that sentence suggests that the author's misunderstanding is not confined to him alone.) When all is said and told, the "naturalness" with which we use our native tongues boils down to the ease with which we can use them for making statements the nonsense of which is not obvious.[...]
It may be illuminating to try to imagine what would have happened if, right from the start our native tongue would have been the only vehicle for the input into and the output from our information processing equipment. My considered guess is that history would, in a sense, have repeated itself, and that computer science would consist mainly of the indeed black art how to bootstrap from there to a sufficiently well-defined formal system. We would need all the intellect in the world to get the interface narrow enough to be usable"
Welcome to prompt engineering and vibe coding in 2025, where you have to argue with your computer to produce a formal language that we invented in the first place so as not to have to argue in imprecise language.
https://www.cs.utexas.edu/~EWD/transcriptions/EWD06xx/EWD667...
right: we don't use programming languages instead of natural language simply to make it hard. For the same reason, we use a restricted dialect of natural language when writing math proofs -- using constrained languages reduces ambiguity and provides guardrails for understanding. It gives us some hope of understanding the behavior of systems and having confidence in their outputs
There are levels of this though -- there are few instances where you actually need formal correctness. For most software, the stakes just aren't that high, all you need is predictable behavior in the "happy path", and to be within some forgiving neighborhood of "correct".
That said, those championing AI have done a very poor job at communicating the value of constrained languages, instead preferring to parrot this (decades and decades and decades old) dream of "specify systems in natural language"
1 reply →
It sounds like you think I don't find value in using machines in their precise way, but that's not a correct assumption. I love code! I love the algorithms and data structures of data science. I also love driving 5-speed transmissions and shooting on analog film – but it isn't always what's needed in a particular context or for a particular problem. There are lots of areas where a 'good enough solution done quickly' is way more valuable than a 100% correct and predictable solution.
1 reply →
That’s interesting. I got into computing because unlike school where wrong answers gave you indelible red ink and teachers had only finite time for questions, computers were infinitely patient and forgiving. I could experiment, be wrong, and fix things. Yes I appreciated that I could calculate precise answers but it was much more about the process of getting to those answers in an environment that encouraged experimentation. Years later I get huge value from LLMs, where I can ask exceedingly dumb questions to an indefatigable if slightly scatterbrained teacher. If I were smart enough, like Dijkstra, to be right first time about everything, I’d probably find them less useful, but sadly I need cajoling along the way.
"I got into this profession precisely because I wanted to give precise instructions to a machine and get exactly what I want."
So you didn't get into this profession to be a lead then, eh?
Because essentially, that's what Thomas in the article is describing (even if he doesn't realize it). He is a mini-lead with a team of a few junior and lower-mid-level engineers - all represented by the LLMs and agents he's built.
1 reply →
The other side of the coin is that if you give it a precise input, it will fuzzily interpret it as something else that is easier to solve.
Well said, these things are actually in a tradeoff with each other. I feel like a lot of people somehow imagine that you could have the best of both, which is incoherent short of mind-reading + already having clear ideas in the first place.
But thankfully we do have feedback/interactiveness to get around the downsides.
When you have a precise input, why give it to an LLM? When I have to do arithmetic, I use a calculator. I don't ask my coworker, who is generally pretty good at arithmetic, although I'd get the right answer 98% of the time. Instead, I use my coworker for questions that are less completely specified.
Also, if it's an important piece of arithmetic, and I'm in a position where I need to ask my coworker rather than do it myself, I'd expect my coworker (and my AI) to grab (spawn) a calculator, too.
It will, or it might? Because if every time you use an LLM it misinterprets your input as something easier to solve, you might want to brush up on the fundamentals of the tool.
(I see some people are quite upset with the idea of having to mean what you say, but that's something that serves you well when interacting with people, LLMs, and even when programming computers.)
10 replies →
[dead]
>On two occasions, I have been asked [by members of Parliament], 'Pray, Mr. Babbage, if you put into the machine wrong figures, will the right answers come out?' I am not able to rightly apprehend the kind of confusion of ideas that could provoke such a question. - Charles Babbage
This quote did not age well
Now with LLMs, you can put in the right figures and the wrong answers might come out.
not if you consider how confused our ideas are today
If anything we now need to unlearn the rigidity - being too formal can make the AI overly focused on certain aspects, and is in general poor UX. You can always tell legacy man-made code because it is extremely inflexible and requires the user to know terminology and usage implicitly lest it break, hard.
For once, as developers we are actually using computers how normal people always wished they worked and were turned away frustratedly. We now need to blend our precise formal approach with these capabilities to make it all actually work the way it always should have.
I'm not a skeptic, but I keep LLMs on a short leash.
This is a thoughtful article. Thanks `tptacek
My LLM use is: 1 - tedious stuff; web pages interacting with domain back end. 2 - domain discovery.
In a recent adventure, I used Claude 4 to tease out parameters in a large graph schema. This is a combination of tedium and domain discovery (it's not my graph and I'm not a domain expert). In the first day, Claude uncovered attributes and relations no other LLM or Google search uncovered. And it worked!! The next day, I allowed it to continue. After a bit, results didn't pass the sniff test.
I checked into the details of Claude's thinking: it had decided to start making up schema attributes and inventing fallback queries on error with more made-up attributes. It was "conscious" of its decision to do so. By the time I caught this, Claude had polluted quite a bit of code. Sure, plenty of well-placed git commits helped in rolling back code, but it's not quite that simple: over the many git commits were sprinkled plenty of learnings I don't want to toss. It took another two days of carefully going through the code to pull out the good stuff and then roll things back. So now I'm at day five of this adventure with cleaned-up code and notes on what we learned.
I suspect continual improvements on tooling will help. Until then, it's a short leash.
Domain discovery is so underrated. LLMs remove so much friction that makes everything so incredibly accessible.
If LLMs couldn't do anything else then that alone would still warrant an invention of a century sticker.
Couldn't agree more. The first time I used Claude Code was for something very much like this. We had a PDF rendering issue with Unicode characters in one of our libraries. We ultimately needed to implement a sort of bespoke font fallback system.
With the help of the agent, I was able to iterate through several potential approaches and find the gaps and limitations within the space of an afternoon. By the time we got to the end of that process the LLM wrote up a nice doc of notes on the experiments, and *I* knew what I wanted to do next. Knowing that, I was able to give a more detailed and specific prompt to Claude which then scaffolded out a solution. I spent probably another day tweaking, testing, and cleaning up.
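To give a sense of the shape of the thing: the core of a fallback like that boils down to something in this neighbourhood. This is a from-memory sketch using fontTools, with made-up font file names; it is not the code we actually shipped.

```python
from functools import lru_cache
from fontTools.ttLib import TTFont

# Hypothetical priority list; the real project's fonts aren't named here.
FALLBACK_FONTS = ("NotoSans-Regular.ttf", "NotoSansCJKsc-Regular.otf", "NotoSansSymbols-Regular.ttf")

@lru_cache(maxsize=None)
def _cmap(path: str) -> dict:
    # The cmap table maps Unicode codepoints to glyph names for a given font.
    return TTFont(path)["cmap"].getBestCmap()

def pick_font(char: str, fonts=FALLBACK_FONTS):
    """Return the first font in the priority list that has a glyph for `char`, else None."""
    for path in fonts:
        if ord(char) in _cmap(path):
            return path
    return None  # caller decides what to do about tofu

def split_runs(text: str):
    """Group consecutive characters that resolve to the same font, so the PDF
    renderer can emit one text run per font instead of one per character."""
    runs, current_font, buf = [], None, []
    for ch in text:
        font = pick_font(ch)
        if font != current_font and buf:
            runs.append((current_font, "".join(buf)))
            buf = []
        current_font, buf = font, buf + [ch]
    if buf:
        runs.append((current_font, "".join(buf)))
    return runs
```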
Overall I think it's completely fair to say that Claude saved me a week of dev time on this particular task. The amount of reading and learning and iterating I'd have had to do to get the same result would have just taken 3-4 days of work. (not to mention the number of hours I might have wasted when I got stuck and scrolled HN for an hour or whatever).
So it still needed my discernment and guidance - but there's no question that I moved through the process much quicker than I would have unassisted.
That's worth the $8 in API credit ten times over and no amount of parroting the "stochastic parrot" phrase (see what I did there?) would change my mind.
13 replies →
> Domain discovery is so underrated. LLMs remove so much friction that makes everything so incredibly accessible.
And, unfortunately, a lot of friction from not having access to information in the first place. I've read a bunch of docs from people talking to Glean in order to explore a new topic; if it's a topic I'm actually very familiar with then four out of five times it is somewhere from misleading to catastrophically wrong. Any internal terminology that doesn't match the common usage outside of our organization poisons the whole session and it will make up things to join the meanings together, and the prompter is none the wiser.
I trust AI only as a gap filler in domains that I'm already an expert in or where there's little internal context, anything else is intellectual suicide.
1 reply →
No doubt. The tedious stuff is just a time saver. The domain discovery is often the difference between making progress and giving up.
One question is whether, even after all that backpedaling, you feel you could've achieved the same or a similar result in those five days. My findings have been that it's a net plus for productivity, but I'm a bit less sure whether I prefer the way work feels when a lot of it is just going back and cleaning up after the growth. (Of course, that sounds like a familiar statement for a lot of engineers before LLMs, too.)
This is why agents suck.
Backpedaling is a massive inefficiency.
A better way is the single clean step approach.
Use the largest LLM you can. Have it generate a single output for one update.
If that update has logical errors or dropped anything you asked for, restart, refine, and narrow until it does.
It's quite hard to plan each step right, but the level and complexity you can get to is far higher than with an agent.
Agents are much better at shallow/broad problems.
Large LLMs are exponentially better at deep/narrow problems.
Yeah I'm impressed with its ability to do stuff, but not quite with its results. We have been working on more AI assistance adoption so I asked it to do some decently complex things with json/yml schema definitions and validations (outside the typical json schema we wanted things like conditional validation, etc)... It wrote a LOT of code and took a long time, and kept telling me it would work, and it didn't. I finally stepped in and eliminated roughly 75% of the code in about 10 minutes and got it working. It's great at tedious stuff, but outside of that, I'm skeptical.
IMO, you just noted it’s great at creating tedious (but pointless) stuff?
I use AI every day, basically as a "pair coder."
I used it about 15 minutes ago, to help me diagnose a UI issue I was having. It gave me an answer that I would have figured out, in about 30 minutes, in about 30 seconds. My coding style (large files, with multiple classes, well-documented) works well for AI. I can literally dump the entire file into the prompt, and it can scan it in milliseconds.
I also use it to help me learn about new stuff, and the "proper" way to do things.
Basically, what I used to use Stack Overflow for, but without the sneering, and with much faster turnaround. I'm not afraid to ask "stupid" questions; that is critical.
Like SO, I have to take what it gives me, with a grain of salt. It's usually too verbose, and doesn't always match my style, so I end up doing a lot of refactoring. It can also give rather "naive" answers, that I can refine. The important thing, is that I usually get something that works, so I can walk it back, and figure out a better way.
I also won't add code to my project, that I don't understand, and the refactoring helps me, there.
I have found the best help comes from ChatGPT. I heard that Claude was supposed to be better, but I haven't seen that.
I don't use agents. I've not really ever found automated pipelines to be useful, in my case, and that's sort of what agents would do for me. I may change my mind on that, as I learn more.
I use it as a SO stand in as well.
What I like about Chatbots vs SO is the ability to keep a running conversation instead of 3+ tabs and tuning the specificity toward my problem.
I've also noticed that if I look up the same question on SO, I often find the source code the LLM copied. My fear is that if chatbots kill SO, where will the LLMs' copied code come from in the future?
I use Perplexity as my daily driver and it seems to be pretty good at piecing together the path forward from documentation as it has that built-in web search when you ask a question. Hopefully LLMs go more in that direction and less in the SO copy-paste direction, sidestepping the ouroboros issue.
Not a dev. SO done for then? It's been an important part of history.
5 replies →
That's an issue. It will likely turn into a Worm Ouroboros.
There's usually some "iteration," with ChatGPT giving me deprecated APIs and whatnot.
>>I'm not afraid to ask "stupid" questions -That is critical.
AI won't judge and shame you in front of the whole world for asking stupid questions, or for not RTFM'ing well enough, like Stack Overflow users do. Nor will it tell you your questions are irrelevant.
I think this is the most killer AI feature ever.
I’ve always worked that way. In school (or in seminars), I ask questions that may have the whole room in stitches, but I always learn the lesson. The worst teacher I ever had, was a genius calculus professor, who would harangue you in front of the class, for asking a “stupid” question. That’s the only class I ever took an Incomplete.
That’s the one thing about SO that I always found infuriating. It seems their favorite shade, is inferring that you’re “lazy,” and shaming you for not already having the answer. If anyone has ever looked at my code, “lazy” is probably not a word that springs to mind.
In most cases, I could definitely get the answer, myself, but it would take a while, and getting pointers might save me hours. I just need a hint, so that I can work out an answer.
With SO, I usually just bit my tongue, and accepted the slap, as well as the answer.
An LLM can actually look at a large block of code, and determine some boneheaded typo I made. That’s exactly what it did, yesterday. I just dumped my entire file into it, and said “I am bereft of clue. Do you have any idea why the tab items aren’t enabling properly?”. It then said “Yes, it’s because you didn’t propagate the tag from the wrapper into the custom view, here.” It not only pointed out the source error, but also explained how it resulted in the observed symptoms.
In a few seconds, it not only analyzed, but understood an entire 500-line view controller source file, and saw my mistake, which was just failing to do one extra step in an initializer.
There’s absolutely no way that I could have asked that question on SO. It would have been closed down, immediately. Instead, I had the answer in ten seconds.
I do think that LLMs are likely to “train” us to not “think things through,” but they said the same thing about using calculators. Calculators just freed us up to think about more important stuff. I am not so good at arithmetic, these days, but I no longer need to be. It’s like Machine Code. I learned it, but don’t miss it.
3 replies →
Agree on the verbosity and occasional naivety. But the fact that it gives working starting points is what really moves the needle. It gets me unstuck faster, and I still get to do the creative, architectural stuff
Yup.
I’ll ask it how to accomplish some task that I’ve not done before, and it will give me a working solution. It won’t necessarily be a good solution, but it will work.
I can then figure out how it got there, and maybe determine a more effective/efficient manner.
The reaction to this article is interesting. I have found AI to be useful in software contexts that most people never exercise or expect based on their intuitions of what an LLM can do.
For me, a highly productive but boring use of LLMs for code is that they excel at providing midwit “best practice” solutions to common problems. They are better documentation than the documentation and can do a lot of leg work e.g. Linux syscall implementation details. My application domains tend to require more sophisticated solutions than an LLM can provide but they still save a lot of rote effort. A lot of software development exists almost entirely in the midwit zone.
Much more interesting, they are decent at reducing concepts in literature to code practice for which there are no code examples. Google and StackOverflow turn up nothing. For example, I’ve found them useful for generating specialized implementations of non-Euclidean computational geometry algorithms that don’t really exist in the wild that I’ve ever seen. This is a big win, it literally turns months of effort into hours of effort.
On the other hand, I do a lot of work with algorithms that don’t exist in literature, never mind public code, with extremely performance-engineered implementations. There is an important take away from this too: LLMs are hilariously bad at helping with this but so are human software developers if required to do the same thing with no context.
Knowledge for which there is little or no training data is currently a formidable moat, both for LLMs and humans.
I agree on the distillation step. They're very good at identifying what problem you're trying to solve, especially if you don't have much experience in the domain. I've found ChatGPT great for figuring out problems involving combinatorics. And then you can be canny and ask the LLM to write the next prompt to get some boilerplate code. I like to do this with ChatGPT 4.5 (heavily rate limited) targeting 4o.
Getting good code is still a pain with chat, but you can go a fair way by reducing the context as much as possible and only asking for core functions (no point wasting hundreds of tokens on an arg parser). I'd like to play with Cursor or Zed a bit more.
Debugging sometimes works. 4.5 often figures out errors in one, 4o often struggles and spits out suggestions like it's a customer support agent (it also falls into repetition fairly often if you tell it "no, so what else could it be?"). You really have to work to convince it up front that you're not an idiot.
The moat isn't intelligence, it's context
What? This is exactly what I would expect an LLM to be good at. Stack Overflow gives common solutions to common problems. LLMs give niche solutions to problems that aren't common enough for extensive discussion, but are still already solved if you find the right repository or study the documentation thoroughly.
Neither helps with the hardest and most time consuming problems faced by software engineers: solving problems that have never been (publicly) solved before, making hard decisions based on intuition about what will be necessary in the future, understanding the interactions between broad webs of dependency in large codebases, and working with proprietary domain knowledge.
It's fascinating how over the past year we have had almost daily posts like this one, yet from the outside everything looks exactly the same. Isn't that very weird?
Why haven't we seen an explosion of new start-ups, products, or features? Why do we still see hundreds of bug tickets on every issue tracking page? Have you noticed anything different in any changelog?
I invite tptacek, or any other chatbot enthusiast around, to publish project metrics and show some actual numbers.
"Why haven't we seen an explosion of new start-ups, products or features?"
You're posting this question on a forum hosted by YC. Here's a story from March 2024: "YC’s latest W24 batch includes 240 companies. A significant portion of the companies have some AI component, with 63% tagged as “Artificial Intelligence” — a notable increase from 51% in the preceding S23 batch and 29% before that.". https://jamesin.substack.com/p/analysis-of-ycs-latest-w24-ba...
I've not seen the same analysis for more recent batches.
I don't think that refutes the parent's point. So many AI companies, but where are the companies _using_ the AI?
3 replies →
Sorry I don't follow, would you mind clarifying your point?
7 replies →
Most likely there’s a slight productivity increase.
The enthusiasts have a cognitive dissonance because they are pretty sure this is huge and we’re living in the future, so they go through various denial strategies when the execs ask them where the money is.
In this case it’s blame. These darned skeptics are ruining it for everyone.
This is an important question. The skepticism tracks with my personal experience - I feel 10-20% more productive but certainly not 5x when measured over a long period of time (say, the last 6 months or more)
I’m nonetheless willing to be patient and see how it plays out. If I’m skeptical about some grandiose claims I must also be equally skeptical and accepting about the possibility of large scale effects happening but not being apparent to me yet.
There were many similar transformations in recent decades. I remember the first Windows with a true graphical user interface was a big WOW: a productivity boost, you could have all those windows and programs running at the same time! Compare it with DOS, where you normally had just one active user-facing process.
> Why haven't we seen an explosion of new start-ups, products or features? Why do we still see hundreds of bug tickets on every issue tracking page? Have you noticed anything different on any changelog?
In my personal experience (LLM and code suggestion only) it's because I use LLMs to code unimportant stuff. Actually thinking what I want to do with the business code is exhausting and I'd rather play a little with a fun project. Also, the unit tests that LLMs can now write (and which were too expensive to write myself) were never important to begin with.
Simply put, if we’re living during such a major technological revolution, why does using software suck in such disastrous ways that were unthinkable even ten years ago?
Your argument relies on the idea of an "actual product". What is happening—and I'm seeing it firsthand both in my company's codebase and in my personal projects—is that AI is contributing more and more to product development. If this trend continues, we may reach a point where 90% of a product is written by AI.
At that stage, the real value will lie in the remaining 10%—the part that requires human judgment, creativity, or architectural thinking. The rest will be seen as routine: simple instructions, redundant CRUD operations, boilerplate, and glue code.
If we focus only on the end result, humans will inevitably write less code overall. And writing less code means fewer programming jobs.
You said a bunch without saying much. It also doesn't track. If the majority of AI work is supposed to be done by agents capable of doing the entire process, including making PRs, then why isn't there an explosion of such PRs on a large number of open source projects? Even more so, why am I not seeing these PRs on AI-related open source projects? To target it even more directly, why am I not seeing hints of this being applied on code agent repositories?
Call me naive, but you'd think these projects specifically would want to demonstrate how well their product works, making an effort to distinguish PRs that are largely the work of their own agents. Yet I am not seeing that.
I have no doubt that people find use in some aspects of these tools. Though I personally more subscribe to the interactive rubber ducky usage of them. But 90% from where I am standing seems like a very, very far way off.
6 replies →
i don't know - I agree we haven't seen changes to our built environment, but as for an "explosion of new start-ups, products" we sort of are seeing that?
I see new AI assisted products everyday, and a lot of them have real usage. Beyond the code-assistants/gen companies which are very real examples, here's an anecdote.
I was thinking of writing a new story, and found http://sudowrite.com/ via an ad, an AI assistant for helping you write; it's already used by a ton of journalists and serious writers, and I'm trying it out.
Then I wanted to plan a trip; I tried Google but saw nothing useful, then asked ChatGPT and now have a clear plan.
> I was thinking of writing a new story, and found http://sudowrite.com/ via an ad, an ai assistant for helping you write, its already used by a ton of journalists and serious writers, and am trying it out.
I am not seeing anything indicating it is actually used by a ton of journalists and serious writers. I highly doubt it is; the FAQ is also paper-thin as far as substance goes. I highly doubt they are training/hosting their own models, yet I see only vague third-party references in their privacy policy. Their pricing is less than transparent, given that they don't really explain how their "credits" translate to actual usage. They blatantly advertise this to be for students, which is problematic in itself.
This ignores all the other issues around so heavily depending on LLMs for your writing. This is an interesting quirk for starters: https://www.theguardian.com/technology/2024/apr/16/techscape... . But there are many more issues about relying so heavily on LLM tools for writing.
So this example, to me, is actually exemplifying the issue of overselling capabilities while handwaving away any potential issues that is so prevalent in the AI space.
4 replies →
Maybe because these companies are smaller and fly under the radar. They require less funding, team size is small, and they're probably bankrolled by the founders.
At least that's what I do and what I see among friends.
So as an example of what this could look like that would be convincing to me. I started out pretty firmly believing that Rust was a fad.
Then Mozilla and Google did things with it that I did not think were possible for them to do. Not "they wrote a bunch of code with it", stuff like "they eliminated an entire class of bugs from a section of their codebase."
Then I watched a bunch of essentially hobby developers write kernel drivers for brand new architecture, and watched them turn brand new Macbooks into one of the best-in-class ways to run Linux. I do not believe they could have done that with their resources at that speed, using C or C++.
And at that point, you kind of begrudgingly say, "okay, I don't know if I like this, but fine, heck you, whatever. I guess it might genuinely redefine some parts of software development, you win."
So this is not impossible. You can convince devs like me that your tools are real and they work.
And frankly, there are a billion problems in modern computing that are high impact - stuff like Gnome accessibility, competitive browser engines, FOSS UX, collaboration tools. Entire demographics who have serious problems that could be solved by software if there was enough expertise and time and there were resources to solve them. Often, the issue at play is that there is no intersection between people who are very well acquainted with those communities and understand their needs, and people who have experience writing software.
In theory, LLMs help solve this. In theory. If you're a good programmer, and suddenly you have a tool that makes you 4x as productive as a developer: you could have a very serious impact on a lot of communities right now. I have not seen it happen. Not in the enterprise world, but also not in the FOSS world, not in communities with lower technical resources, not in the public sector. And again, I can be convinced by this, I have dismissed tools that I later switched opinions on because I saw the impact and I couldn't ignore the impact: Rust, NodeJS, Flatpak, etc, etc.
The problem is people have been telling me that Coding Assistants (and now Coding Agents) are one of those tools for multiple years now, and I'm still waiting to see the impact. I'm not waiting to see how many companies pick them up, I'm not waiting to see the job market. I'm waiting to see if this means that real stuff starts getting written at a higher quality significantly faster, and I don't see it.
I see a lot of individual devs showing me hobby projects, and a lot of AI startups, and... frankly, not much else.
and builder.ai just filed for bankruptcy after a billion dollar valuation. Timely.
One of our AI-enabled internal projects is moving ~135x faster than before. Of course you can't perfectly compare. New framework, better insights, updated libraries, etc.
Bless your metric.
If you end up finishing it in 6 months, are you going to revise that estimate, or celebrate the fact that you don't need to wait until 2092 to use the project?
2 replies →
My main concern is not even mentioned in this article and there are hardly any comments here addressing it: Privacy / allowing 3rd parties to read and potentially train on your proprietary source code.
I've used LLMs to crank out code for tedious things (like generating C-APIs and calling into poorly documented libraries) but I'm not letting them touch my code until I can run it 100% locally offline. Would love to use the agentic stuff but from what I've heard it's still too slow to run on a high end workstation with a single 4080.
Or have things got better lately, and crucially is there good VisualStudio integration for running local agents / LLMs?
The problem is, it's like ads. Even if it's a paid subscription, the incentive to hoover up your code, metadata about your code or everything they can get away with is just too strong. I think it will just happen at some point, unless you run them 100% locally or hosted somewhere but self managed.
Yes and even if they were squeaky clean and didn't do anything bad with your code, there's also the possibility that they've been hacked and code is getting siphoned off somewhere.
If you're big enough, you pay for on-prem frontier level tools. If not, you can do all right with a good GPU and local model, but it's not the same. And yeah there are plenty of plugins like Continue that support local or on-prem hosted models.
Do you store code on GitHub? If so, how is GH's guarantee to not use your code different from Cursor's (with privacy mode enabled)?
No I definitely don't use GitHub. Everything is entirely in-house.
But even if I did, there's a much more solid foundation of trust there, whereas these AI companies have been very shady with their 'better to ask forgiveness than permission' attitudes of late.
2 replies →
Cursor has no privacy mode whatsoever. I have been able to exfiltrate just about anything from it.
Do you commit your client code to Github?
Cursor has "privacy mode" in Pro mode that supposedly prevents training on your code.
So would you be interested in a simple serverless service that lets you run LLMs and is focused on privacy and security first, instead of trying to run it on an expensive workstation with a single 4080?
> but I'm not letting them touch my code until I can run it 100% locally offline
You can do this already with Ollama, RooCode, and a Docker compatible container engine.
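Once Ollama is serving a model, nothing has to leave your machine; here's a minimal sketch of talking to it from Python, assuming Ollama's default local API on port 11434 (the model name is just an example of something you'd have pulled locally).

```python
import json
import urllib.request

def ask_local_llm(prompt: str, model: str = "qwen2.5-coder") -> str:
    # Ollama exposes a local HTTP API on port 11434 by default; the request never
    # leaves localhost. The model name is an example, not a requirement.
    payload = json.dumps({
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
        "stream": False,
    }).encode()
    req = urllib.request.Request(
        "http://localhost:11434/api/chat",
        data=payload,
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req) as resp:
        return json.loads(resp.read())["message"]["content"]

if __name__ == "__main__":
    print(ask_local_llm("Explain what this regex does: ^\\d{4}-\\d{2}-\\d{2}$"))
```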
Honestly is your code that revolutionary? Let's say I steal it, what can I do with it?
I’m AI neutral but the writing style here is pretty dismissive, and - to match the tone of the article - annoying as fuck.
Most completely reasonable objections to LLMs were totally dismissed.
Seems like very common AI hype post style.
"If you aren't using this AI feature you are left behind."
I usually block people for posting like that.
Right? It's gripping reading about the author's insecurities, but fails to say anything useful about AI.
Yeah, this was one of the weakest pieces of writing on GenAI that I've ever read, and I've read a lot of it.
We should flat-out dismiss the issue of licenses and copyright violations because... all developers pirate TV shows?
Seriously... what a load of crap.
All these smart people are dumber than me.
I love LLMs, and I really like programming with Cursor, but I never managed to get the "agents with tons of stuff in their context" mode to work for me. I use Cursor like a glorified code completer, 4-5 lines at a time, because otherwise the LLM just makes too many mistakes that compound.
If you let it run in the "write my code for me" mode, and ask it to fix some mistake it made, it will always add more code, never remove any. In my experience, in the end the code just ends up so brittle that the LLM will soon get stuck at a point that it never manages to overcome some mistake, no matter how many times it tries.
Has anyone managed to solve this?
Not a full solution, but one thing I've learned not to do is tell Cursor "you got that wrong, fix it like this". Instead, I go back to the previous prompt and click "Restore Checkpoint", edit the prompt and possibly the Cursor rules to steer it in the right direction.
When the model has the wrong solution in its context, it will use it when generating new code, and my feeling is that it doesn't handle the idea of "negative example" very well. Instead, delete the bad code and give it positive examples of the right approach.
Make sure you include lots of files of context (including any type definitions!). After a big change, before approving any code, ask it: "Is this the simplest and cleanest approach?" Usually it will make more revisions and actually produce clean code then. You can also say that in the original prompt, or specify the structure of the change you want it to make.
Oh nice, I'll try the simple thing, thanks!
No. And I don't think they are doing anything magical. Performance drops sharply after 50k tokens. Your LLM does best when you have a short 2-5k-token context.
IMO Cursor's small context window on the $20 plan is what kills their agent mode. Try Claude Code or anything that lets you use larger context windows (I think Cursor has a tier you can pay for now?). Claude Code regularly deletes and corrects code; LLMs are very much capable of that today.
I would love to know as well. I also have problems with the LLM coding agents when it comes to lesser-known languages like Julia. Has anybody solved this yet?
In the last two weeks I've started using Cursor in the 'you do everything, I'm just your boss' mode, to see how far I can push it. Right at this minute, I'm working on something that's become pretty big, but I'm constantly on the verge of just going back to writing code like normal LOL.
That said, I'm super impressed by how quickly I've been able to get this far with something pretty niche and complex in places.
Here's what I've learned. There are a million AI bros on YouTube who have the ultimate solution, but their solutions all boil down to a few basic things.
Make rules: make them before you get started and continue updating them as you go (see the sketch at the end of this comment).
Lots of tiny modules: push things into small, bite-sized bits of code with lots of interface documentation. This feels a bit unnatural when the code is prototype quality.
Documentation is key: the YouTubers will often create detailed planning and specification documents in advance. I've done this and it's hit and miss; what works for me is explaining what I want to build to an LLM, having it create extremely concise documentation plus a rough implementation checklist, and then evolving those documents in Cursor as I go.
This leads to a plan --> document, implement --> document, run tests --> document workflow on each reasonable chunk of the design, which has been the smoothest approach.
Don't let cursor ever see inside big datasets or huge folder trees. In fact keep cursor in a few folders writing source code and nothing else. To do this, early on build tools that can go and get information or make safe edits for cursor from those datasets without it attempting direct access.
The current project has tools for working with the primary datasets, plus a job manager, a process manager, a log manager, and a workflow manager, all of which have functions for querying. Cursor is instructed to use these. It naturally doesn't want to, but if you tell it, 7 times out of 10 it will :)
No mess: Cursor likes to make lots of random tests and processes along the way. Instruct it to use a /debug/tests/ folder and wipe it clean often. Force it to make 'production' code by having it registered with the workflow manager and made allowable by the process manager. This lets it play around for a while and get something up and running with its weird little scripts, then implement it for real elsewhere using the proper framework. The workflow manager needs documentation on how the script is used, and the process manager needs the source to be in a particular place with a particular set of standard interfaces.
You might say this is a lot of plumbing, but what isn't these days, and it's not like I'm maintaining it - right ;)
Cursor is always about to go insane any minute, or can't remember what's going on. So most of it is about keeping things simple enough for it to focus on the bit it's working on and nothing else.
Regularly for work we have massive code bases written in our company style and built using our business processes. This kind of thing is absolutely not the place for Cursor. I can imagine a company set up from scratch to use Cursor, but our current products and practices would just make it impossibly time-consuming.
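To make the 'rules' point concrete, here's a rough sketch of the kind of project rules file I mean. The paths and the manager names are placeholders for whatever your project actually has; newer Cursor versions read rules from .cursor/rules/, older ones from a .cursorrules file:

    # Project rules (illustrative sketch)
    - Only write source code under src/; never open or edit anything under data/ or logs/.
    - To inspect datasets, use tools/query_dataset.py; never read the raw files directly.
    - Throwaway experiments go in debug/tests/ and may be wiped at any time.
    - 'Production' scripts must be registered with the workflow manager and expose the
      standard interfaces expected by the process manager.
    - Keep modules small; every public function gets a short docstring describing
      inputs and outputs.
    - After each chunk of work, update the concise design doc and the implementation checklist.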
> Does an intern cost $20/month? Because that’s what Cursor.ai costs.
> Part of being a senior developer is making less-able coders productive, be they fleshly or algebraic. Using agents well is both a skill and an engineering project all its own, of prompts, indices, and (especially) tooling. LLMs only produce shitty code if you let them.
A junior developer often has negative value to a team, because they're sapping the time of more senior developers who have to help train them, review code, fix mistakes, etc. It can take a long while to break even.
The raw cost of Cursor's subscription is surely dwarfed by your own efforts, given that description. The actual calculus here should be the cost to corral Cursor against the value of the code it generated.
So what is the end game here: we don't have any juniors? That's what we're celebrating? This whole thing makes no damn sense. We're just becoming more efficient at producing bad code at scale, which will be maintained by a workforce that is growing less proficient at maintenance as AI adoption increases. Skills will decline, and eventually we'll be living with more buggy software, maintained by underpaid, unappreciated and overwhelmed greybeards, and it will be too late to turn back.
I had a call with a recruiter the other day, who told me that they had simply stopped hiring juniors at all, when not too long ago they would be camping out at colleges on graduation day to snipe fresh graduates.
This is going to be a problem real soon. There needs to be a realistic career path for software developers, or the entire field is at risk.
This. No more juniors, and the skills of the seniors are going to atrophy when all they're doing is reviewing and 'moving around' code an LLM spat out.
I don't think that's necessarily true. Some companies and seniors certainly invest in training and mentoring their juniors, and this can take a long time and be expensive. But there are a lot of places that offer 3-6 month internships for people barely out of high school. Others heavily rely on overseas contractors. So in a lot of scenarios it does seem possible to use less capable labor in a beneficial way for software engineering.
Hundreds of comments. Some say LLMs are the future. Others say they don't work today and they won't work tomorrow.
Video game speedrunning has this problem solved. Livestream your 10x-engineer LLM usage, with a git commit annotated with its prompt per change. Then everyone will see the result.
This doesn't seem like an area of debate. No complicated diagrams required. Just run the experiment and show the result.
I literally do this about twice a week on Twitch. I’m working on some hobby projects, and one constraint I’ve put on myself is to use LLMs for pretty much everything, regardless of whether it saves me time or not. The reason is twofold: I want to learn how to use them efficiently, and I want to constantly test the waters to see where the limits of their capabilities are. On my stream, you can see LLMs fail spectacularly one time and do hours of work in minutes another time.
I’m not alone in this - there are tons of other examples of people showing how they use LLMs online; you just need to search for them.
Thanks. I'll give them a watch.
Agreed.
The article provides zero measurement, zero examples, zero numbers.
It's pure conjecture with no data or experiment to back it up. Unfortunately conjecture rises to the top on hackernews. A well built study on LLM effectiveness would fall off the front page quickly.
I'd honestly love to see this.
People always say "you just need to learn to prompt better" without providing any context as to what "better" looks like. (And, presumes that my prompt isn't good enough, which maybe it is maybe it isn't.)
The easy way out of that is "well every scenario is different" - great, show me a bunch of scenarios on a speed run video across many problems, so I can learn by watching.
It's because you get to the No True Scotsman -thing pretty fast.
If I use LLMs to code, say a Telegram bot that summarise the family calendars and current weather to a channel - someone will come in saying "but LLMs are shit because they can't handle this very esoteric hardware assembler I use EVERY DAY!!1"
3 replies →
The Primeagen did this. He locked himself in for 7 weeks and tried to build a game by only using cursor. The videos are more entertainment style but this is the closest thing i know of
https://www.youtube.com/watch?v=6x3weFV9YI4
AI Coding is becoming an edge, and sharing your edge isn't the wisest thing to do, even more so when doubt is so prevalent!
Extremely doubtful.
This thread has hundreds of comments where people are screaming that everyone needs to learn AI coding.
If it was such an edge would they not otherwise keep quiet?
8 replies →
Unlikely. Programming is highly collaborative and efficiency is hard to measure. That creates incentives for programmers in competition to prioritize advertising their skills by demonstration over maintaining a secret edge. Be it at work or on the internet, if you help others by sharing your techniques, you'll make them want to work with you and impress them with how smart you are. If you keep it all secret to maintain your edge, people will think of you as unhelpful and they won't know how smart you are, because it's very difficult to judge how difficult the things you accomplished were. The reason people don't stream themselves vibe coding is that it's even less interesting to watch than regular coding.
It's not your edge anymore. It's AI's. And how is TFA any different?
Someone will always be idealistic enough to share. The fact that we do not see them now should raise a few eyebrows.
I must disagree. Sharing your edge is the wisest possible thing you can do on a societal level. For a slightly silly example: would it be better to have everyone doing guesswork knots for how to tie their shoes, or a single reliable 'rabbit ears' technique? From that you can see the benefits of having edges widely shared as a norm. That is the foundation of how society learns.
Just like blockchain, smart contracts and web 3.0 are the future!
I see all the negative responses, but this seems true to me. I am old enough to remember the dot com days and could see the transformative effect of the Internet from miles away when I was a teenager. Yet many, many people refused to acknowledge that someday soon we would do things like credit card transactions online, or that people might buy shoes without trying them on first, etc.
You could say it is a lack of imagination or not connecting the dots, but I think there is a more human reason. A lot of people don't want the disruption and are happy with the status quo. I'm a software engineer so I know how problematic AI may be for my job, but I think anyone who looks at our current state and the recent improvements should be able to see the writing on the wall here.
I for one am more curious than afraid of AI, because I have always felt that writing code was the worst part of being a programmer. I am much happier building product or solving interesting problems than tracking down elusive bugs or refactoring old codebases.
1 reply →
So, programmers once had an edge in having their source closed, then fell for the open source evangelism and started sharing their code, which enabled the training of AI models, and now the next iteration of what was called programmers before and is now known as vibe coders has this notion of having an edge in having their chatbot prompts closed again?
Let's all just muse some and imagine what the next cycle of this wheel will look like.
I have one very specific retort to the 'you are still responsible' point. High school kids write lots of notes. The notes frequently never get read, but the performance is worse without them: the act of writing them embeds them into your head. I allegedly know how to use a debugger, but I haven't in years: but for a number I could count on my fingers, nearly every bug report I have gotten I know exactly down to the line of code where it comes from, because I wrote it or something next to it (or can immediately ask someone who probably did). You don't get that with AI. The codebase is always new. Everything must be investigated carefully. When stuff slips through code review, even if it is a mistake you might have made, you would remember that you made it. When humans do not do the work, humans do not accrue the experience. (This may still be a good tradeoff, I haven't run any numbers. But it's not such an obvious tradeoff as TFA implies.)
I have to completely agree with this and nobody says this enough.
This tradeoff of unfamiliarity with the codebase has been a well-understood problem for decades. Maintaining a project is 99% of the time spent on a successful one.
In my opinion though, having AI write the initial code is just putting most people in a worse situation with almost no upside long term.
I agree. I'm bullish on AI for coding generally, but I am curious how they'd get around this problem. Even if they can code at a superhuman level, then you just get rarer, superhuman bugs. Or is another AI going to debug it? Unless this loop is basically fail-proof, does the human's job just become debugging the hardest things to debug (or at least the AI's blind spots)?
4 replies →
> When stuff slips through code review, even if it is a mistake you might have made, you would remember that you made it.
I don’t know. Ever had the experience of looking at 5+ year old code and thinking “what idiot wrote this crap” and then checking “git blame” and realising “oh, I’m the idiot… why the hell did I do this? struggling to remember” - given enough time, humans start to forget why they did things a certain way… and sometimes the answer is simply “I didn’t know any better at the time, I do now”
> You don't get that with AI. The codebase is always new.
It depends on how you use AI… e.g. I will often ask an AI to write me code to do X because it gets me over the “hump” of getting started… but now this code is in front of me on the screen, I think “I don’t like how this code is written, I’m going to refactor it…” and by the time I’m done it is more my code than the AI’s
Understanding code takes more effort than writing it, somehow. That's always been a huge problem in the industry, because code you wrote five years ago was written by someone else, but AI coding takes that from "all code in your org except the code you wrote in the past couple years" to "all code was written by someone else".
How well does your team work when you can't even answer a simple question about your system because nobody wrote, tested, played with the code in question?
How do you answer "Is it possible for our system to support split payments?" when not a single member of your team has even worked on the billing code?
No, code reviews do not familiarize an average dev to the level of understanding the code in question.
1 reply →
Oddly, I don't tend to get that experience very much. More often, it's "That's not how I'd naively write that code, there must be some catch to it. If only I had the foresight to write a comment about it..." Alas, I'm still not very good at writing enough comments.
1 reply →
> why the hell did I do this? struggling to remember
- git blame
- always write good commit messages
5 years? You’re a genius. I can’t make sense of stuff I wrote last week.
Similar to almost self driving cars where you are still responsible. You're asking someone to do nothing at all other than being highly alert for long periods of time. That's just not how people work. There is no way someone can be ready to take over in an instant without actively engaging in the driving.
I read a study[1] (caveat, not peer reviewed yet I don't think?) that seems to imply that you are correct.
[1]: https://www.microsoft.com/en-us/research/wp-content/uploads/...
This is a good point I think, and these steps take time and should definitely be done. I'm not sure people take this into account when talking about having AI code for them.
So do the thing that a student copying their notes from the board does: look at the PR on one monitor, and write your own equivalent PR by typing the changes line-for-line into your IDE on the other. Pretend copy/paste doesn’t exist. Pretend it’s code you saw in a YouTube video of a PowerPoint presentation, or a BASIC listing from one of those 1980s computing magazines.
(And, if you like, do as TFA says and rephrase the code into your own house style as you’re transcribing it. It’ll be better for it, and you’ll be mentally parsing the code you’re copying at a deeper level.)
You still didn't have to build the mental model, understand the subtle tradeoffs and make the decisions that arrived at that design.
I'm amazed that people don't see this. Absolutely nobody would claim that copying a novel is the same thing as writing a novel.
4 replies →
This is how a (video game) programming class in my high school was taught. You had to transcribe the code from a DigiPen book... then fix any broken code. I'm not entirely sure whether their many typos were intentional, but they very much helped us learn, because we had no choice but to correct their logic failures and typos to move on to the next section. I'm still surprised, 20 years later, how well that system worked to teach us and push us to branch out our understanding.
1 reply →
Is this just repeating labor? Why not just write it all yourself in the first place if you are just going to need to copy it over later?
> The codebase is always new. Everything must be investigated carefully.
That's dreadful. Not only is familiarity with the code not valued, it is impossible to build for your own sake/sanity.
100%. I had Gemini write code for a blog in Golang - it had some bugs and it took me some time to find them.
To me the sweet spot is that I write the code with the "help" of an LLM. That means I double-check everything it generates and prompt it to write code block by block, frequently acting as an editor.
Either you want human intervention for correctness and extension or you don't. Having LLMs write large swaths of code is like completely relying on Tesla's Autopilot - you are probably more stressed than if you just drove yourself.
The careful vetting of code and thoroughly testing it is super important, I would never even think of putting any generated code into any use without doing that.
Also your last comparison made me chuckle, good one :)
+1
Writing code is easier than long-term maintenance. Any programmer is able to write so much code that they will not be able to maintain it. Unless there are good AI tools helping with maintenance, there is no point in using generative tools for production code. From my experience, AI tools are great for prototyping or optimizing procrastination.
This level of knowledge is nearly impossible to maintain as the codebase grows though, beyond one or two people at a typical company. And tools need to exist for the new hire as well as the long-standing employee.
Welcome to project architecting, where the job isn't about putting more lines of code into this world, but more systems in place to track them. A well layered and structured codebase can grow for a very long time before it becomes too hard to maintain. And generally, the business complexity bites before the algorithmic one, and there's no quick fix for that.
1 reply →
Talking to an LLM feels like talking to Leonard Shelby from memento. https://youtube.com/watch?v=Y3kNTvXVHvQ
AI tools are elevating the developer to a higher level of abstraction: engineering manager or product manager. Those roles do not need to be familiar with code in that detail.
This is it. Reading AI slop does not form synapses in your brain like writing the code yourself does.
Exactly. See also https://hazelweakly.me/blog/stop-building-ai-tools-backwards... for a detailed look at this aspect of AI coding.
[flagged]
It might do it later... but odds are it'll require ingesting a post a human made about the exact problem while listing out what the correct output should have been.
1 reply →
The important thing you are missing is that the learning landscape has now changed.
You are now responsible for learning how to use LLMs well. If an untrained vibe coder is more productive for me, while knowing nothing about how the code actually works, I will hire the vibe coder instead of you.
Learning is important, but it's most important that you learn how to use the best tools available so you can be productive. LLMs are not going away and they will only get better, so today that means you are responsible for learning how to use them, and that is already more important for many roles than learning how to code yourself.
This is actually a good reason for exiting the industry before one's job goes away. Steering AI to barf up the right-looking pool of vomit is not the Flow-generating experience that many people started programming for.
1 reply →
Title nitpick: the number of people who care about AI for coding assistance is a relative minority. For everyone else, there's 'AI', which has a huge branding problem. 'AI' is filling all search results with trash, and creating trash websites full of trash to fill up the rest of the search results. It's generating trash images to put at the top of every blog post. It's eating up all the server load with scraping. It's what's been fooling my dad every day on Facebook.
When people are sick of AI, this is what they are talking about. AI hype people ignore this perspective each and every time.
It doesn't matter how great your paper mill's paper is: if you're dumping PCBs in the river, people are going to quite rightly get pissed off.
AI has a huge branding problem on HN but I haven’t found that to be true outside of tech.
Many people are very eager to see how AI can improve their lives.
I’m building a product for accountants that uses AI, initially I avoided using AI as a part of marketing copy. Turns out, people are much more interested when I tell them it uses AI behind the scenes.
It actually varies by segment. The least technical people are enthusiastic about AI. The moderately technical (e.g tech hobbyists, artists and power users) are repulsed. The segment of tech professionals are themselves divided, but I think based on how much they like/hate their career as programmers. Either way, what’s very different about this AI bubble is that the early adopters are predominantly the least technical users. This is a notable inversion of the typical adoption curve.
It's been so much more rewarding playing with AI coding tools on my own than through the subtle and not-so-subtle nudges at work. The work AI tools are a walled garden, have a shitty interface, and feel made to extract from me rather than to help me. In my personal stuff (downloading models, playing with them, the tooling, the interactions), it's all been so much more rewarding, giving me stable, comfortable workflows I can rely on and that work with my brain.
The dialog around it is so adversarial it's been hard figuring out how to proceed until dedicating a lot of effort to diving into the field myself, alone, on my personal time and learned what's comfortable to use it on.
Exactly, seems much skepticism comes from only scratching the surface of what's possible.
Is there a term for “skeptics just haven’t used it enough” argument?
Because it frequently got rolled out in crypto-currency arguments too.
7 replies →
The author defends mediocre code, yet wrote this piece:
https://fly.io/blog/vscode-ssh-wtf/
Where he dunks on how SSH access works in VSCode. I don't know. The code and architecture behind this feature may well be bananas, but it gets the work done. Sounds like a clear case of mediocrity. I wonder how he reconciles those two articles.
For me this is more of a clickbait. Both of the articles. With that in mind, if I am nuts for being sceptical of LLMs, I think it is fair to call the author a clickbaiter.
How a feature works, and how a feature is coded (how "nice" & canonical the code is) are two very different things.
He's dunking on the design of it.
Hmm. The author is a cryptography expert. So, honest question: Would he trust crypto code written by an AI?
(I suspect I know the answer, but I am not certain of it.)
The whole point of crypto is you don't trust code written by anyone, even yourself.
the AI is going to use crypto libraries just like you would?
I agree with the main take in this article: the combination of agents + LLMs with large context windows + a large budget of tokens to iterate on problems can probably already yield some impressive results.
I take serious issue with the "but you have no idea what the code is" rebuttal, since it - to me - skims over the single largest issue with applying LLMs anywhere where important decisions will be made based on their outputs.
To quote from the article:
The problem here is that LLMs are optimized to make their outputs convincing. The issue is exactly "whether you can make sense of the result", as the author said, or, in other words: whether you're immune to being conned by a model output that sounds correct but is not. Sure, "reading other people’s code is part of the job", but the failure modes of junior engineers are easily detectable. The failure modes of LLMs are not.
EDIT: formatting
It's also funny how it requires a lot of iterations for the average task, and the user has to pay for the failures. No other product has this expectation: imagine a toaster that only toasts bread 20% of the time, and 50% of the time it's half-toasted.
> The problem here is that LLMs are optimized to make their outputs convincing.
That may be true for chat aligned LLMs, but coding LLMs are trained w/ RL and rewards for correctness, nowadays. And there are efforts to apply this to the entire stack (i.e. better software glue, automatic guardrails, more extensive tool-use, access to LSP/debuggers/linters, etc).
I think this is the critical point in a lot of these debates that seem to be very popular right now. A lot of people try something and get the wrong impressions about what SotA is. It turns out that often that something is not the best way to do it (i.e. chatting in a web interface for coding), but people don't go the extra mile to actually try what would be best for them (i.e. coding IDEs, terminal agents, etc).
Which "coding LLMs" are you referring to that are trained purely on verifiably correct synthetic data? To my understanding o3, gemini 2.5 pro, claude 3.7 sonnet, etc. are all still aligned to human preferences using a reward function learned from human feedback. Any time a notion of success/correctness is deferred to a human, the model will have a chance to "game" the system by becoming more convincing as well as more correct.
Edit: thought I would include this below instead of in a separate comment:
Also, whether the models are trained purely on synthetic data or not, they suffer from these epistemological issues where they are unable to identify what they don't know. This means a very reasonable looking piece of code might be spit out for some out-of-distribution prompt where the model doesn't generalize well.
2 replies →
The argument that I've heard against LLMs for code is that they create bugs that, by design, are very difficult to spot.
The LLM has one job, to make code that looks plausible. That's it. There's no logic gone into writing that bit of code. So the bugs often won't be like those a programmer makes. Instead, they can introduce a whole new class of bug that's way harder to debug.
This is exactly what I wrote about when I wrote "Copilot Induced Crash" [0]
Funny story: when I first posted that and had a couple of thousand readers, I had many comments of the type "you should just read the code carefully on review", but _nobody_ pointed out the fact that the opening example (the so called "right code") had the exact same problem as described in the article, proving exactly what you just said: it's hard to spot problems that are caused by plausibility machines.
[0] https://www.bugsink.com/blog/copilot-induced-crash/
If it crashes, you are very lucky.
AI generated code will fuck up so many lives. The post office software in the UK did it without AI. I cannot imagine the way and the number of lives will be ruined since some consultancy vibe coded some government system. I might come to appreciate the German bureaucracy and backwardness.
My philosophy is to let the LLM either write the logic or write the tests - but not both. If you write the tests and it writes the logic and it passes all of your tests, then the LLM did its job. If there are bugs, there were bugs in your tests.
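A toy sketch of that split (the discount example and the names are mine, not from a real project): the human owns the tests, and the LLM is only allowed to fill in the function body.

    import pytest

    def apply_discount(price: float, rate: float) -> float:
        # This body is the part you'd hand to the LLM to write.
        if not 0.0 <= rate < 1.0:
            raise ValueError("rate must be in [0, 1)")
        return round(price * (1.0 - rate), 2)

    # Human-written tests: if these pass, the LLM did its job;
    # if a bug slips through, it was really a hole in the tests.
    def test_thirty_percent_off():
        assert apply_discount(100.0, 0.30) == 70.0

    def test_zero_discount_is_identity():
        assert apply_discount(42.0, 0.0) == 42.0

    def test_invalid_rate_rejected():
        with pytest.raises(ValueError):
            apply_discount(10.0, 1.5)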
That rather depends on the type of bug and what kinds of tests you would write.
LLMs are way faster than me at writing tests. Just prompt for the kind of test you want.
1 reply →
> My philosophy is to let the LLM either write the logic or write the tests - but not both. If you write the tests and it writes the logic and it passes all of your tests, then the LLM did its job. If there are bugs, there were bugs in your tests.
Maybe use one LLMs to write the code and a wildly different one to write the tests and yet another wildly different one to generate an English description of each test while doing critical review.
Disagree. You could write millions of tests for a function that simply sums two numbers, and it’s trivial to insert bugs while passing that test.
This is pretty nifty, going to try this out!
I don't agree. What I do agree on is not doing it with only one LLM.
Quality increases if I double check code with a second LLM (especially o4 mini is great for that)
Or double check tests the same way.
Maybe even write tests and code with different LLMs if that is your worry.
Yes, exactly - my (admittedly very limited!) experience has consistently generated well-written, working code that just doesn’t quite do what I asked. Often the results will be close to what I expect, and the coding errors do not necessarily jump out on a first line-by-line pass, so if I didn’t have a high degree of skepticism of the generated code in the first place, I could easily just run with it.
> working code that just doesn’t quite do what I asked
Code that doesn't do what you want isn't "working", bro.
Working exactly to spec is the code's only job.
1 reply →
For me it's mostly about the efficiency of the code they write. This is because I work in energy, where efficiency matters because our datasets are so ridiculously large and every interface to that data is so ridiculously bad. I'd argue that for 95% of the software out there it won't really matter if you use a list or a generator in Python to iterate over data. It probably should, and maybe this will change with cloud costs continuously increasing, but we do also live in a world where 4chan ran on some Apache server running a 10k-line PHP file from 2015...
Anyway, this is where AIs have been really bad for us, as well as sometimes "overengineering" their bug prevention in extremely inefficient ways. The flip side of this is of course that a lot of human programmers would make the same mistakes.
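For what it's worth, the list-versus-generator point is easy to demonstrate on a toy example; the numbers are made up, but the pattern is exactly what bites on large datasets:

    # Toy illustration: the list comprehension materializes every element in
    # memory before summing, while the generator expression streams one at a time.
    def readings(n: int = 1_000_000):
        for i in range(n):
            yield i * 0.001  # stand-in for rows from a huge dataset

    total_list = sum([r * 1.1 for r in readings()])  # O(n) extra memory
    total_gen = sum(r * 1.1 for r in readings())     # O(1) extra memory

    assert total_list == total_gen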
I’ve had the opposite experience. Just tell it to optimise for speed and iterate and give feedback. I’ve had JS code optimised specifically for v8 using bitwise operations. It’s brilliant.
2 replies →
>Instead, they can introduce a whole new class of bug that's way harder to debug
That sounds like a new opportunity for a startup that will collect hundreds of millions of dollars, brag about how their new AI prototype is so smart that it scares them, and deliver nothing.
> There's no logic gone into writing that bit of code.
What makes you say that? If LLMs didn't reason about things, they wouldn't be able to do one hundredth of what they do.
This is a misunderstanding. Modern LLMs are trained with RL to actually write good programs. They aren't just spewing tokens out.
No, YOU misunderstand. This isn't a thing RL can fix
It doesn't optimize "good programs". It optimizes humans' interpretation of good programs. More accurately, it optimizes what low-paid, overworked humans believe are good programs. Are you hiring your best and brightest to code review the LLMs?
Even if you do, it still optimizes for tricking them. It will also optimize for writing good programs, but you act like that's a well-defined and measurable thing.
35 replies →
"Good" is the context of LLMs means "plausible". Not "correct".
If you can't code then the distinction is lost on you, but in fact the "correct" part is why programmers get paid. If "plausible" were good enough then the profession of programmer wouldn't exist.
2 replies →
They are also trained with RL to write code that passes unit tests, and Claude does have a big problem with trying to cheat the tests or the request pretty quickly after running into issues, which makes manual edit approval more important. It usually still tells you when it is trying to do something wrong, so you can often find out from its summary before having to scan the diff.
This can happen, but in practice, given I'm reviewing every line anyway, it almost never bites me.
The argument seems to be that for an expert programmer, who is capable of reading and understanding AI agent code output and merging it into a codebase, AI agents are great.
Question: If everyone uses AI to code, how does someone become an expert capable of carefully reading and understanding code and acting as an editor to an AI?
The expert skills needed to be an editor -- reading code, understanding its implications, knowing what approaches are likely to cause problems, recognizing patterns that can be refactored, knowing where likely problems lie and how to test them, holding a complex codebase in memory and knowing where to find things -- currently come from long experience writing code.
But a novice who outsources their thinking to an LLM or an agent (or both) will never develop those skills on their own. So where will the experts come from?
I think of this because of my job as a professor; many of the homework assignments we use to develop thinking skills are now obsolete because LLMs can do them, permitting the students to pass without thinking. Perhaps there is another way to develop the skills, but I don't know what it is, and in the mean time I'm not sure how novices will learn to become experts.
> Question: If everyone uses AI to code, how does someone become an expert capable of carefully reading and understanding code and acting as an editor to an AI?
Well, if everyone uses a calculator, how do we learn math?
Basically, force students to do it by hand long enough that they understand the essentials. Introduce LLMs at a point similar to when you allow students to use a calculator.
> Well, if everyone uses a calculator, how do we learn math?
Calculators have made most people a lot worse at arithmetic. Many people, for instance, don't even grasp what a "30%" discount is. I mean other than "it's a discount" and "it's a bigger discount than 20% and a smaller one than 40%". I have seen examples where people don't grasp that 30% is roughly one third. It's just a discount; they trust it.
GPS navigation has made most people a lot worse at reading maps or generally knowing where they are. I have multiple examples where I would say something like "well we need to go west, it's late in the day so the sun will show us west" and people would just not believe me. Or where someone would follow their GPS on their smartphone around a building to come back 10m behind where they started, without even realising that the GPS was making them walk the long way around the building.
Not sure the calculator is a good example to say "tools don't make people worse with the core knowledge".
54 replies →
If calculators were unreliable... Well, we'd be screwed if everyone blindly trusted them and never learned math.
They'd also be a whole lot less useful. Calculators are great because they always do exactly what you tell them. It's the same with compilers, almost: imagine if your C compiler did the right thing 99.9% of the time, but would make inexplicable errors 0.1% of the time, even on code that had previously worked correctly. And then CPython worked 99.9% of the time, except it was compiled by a C compiler working 99.9% of the time, ...
But bringing it back on-topic, in a world where software is AI-generated, and tests are AI-generated (because they're repetitive, and QA is low-status), and user complaints are all fielded by chat-bots (because that's cheaper than outsourcing), I don't see how anyone develops any expertise, or how things keep working.
2 replies →
While I agree with your suggestion, the comparison does not hold: calculators do not tell you which numbers to input and compute. With an LLM you can just ask vaguely, and get an often passable result
1 reply →
> a novice who outsources their thinking to an LLM or an agent (or both) will never develop those skills on their own. So where will the experts come from?
Well, if you’re a novice, don’t do that. I learn things from LLMs all the time. I get them to solve a problem that I’m pretty sure can be solved using some API that I’m only vaguely aware of, and when they solve it, I read the code so I can understand it. Then, almost always, I pick it apart and refactor it.
Hell, just yesterday I was curious about how signals work under the hood, so I had an LLM give me a simple example, then we picked it apart. These things can be amazing tutors if you’re curious. I’m insatiably curious, so I’m learning a lot.
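For the curious, the kind of "simple example" I mean is something like this minimal sketch (mine, not the one the model gave me): a signal handler you can poke at and then pick apart with the model.

    # Register a SIGINT handler, then press Ctrl+C during the sleep to see it fire.
    import signal
    import time

    def on_sigint(signum, frame):
        print(f"caught signal {signum}; cleaning up before exit")
        raise SystemExit(0)

    signal.signal(signal.SIGINT, on_sigint)
    print("sleeping for 30s; press Ctrl+C to trigger the handler")
    time.sleep(30)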
Junior engineers should not vibe code. They should use LLMs as pair programmers to learn. If they don’t, that’s on them. Is it a dicey situation? Yeah. But there’s no turning back the clock. This is the world we have. They still have a path if they want it and have curiosity.
> Well, if you’re a novice, don’t do that.
I agree, and it sounds like you're getting great results, but they're all going to do it. Ask anyone who grades their homework.
Heck, it's even common among expert users. Here's a study that interviewed scientists who use LLMs to assist with tasks in their research: https://doi.org/10.1145/3706598.3713668
Only a few interviewees said they read the code through to verify it does what they intend. The most common strategy was to just run the code and see if it appears to do the right thing, then declare victory. Scientific codebases rarely have unit tests, so this was purely a visual inspection of output, not any kind of verification.
> Junior engineers should not vibe code. They should use LLMs as pair programmers to learn. If they don’t, that’s on them. Is it a dicey situation? Yeah. But there’s no turning back the clock. This is the world we have. They still have a path if they want it and have curiosity.
Except it's impossible to follow your curiosity when everything in the world is pushing against it (unless you are already financially independent and only programming for fun). Junior developers compete in one of the most brutal labor markets in the world, and their deliverables are more about getting things done on time than doing things better. What they "should" do goes out the window once you step out of privilege and look at the real choices.
You sound like an active learner who could become a top programmer even without LLMs. Most students will take the path of least resistance.
3 replies →
This reminds me of Isaac Asimov's "Profession" short story. Most people receive their ability (and their matching assigned profession, thus the name) from a computer. They then are able to do the necessary tasks for their job, but they can't advance the art in any way. A few people aren't compatible with this technology, and they instead learn to do things themselves, which is fortunate because it's the only way to advance the arts.
Deliberate practice, which may take a form different from productive work.
I believe it's important for students to learn how to write data structures at some point. Red black trees, various heaps, etc. Students should write and understand these, even though almost nobody will ever implement one on the job.
Analogously electrical engineers learn how to use conservation laws and Ohm's law to compute various circuit properties. Professionals use simulation software for this most of the time, but learning the inner workings is important for students.
The same pattern is true of LLMs. Students should learn how to write code, but soon the code will write itself and professionals will be prompting models instead. In 5-10 years none of this will matter though because the models will do nearly everything.
I agree with all of this. But it's already very difficult to do even in a college setting -- to force students to get deliberate practice, without outsourcing their thinking to an LLM, you need various draconian measures.
And for many professions, true expertise only comes after years on the job, building on the foundation created by the college degree. If students graduate and immediately start using LLMs for everything, I don't know how they will progress from novice graduate to expert, unless they have the self-discipline to keep getting deliberate practice. (And that will be hard when everyone's telling them they're an idiot for not just using the LLM for everything)
You're talking about students, but the question was about seniors. You don't go to school to become a senior dev, you code in real-world settings, with real business pressures, for a decade or two to become a senior. The question is how are decent students supposed to grow into seniors who can independently evaluate AI-produced code if they are forced to use the magic box and accept its results before being able to understand them?
1 reply →
> Question: If everyone uses AI to code, how does someone become an expert capable of carefully reading and understanding code and acting as an editor to an AI?
LLMs are very much like pair programmers in my experience. For the junior engineer, they are excellent resources for learning, the way a senior engineer might be. Not only can they code what the junior can’t, they can explain questions the junior has about the code and why it’s doing what it’s doing.
For senior devs, it is a competent pair programmers, acting as an excellent resource for bouncing ideas off of, rubber ducking, writing boilerplate, and conducting code reviews.
For expert devs, it is a junior/senior dev you can offload all the trivial tasks to so you can focus on the 10% of the project that is difficult enough to require your expertise. Like a junior dev, you will need to verify what it puts together, but it’s still a huge amount of time saved.
For junior devs specifically, if they are not curious and have no interest in actually learning, they will just stop at the generated code and call it a day. That’s not an issue with the tool, it’s an issue with the dev. For competent individuals with a desire to learn and grow, LLMs represent one of the single best resources to do so. In that sense, I think that junior devs are at a greater advantage than ever before.
> That’s not an issue with the tool, it’s an issue with the dev.
Hard disagree here. It makes a difference whether you work on a task because you feel it brings you tangible progress, or because it's an artificial exercise that you could really do with one sentence to Claude if it weren't for the constraints of the learning environment. That feeling is actually demotivating for learning.
I don’t know about you, but I use LLMs as gateways to knowledge. I can set a deep research agent free on the internet with context about my current experience level, preferred learning format (books), what I’m trying to ramp up on, etc. A little while later, I have a collection of the definitive books for ramping up in a space. I then sit down and work through the book doing active recall and practice as I go. And I have the LLM there for Q&A while I work through concepts and “test the boundaries” of my mental models.
I’ve become faster at the novice -> experienced arc with LLMs, even in domains that I have absolutely no prior experience with.
But yeah, the people who just use LLMs for “magic oracle please tell me what do” are absolutely cooked. You can lead a horse to water, but you can’t make it drink.
If no one really becomes an expert anymore, that seems like great news for the people who are already experts. Perhaps people actively desire this.
Problem is, at some point those experts retire or change their focus and you end up with COBOL problem.
Except instead of just one language on enterprise systems no one wants to learn because there is no money in them, it's everything.
1 reply →
It’s a great point and one I’ve wondered myself.
Arguments are made consistently about how this can replace interns or juniors directly. Others say LLMs can help them learn to code.
Maybe, but not on your codebase or product and not with a seniors knowledge of pitfalls.
I wonder if this will be programming's iPhone moment, where we start seeing a lack of the deep knowledge needed to troubleshoot. I can tell you that we're already seeing a glut of security issues being explained by devs as "I asked Copilot if it was secure and it said it was fine, so I committed it".
> I can tell you that we’re already seeing a glut of security issues being explained by devs as “I asked copilot if it was secure and it said it was fine so I committed it”.
And as with Google and Stack Overflow before, the Sr Devs will smack the wrists of the Jr's that commit untested and unverified code, or said Jr's will learn not to do those things when they're woken up at 2 AM for an outage.
1 reply →
The way the responses to this subthread show the classical "the problem doesn't exist - ok, it does exist but it's not a big deal - ok, it is a big deal but we should just adapt to it" progression makes me wonder if we found one of the few actually genuine objections to LLM coding.
Nail on head. Before, innovations in code were extensions of a human's capabilities. The LLM-driven generation could diminish the very essence of writing meaningful code, to the point where they will live in the opposite of a golden era. The dead internet theory may yet prevail.
I think a large fraction of my programming skills come from looking through open source code bases. E.g. I'd download some code and spend some time navigating through files looking for something specific, e.g. "how is X implemented?", "what do I need to change to add Y?".
I think it works a bit like pre-training: to find what you want quickly you need to have a model of coding process, i.e. why certain files were put into certain directories, etc.
I don't think this process is incompatible with LLM use...
If I were a professor, I would make my homework start the same -- here is a problem to solve.
But instead of asking for just working code, I would create a small wrapper for a popular AI. I would insist that the student use my wrapper to create the code. They must instruct the AI how to fix any non-working code until it works. Then they have to tell my wrapper to submit the code to my annotator. Then they have to annotate every line of code as to why it is there and what it is doing.
Why my wrapper? So that you can prevent them from asking it to generate the comments, and so that you know that they had to formulate the prompts themselves.
They will still be forced to understand the code.
Then double the number of problems, because with the AI they should be 2x as productive. :)
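A minimal sketch of what such a wrapper could look like; call_llm is a stand-in for whatever model API the course actually provides, and the hand-off to the grader is left out:

    # Hypothetical homework wrapper: every prompt is logged so the instructor
    # knows the student formulated it, and submission requires per-line annotations.
    import json
    import time

    def call_llm(prompt: str) -> str:
        """Stand-in for the course's real model API."""
        raise NotImplementedError("wire this up to the model the course provides")

    class HomeworkSession:
        def __init__(self, student_id: str, log_path: str):
            self.student_id = student_id
            self.log_path = log_path

        def ask(self, prompt: str) -> str:
            reply = call_llm(prompt)
            with open(self.log_path, "a") as f:
                f.write(json.dumps({"t": time.time(), "student": self.student_id,
                                    "prompt": prompt, "reply": reply}) + "\n")
            return reply

        def submit(self, code: str, annotations: dict) -> None:
            # Refuse the submission unless every line carries a student-written note.
            lines = code.splitlines()
            missing = [i for i in range(1, len(lines) + 1)
                       if not annotations.get(i, "").strip()]
            if missing:
                raise ValueError(f"annotate every line before submitting; missing: {missing}")
            # ...hand code + annotations to the grader here...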
For introductory problems, the kind we use to get students to understand a concept for the first time, the AI would likely (nearly) nail it on the first try. They wouldn't have to fix any non-working code. And annotating the code likely doesn't serve the same pedagogical purpose as writing it yourself.
Students emerge from lectures with a bunch of vague, partly contradictory, partly incorrect ideas in their head. They generally aren't aware of this and think the lecture "made sense." Then they start the homework and find they must translate those vague ideas into extremely precise code so the computer can do it -- forcing them to realize they do not understand, and forcing them to make the vague understanding concrete.
If they ask an AI to write the code for them, they don't do that. Annotating has some value, but it does not give them the experience of seeing their vague understanding run headlong into reality.
I'd expect the result to be more like what happens when you show demonstrations to students in physics classes. The demonstration is supposed to illustrate some physics concept, but studies measuring whether that improves student understanding have found no effect: https://doi.org/10.1119/1.1707018
What works is asking students to make a prediction of the demonstration's results first, then show them. Then they realize whether their understanding is right or wrong, and can ask questions to correct it.
Post-hoc rationalizing an LLM's code is like post-hoc rationalizing a physics demo. It does not test the students' internal understanding in the same way as writing the code, or predicting the results of a demo.
> They will still be forced to understand the code.
But understanding is just one part of the learning process, isn't it? I assume everybody has had this feeling: the professor explains maths on the blackboard, and the student follows. The students "understands" all the steps: they make sense, they don't feel like asking a question right now. Then the professor gives them an exercise slightly different and asks to do the same, and the students are completely lost.
Learning is a loop: you need to accept it, get it in your memory (learn stuff by heart, be it just the vocabulary to express the concepts), understand it, then try to do it yourself. Realise that you missed many things in the process, and start at the beginning: learn new things by heart, understand more, try it again.
2 replies →
Yep, this is the thing I worry about as well.
I find these tools incredibly useful. But I constantly edit their output and frequently ask for changes to other peoples' code during review, some of which is AI generated.
But all of that editing and reviewing is informed by decades of writing code without these tools, and I don't know how I would have gotten the reps in without all that experience.
So I find myself bullish on this for myself and the experienced people I work with, but worried about training the next generation.
Yes, I feel the same way. But I worry about my kids. My 15-year-old son wanted to go into software engineering and work for a game studio. I think I'll advocate for civil engineering, but for someone who will still be working 50 years from now it's really hard to know what will be a good field right now.
1 reply →
> So where will the experts come from?
They won't, save for a relative minority of those who enjoy doing things the hard way or those who see an emerging market they can capitalize on (slop scrubbers).
I wrote this post [1] last month to share my concerns about this exact problem. It's not that using AI is bad necessarily (I use it every day), but it disincentivizes real learning and competency. And once using AI is normalized to the point where true learning (not just outcome-seeking) becomes optional, all hell will break loose.
> Perhaps there is another way to develop the skills
Like sticking a fork in a light socket, the only way to truly learn is to try it and see what happens.
[1] https://ryanglover.net/blog/chauffeur-knowledge-and-the-impe...
LLMs are also great to ask questions about existing code. It's like being able to converse with StackOverflow.
I don't know if I'm convinced by this. Like, if we were talking about novels, you don't have to be a writer to check grammar and analyze plot structure in a passable way. It is possible to learn by reading instead of doing.
Sure, you could learn about grammar, plot structure, narrative style, etc. and become a reasonable novel critic. But imagine a novice who wants to learn to do this and has access to LLMs to answer any question about plots and style that they want. What should they do to become a good LLM-assisted author?
The answer to that question is very different from how to become an author before LLMs, and I'm not actually sure what the answer is. It's not "write lots of stories and get feedback", the conventional approach, but something new. And I doubt it's "have an LLM generate lots of stories for you", since you need more than that to develop the skill of understanding plot structures and making improvements.
So the point remains that there is a step of learning that we no longer know how to do.
I've had a lot of success using LLMs to deepen my understanding of topics. Give them an argument, and have them give the best points against it. Consider them, iterate. Argue against it and let it counter. It's a really good rubber duck
> The expert skills... currently come from long experience writing code
Do they? Is it the writing that's important? Or is it the thinking that goes along with it? What's stopping someone from going through LLM output, going back and forth on design decisions with the LLM, and ultimately making the final choice of how the tool should mold the codebase after seeing the options
I mean of course this requires some proactive effort on your part.. but it always did
The key point I think though is to not outsource your thinking. You can't blindly trust the output. It's a modern search engine
I think it's the writing.
I learned long ago that I could read a book, study it, think about it. And I still wouldn't really master the material until I built with it.
> If everyone uses AI to code, how does someone become an expert
The same way they do now that most code is being copied/integrated from StackOverflow.
I had this conversation with a friend:
HIM: AI is going to take all entry level jobs soon.
ME: So the next level up will become entry level?
HIM: Yes.
ME: Inductively, this can continue up to the CEO. What about the CEO?
HIM: Wait...
1 reply →
This is such a non-issue and so far down the list of questions. We've invented AI that can code, and you're asking about career progression? That's the top thing to talk about? We've given life to essentially an alien life form.
"What is this going to do to humans?" is probably the #1 question that should be on the mind of every engineer, every day. Being toolmakers for civilization is the entire point of our profession.
I'll take the opposite view of most people. Expertise is a bad thing. We should embrace technological changes that render expertise economically irrelevant with open arms.
Take a domain like US taxation. You can certainly become an expert in that, and many people do. Is it a good thing that US taxes are so complicated that we have a market demand for thousands of such experts? Most people would say no.
Don't get me wrong, I've been coding for more years of my life than not by this point, and I love the craft. I still think younger me would have far preferred a world where he could have just had GPT do it all for him, so he didn't need to spend his lunch hours poring over the finer points of e.g. Python iterators.
By the same logic we should allow anyone with an LLM to design ships, bridges, and airliners.
Clearly, it would be very unwise to buy a bridge designed by an LLM.
It's part of a more general problem - the engineering expectations for software development are much lower than for other professions. If your AAA game crashes, people get annoyed but no one dies. If your air traffic control system fails, you - and a large number of other poeple - are going to have a bad day.
The industry has a kind of glib unseriousness about engineering quality - not theoretical quality, based on rules of thumb like DRY or faddish practices, but measurable reliability metrics.
The concept of reliability metrics doesn't even figure in the LLM conversation.
That's a very bizarre place to be.
> We should embrace technological changes that render expertise economically irrelevant with open arms.
To use your example, is using AI to file your taxes actually "rendering [tax] expertise economically irrelevant?" Or is it just papering over the over-complicated tax system?
From the perspective of someone with access to the AI tool, you've somewhat eased the burden. But you haven't actually solved the underlying problem (with the actual solution obviously being a simpler tax code). You have, on the other hand, added an extra dependency on top of an already over-complicated system.
4 replies →
The question then becomes whether or not it's possible (or will be possible) to effectively use these LLMs for coding without already being an expert. Right now, building anything remotely complicated with an LLM, without scouring over every line of code generated, is not possible.
Counter-counter point. The existence of tools like this can allow the tax code to become even more complex.
2 replies →
Don't think of it from the perspective of someone who had to learn. Think of it from the perspective of someone who has never experienced the friction of learning at all.
But that is incompatible with the fact that you need be an expert to wield this tool effectively.
There's something that seems to be missing in all these posts and that aligns with my personal experience trying to use AI coding assistants.
I think in code.
To me, having to translate that thinking into natural language for the LLM to translate it back into code makes very little sense.
Am I alone in this camp? What am I missing?
If you think in code, try prompting them in code.
I quite often prompt with code in a different language, or pseudo-code describing roughly what I am trying to achieve, or a Python function signature without the function body.
Or I will paste in a bunch of code I have already written with a comment somewhere that says "TODO: retrieve the information from the GitHub API" and have the model finish it for me.
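To make that concrete, here is a minimal sketch of the kind of prompt I mean: a hand-written signature, docstring, and TODO, with the body deliberately left for the model to fill in. The function name, its parameter, and the endpoint are hypothetical, invented purely for illustration.

    # Everything below is the prompt itself: a signature, a docstring, and a TODO.
    # fetch_open_issues, its repo parameter, and the endpoint URL are hypothetical.

    def fetch_open_issues(repo: str) -> list[dict]:
        """Return the open issues for a GitHub repository as a list of dicts.

        TODO: retrieve the information from the GitHub API
        (GET https://api.github.com/repos/{repo}/issues?state=open)
        and return the decoded JSON objects.
        """
        ...  # the model writes this part

Even if the model botches the body, the hand-written signature and docstring are still useful to me and to any future human reader.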
> a Python function signature without the function body.
This, and for multiple functions that end up composing well together as per their signatures. Maybe there's one public function I want to document well, so I write the docstring myself, and it composes 3-4 other internal functions which I'd let the LLM implement.
The nice part is that even if the LLM fails, none of that work is lost, as opposed to some weird spec I'd feed an LLM that's too verbose for a human reader, or a series of prompts.
I'm also in this camp, and that's why it does not work for me.
Natural language is just a terrible interface and fundamentally not an appropriate one to communicate with a computer.
I wonder if I'm in the minority here because I'm neurodivergent.
No. Thanks for pointing this out. It took quite some time to understand this and express it to my management. Also, I use a very small set of libraries, with function names and parameters that are quite explicit. And I don't need to specify some style to work around hidden prompts and bad coding practices.
I am pretty sure we will see programming languages that are custom-made for AI popping up soon. I cannot predict what they will look like, but they may be a mix between an extremely well documented language with lots of safeguards (like Kotlin or Java) and natural-language-like instructions.
So, basically, you envision that we will eventually add a type system to English?
I know what you mean. The thing is: if you already have the solution put together in your mind, it might be faster to just implement it by hand.
But if you don't have the shape of a solution? Might be faster to have an AI find it. And then either accept AI's solution as is, or work off it.
Curious how he reconciles this:
> If you build something with an LLM that people will depend on, read the code. In fact, you’ll probably do more than that. You’ll spend 5-10 minutes knocking it back into your own style.
with Joel Spolsky's fundamental maxim:
> It’s harder to read code than to write it.
https://www.joelonsoftware.com/2000/04/06/things-you-should-...
I'm sure no one reads LLM-produced code very carefully, especially if it "works". At best they skim.
The problem is that a lot of code works in general but fails in edge cases. I would hate to be the guy whose job is only to find out why verbose AI-generated code fails in one particular condition.
I read LLM generated code like I review a PR. I skim for anything that stands out as a common pitfall, and dig into the details of area I expect issues.
For most things I'm not willing to accept faster code at the expense of being an expert in the code.
So I am still trying to find the right amount of reading, editing, and reworking that gets the job done faster, where "the job" includes me being an expert in the produced code, not just the production of code.
There are periods of skimming but I'm doing a lot more than skimming.
This is my biggest problem with LLM generated code.
But I'm sure the author will simply dismiss it as a skill issue too.
The major difference is that in the type of reading Joel Spolsky is talking about, you are coming in not knowing the code's intent. It was written by one or more other people at some point in the past, likely with many iterative changes over a period of time. Figuring out the intent in this case is 90%+ of the work. With LLM-generated code, you know the intent. You just told the assistant exactly what your intent was. It's much, much easier to read code whose intent you already know.
You can invest more time in the writing to make code easier to read. It’s just that it typically would take much more time. It would be conceivable for LLMs to eventually reach a state where they can produce such easily readable and comprehensible code.
I can reconcile that. Joel is right. So is Thomas. If you're a professional programmer you should be able to do the harder thing.
Thomas was a bit more spicy about that:
> but you have no idea what the code is - Are you a vibe coding Youtuber? Can you not read code? If so: astute point. Otherwise: what the fuck is wrong with you?
You've said this a couple of times, but I don't understand what you're trying to say.
Yes, I can do frustrating things, I know how to review and vet code. I also know how to write boilerplate code. I also know how to research new tasks in areas where I have no familiarity that are poorly documented. I know how to do a lot of complicated, difficult things - all of which are part of being a professional developer.
The question is whether I want to use a tool that makes most of my job the difficult part.
The only way this makes sense is if you are somehow reaching a point where "doing the harder thing" is not the harder thing for you anymore - where reviewing code is easier for you than writing it. And maybe your argument is that you can get to that point with practice, or that LLM code is generally easier to review than other types of code, or that the problems it's tackling are so difficult that in those cases writing the code is harder than reading it.
But it's not that they're both true. "You should be able to do the harder thing" is not really an answer to "why are you selling me a tool that replaces an easy thing with a hard thing?"
There are many difficult things that I can do as a professional software developer. I can mentor junior developers. I can do detailed product design work with stakeholders and translate technical limitations into language that they understand. I can negotiate software deadlines with product owners. I can write interfaces for undocumented software and deal with undocumented bugs in 3rd-party code. I can step through minified code in production settings to debug problems. These are all difficult things that, as a professional developer, I am capable and willing to do, and often need to do. And yes, of course, I can review pull requests. I am not, however, generally in the business of adopting tools that force me to do that stuff more often than is necessary to get good results. I don't adopt tools that make my life harder, and I honestly think that's a big part of being a professional.
To be very blunt about it: "Buck up, you should be able to handle this" is not a sales pitch. I can also write with my non-dominant hand, but I'm not going to start taking notes that way. There's no intrinsic prize for making your life harder, the only way what you're saying makes sense is if it's not harder for you to read code than to write it.
3 replies →
To quote an excellent article from last week:
> The AI has suggested a solution, but the added code is arguably useless or wrong. There is a huge decision space to consider, but the AI tool has picked one set of decisions, without any rationale for this decision.
> [...]
> Programming is about lots of decisions, large and small. Architecture decisions. Data validation decisions. Button color decisions.
> Some decisions are inconsequential and can be safely outsourced. There is indeed a ton of boilerplate involved in software development, and writing boilerplate-heavy code involves near zero decisions.
> But other decisions do matter.
(from https://lukasatkinson.de/2025/net-negative-cursor/)
Proponents of AI coding often talk about boilerplate as if that's what we spend most of our time on, but boilerplate is a cinch. You copy/paste, change a few fields, and maybe run a macro on it. Or you abstract it away entirely. As for the "agent" thing, typing git fetch, git commit, git rebase takes up even less of my time than boilerplate.
Most of what we write is not highly creative, but it is load-bearing, and it's full of choices. Most of our time is spent making those choices, not typing out the words. The problem isn't hallucination, it's the plain bad code that I'm going to have to rewrite. Why not just write it right myself the first time? People say "it's like a junior developer," but do they have any idea how much time I've spent trying to coax junior developers into doing things the right way rather than just doing them myself? I don't want to waste time mentoring my tools.
No, what's happening here is that you're using a different definition of "boilerplate" than the adopters are using. To you, "boilerplate" is literally a chunk of code you copy and paste to repeatedly solve a problem (btw: I flip my shit when people do this on codebases I work on). To them, "boilerplate" represents a common set of rote solutions to isomorphic problems. The actual lines of code might be quite different, but the approach is the same. That's not necessarily something you can copy-paste.
Coming at this from a computer-science or PLT perspective, this idea of an "abstract, repeatable meta-boilerplate" is exactly the payoff we expect from language features like strong type systems. Part of the point of rigorous languages is to create these kinds of patterns. You had total expressiveness back in assembly language! Repeatable rigor is most of the point of modern languages.
> To them, "boilerplate" represents a common set of rote solutions to isomorphic problems.
That's what libraries and frameworks are here for. And that's why no experienced engineers consider those an issue. What's truly important is the business logic, then you find a set of libraries that solves the common use cases and you write the rest. Sometimes you're in some novel space that doesn't have libraries (new programming language), but you still have specs and reference implementation that helps you out.
The actual boilerplate is when you have to write code twice because the language ecosystem doesn't have good macros à la Lisp that would let you invent some metastuff for the problem at hand (think writing routers for Express.js).
3 replies →
Copy pasting code that could be abstracted is not a usage of boilerplate I've ever encountered, usually it's just a reference to certain verbose languages where you have to write a bunch of repetitive low-entropy stuff to get anywhere, like getters and setters in java classes.
2 replies →
> The actual lines of code might be quite different, but the approach is the same. That's not necessarily something you can copy-paste.
Assuming something like "a REST endpoint which takes a few request parameters, makes a DB query, and returns the response" fits what you're describing, you can absolutely copy/paste a similar endpoint, change the parameters and the database query, and rename a couple variables—all of which takes a matter of moments.
Naturally code that is being copy-pasted wholesale with few changes is ripe to be abstracted away, but patterns are still going to show up no matter what.
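As a hedged sketch of that endpoint shape (Flask, SQLite, and the orders table are all assumptions made up for illustration), the copy/pasted neighbour would differ only in the route, the parameters, and the query:

    # A sketch of the "parameters in, DB query, JSON out" endpoint shape above.
    # Flask, SQLite, and the orders table are assumptions for illustration only.

    import sqlite3

    from flask import Flask, jsonify, request

    app = Flask(__name__)

    @app.get("/orders")
    def list_orders():
        # Pull a couple of request parameters with defaults.
        status = request.args.get("status", "open")
        limit = int(request.args.get("limit", 50))

        # Run a parameterised query and return the rows as JSON.
        with sqlite3.connect("app.db") as conn:
            rows = conn.execute(
                "SELECT id, status, total FROM orders WHERE status = ? LIMIT ?",
                (status, limit),
            ).fetchall()
        return jsonify([{"id": r[0], "status": r[1], "total": r[2]} for r in rows])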
3 replies →
> solutions to isomorphic problems
“Isomorphic” is a word that describes a mapping (or a transformation) that preserves some properties that we believe to be important.
The word you’re looking for is probably “similar” not “isomorphic”. It sure as hell doesn’t sound as fancy though.
3 replies →
But what do you make of the parent’s second paragraph? This is the way I feel as well - I would rather not spend my time asking AI to do something right that I could just do myself.
I bit the bullet last week and tried to force myself to use a solution built end to end by AI. By the time I’d finished asking it to make changes (about 25 in total), I would’ve had a much nicer time doing it myself.
The thing in question was admittedly partially specified. It was a yaml-based testing tool for running some scenarios involving load tests before and after injecting some faults in the application. I gave it the yaml schema up front, and it did a sensible job as a first pass. But then I was in the position of reading what it wrote, seeing some implicit requirements I’d not specified, and asking for those.
Had I written it myself from the start, those implicit requirements would’ve been more natural to think about in the progression of iterating on the tool. But in this workflow, I just couldn’t get in a flow state - the process felt very unnatural, not unlike how it would’ve been to ask a junior to do it and taking 25 rounds of code review. And that has always been a miserable task, difficult to force oneself to stay engaged with. By the end I was much happier making manual tweaks and wish I’d have written it myself from the start.
I'm firmly convinced at this point that there is just no arguing with the haters. At the same time, it feels like this transition is as inevitable as the transition to mobile phones. LLM's are everywhere, and there's no escaping it no matter how much you might want to.
There's always some people that will resist to the bitter end, but I expect them to be few and far between.
3 replies →
Your article comes across like you think every developer is the exact same as you, very egocentric piece
Not everyone is just cranking out hacked together MVPs for startups
Do you not realize there are many many other fields and domains of programming?
Not everyone has the same use case as you
10 replies →
At least when you mentor an actual junior developer they often learn, and you can take satisfaction in aiding the growth of a human being. All the time and effort spent coaxing an LLM to "do better" either disappears in a puff of smoke the next time it goes schizoid and needs to have its context cleared or- at best- is recorded to help a for-profit company train their next generation of products.
Like everything else about the "GenAI" fad, it boils down to extractively exploiting goodwill and despoiling the commons in order to convert VC dollars into penny-shavings.
Boilerplate is a cinch when you already know what to do.
I work in finance, I have for almost 20 years now. There are things in finance you do once every 5 years, like setting up a data source like Bloomberg in a new programming language. Now you know from the last time you did it that it’s a pain, you need to use a very low level api, handling all the tiny messages yourself, building up the response as it comes from the source in unordered packets. It’s asynchronous, there is a message queue, and what I specialize in is maths.
Now I could spend hours reading documents, putting crap together, and finally come up with some half baked code that ignores most possible error points.
Or I could use ChatGPT and leverage the fact that hundreds of implementations of the same module exist out there. And make something that just works.
That is the first ever coding question I asked an LLM and it literally saved me days of trial and error for something where my added value is next to zero.
Similarly I use LLMs a lot for small tasks that are in fact fairly difficult, and that don’t add any value to the solution. Things like converting data structures in an efficient way using Python idioms, or JavaScript 2023 features, that there is no way I can keep up with.
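For what it's worth, the kind of small conversion task I mean looks roughly like the sketch below; the record shape is made up for illustration.

    # Small, idiom-heavy conversion tasks of the kind described above; the
    # record shape is invented for illustration.

    records = [
        {"id": 1, "name": "alice", "tags": ["a", "b"]},
        {"id": 2, "name": "bob", "tags": ["b", "c"]},
    ]

    # Index the records by id with a dict comprehension.
    by_id = {r["id"]: r for r in records}

    # Invert into a tag -> names mapping.
    names_by_tag: dict[str, list[str]] = {}
    for r in records:
        for tag in r["tags"]:
            names_by_tag.setdefault(tag, []).append(r["name"])

    print(by_id[2]["name"])   # bob
    print(names_by_tag["b"])  # ['alice', 'bob']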
The thing that makes an agent special is making some kind of effort to gather the relevant context before generating. The quality of the edits from the "agent" panel in Cursor/Copilot/etc is quite a bit higher than the type-ahead suggestions or the inline prompt stuff.
Bizarrely though, it seems to be limited to grep for the moment, doesn't work with LSP yet.
OP: https://fly.io/blog/youre-all-nuts/#but-its-bad-at-rust
> (from https://lukasatkinson.de/2025/net-negative-cursor/)
looks inside
complaining about Rust code
Plus, it looks like it just hard-coded values. I see this happen a lot with AI code. Even when I try to get it not to, it still tends to do it.
Issues like that are simple and just create debt. Sure, it "works" now, but who writes code not knowing that we're going to change things next week or next month? It's the whole reason we use objects and functions in the first place!
Yeah, only in Rust is the maximum value of an unsigned 16-bit integer 65535.
These aren't Rust-specific syntax foibles. It's not a borrow-checker mistake or anything. These are basic CS fundamentals that it's thoughtlessly fumbling.
TBH the critique is completely valid when cursor advertised shitty code on their homepage.
the rust code in question is the example on the cursor landing page though
1 reply →
Adding this comment to my HN bookmarks! Well said
> Most of what we write is not highly creative, but it is load-bearing, and it's full of choices.
The idea that you can't specify the load bearing pillars of your structure to the AI, or that it couldn't figure them out by specifying the right requirements/constraints, will not age well.
> The idea that you can't specify the load bearing pillars of your structure to the AI
But English is a subjective and fuzzy language, and the AI typically can't intuit the more subtle points of what you need. In my experience a model's output always needs further prompting. If only there were a formal, rigorous language to express business logic in! Some sort of "programming language."
1 reply →
What have you written with AI that has made you or your business money?
1 reply →
One of the biggest anti-LLM arguments for me at the moment is security. In case you don't know, if you open a file containing secrets with Copilot active, or in Cursor, it might be sent to a server and thus get leaked. The companies say that if that file is in a cursorignore file it won't be indexed, but it's still a critical security issue IMO. We all know what happened with the "smart home assistants" like Alexa.
Sure, there might be a way to change your workflow and never ever open a secret file with those editors, but my point is that a software that sends your data without your consent, and without giving you the tools to audit it, is a no go for many companies, including mine.
It's why I use Aider, because it only operates on explicit files that you give it. Works great with OpenAI but if you are really worried, it interfaces perfectly with Ollama for local LLMs. A 12b model on my Mac does well enough for coding that it's serviceable for me.
Which 12b model are you running?
2 replies →
At day job while someone was trying out windsurf, it simply picked up an environment variable that contained sensitive data and used it in code. This is wild.
Sometimes I wonder if all the hype around being left behind and needing to try these things just to see how great they can be is being deliberately put out there to increase their training data.
Too many vibe coders contribute trash code if any. They need more code from so called experts that isn't open source yet.
It's pretty unlikely someone at Cursor cares about accessing your Spring Boot project on GitHub through your personal access token – because they already have all your code.
I don't think that's the threat model here. The concern is regarding potentially sensitive information being sent to a third-party system without being able to audit which information is actually sent or what is done with it.
So, for example, if your local `.env` is inadvertently sent to Cursor and it's persisted on their end (which you can't verify one way or the other), an attacker targeting Cursor's infrastructure could potentially compromise it.
You write your secrets to disk?
Having a gitignored .env file is a pretty common pattern.
1 reply →
The problem with LLMs for code is that they are still way too slow and expensive to be generally practical for non-trivial software projects. I'm not saying that they aren't useful, they are excellent at filling out narrow code units that don't require a lot of context and can be quickly or automatically verified to be correct. You will save a lot of time using them this way.
On the other hand, if you slip up and give it too much to chew on or just roll bad RNG, it will spin itself into a loop attempting many variations of crap, erasing and trying over, but never actually coming closer to a correct solution, eventually repeating obviously incorrect solutions over and over again that should have been precluded based on feedback from the previous failed solutions. If you're using a SOTA model, you can easily rack up $5 or more on a single task if you give it more than 30 minutes of leeway to work it out. Sure, you could use a cheaper model, but all that does is make the fundamental problem worse - i.e. you're spending money but not actually getting any closer to completed work.
Yes, the models are getting smarter and more efficient, but we're still at least a decade away from being able to run useful models at practical speeds locally. Aggressively quantized 70b models simply can't cut it, and even then, you need something like 10k tps to start building LLM tools that can overcome the LLM's lack of reasoning skills through brute force guess and check techniques.
Perhaps some of the AI skeptics are a bit too harsh, but they're certainly not crazy in the context of breathless hype.
What are you talking about? It’s $20/month.
OP is talking about API usage, and yes if you let Claude 4 Opus crunch on a long task it'll eat tokens/money like nothing else.
Those $20/month plans usually come with throttling or other types of service degradation once you max out their allotment.
The reason that I personally don't use LLMs was not addressed by the article:
I haven't found a way to use it that makes me develop faster.
The article talks about "tedious code." If you need to generate a large static value table or something, then OK, an LLM might give you a really fast result and cut through the tedium. Most of us were already writing short scripts to do that. I'm open to the possibility that an LLM can do it faster. But it's such a rare requirement that the productivity gains are truly negligible here even if they can. And in those cases, it's obvious what the repetitive task needs to be. I often find writing the code by hand quicker than coming up with a prompt to get it to write the code that I then need to review for correctness.
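For reference, the "short script" approach to a static value table is usually something like the sketch below; a sine lookup table is just an arbitrary example, not something from the article.

    # A throwaway generator of the kind described above; a sine lookup table is
    # an arbitrary example of a "large static value table".

    import math

    values = [round(math.sin(2 * math.pi * i / 256), 6) for i in range(256)]

    print("SINE_TABLE = [")
    for i in range(0, 256, 8):
        print("    " + ", ".join(str(v) for v in values[i:i + 8]) + ",")
    print("]")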
The article then mentions scaffolding. Things like "bookkeeping" when it comes to creating and setting up a new repo (whatever he means by that). This is why I have, historically, been a big fan of frameworks and generators. Point being, this is already a solved problem and I haven't found a way to further improve the state of this world with LLMs. LLMs might be an alternate tool that work just as well. But they haven't made my existing daily workflow any faster. Setting up new repos is also something that is done so rarely that even if an LLM netted a 100% increase in efficiency, it wouldn't really impact much.
I am an AI "skeptic" but I'm not a naysayer. I do use LLMs regularly. I just don't use them for developing code because I have yet to find a problem that they solve for me. Don't get me wrong, there are problems that they can solve... I just haven't come across any solutions to previously-unsolved problems. Meaning I can swap an existing solution for an LLM-based one... and it is a valid solution... but I don't observe any increase in productivity from doing so. The existing solution was already working fine.
I am genuinely looking forward to the day when this changes. When I identify a single existing problem without an existing solution that LLMs solve for me when developing software. I just have yet to come across one.
>I haven't found a way to use it that makes me develop faster.
I have PyCharm Pro with the AI assistant. I can right-click a function and there are 12 things the AI can just do for you.
"Find problems" is shockingly good: even where the linter has no concerns, it finds problems that don't show up from a linter. Things like "Hey, you need to make sure this function is only running once at a time or it can easily overwhelm the CPU."
"Write documentation" just fills in the documentation. I haven't a clue what it writes, and I won't be doing it myself.
"generate unit tests" is interesting? I have literally never written a unit test before and the code it produces is lengthy. Unit tests are a waste of time; but others disagree and so AI can produce it.
"explain code" look I wrote the code years ago half drunk, half high. No comments. It explains it all in depth and perfectly. Comments sometimes go stale after bug fixes and then make things confusing. Comments arent important when you have this beautifulness.
Lets go back to that cpu DOS attack example. I then can prompt what is claude 3.5 sonnet to give me to solution to fix it. It gives me this beautiful cache option. I could just hit apply and it'd be done. But then i said it's too complicated give me something simpler. It then gives me a solution that is simpler.
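For context, a "simpler" cache-based fix of that kind might be little more than the sketch below, assuming the hot function is pure; the function and its workload are hypothetical stand-ins, not the assistant's actual output.

    # A hedged sketch of a simple cache-based fix, assuming the repeatedly
    # called function is pure (same inputs give the same output, no side effects).

    from functools import lru_cache

    @lru_cache(maxsize=1024)
    def expensive_report(day: str) -> int:
        # Hypothetical stand-in for the CPU-heavy work the assistant flagged.
        return len(day) + sum(i * i for i in range(1_000_000))

    # Repeated calls with the same argument hit the cache instead of redoing
    # the work, so a burst of identical requests can't overwhelm the CPU.
    expensive_report("2025-06-01")
    expensive_report("2025-06-01")
    print(expensive_report.cache_info())  # hits=1, misses=1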
> And in those cases, it's obvious what the repetitive task needs to be. I often find writing the code by hand quicker than coming up with a prompt to get it to write the code that I then need to review for correctness.
It's all about the boring busy-work crap that doesn't need to be done that now gets done.
It's hard to take a developer seriously when they don't document their code and think unit tests are a waste of time.
1 reply →
> "generate unit tests" is interesting? I have literally never written a unit test before and the code it produces is lengthy. Unit tests are a waste of time; but others disagree and so AI can produce it.
This statement worries me for a number of reasons.
First, I work on a very massive codebase, with a large engineering organization. And I have seen a number of AI generated unit tests. I have not seen a single example of an LLM-generated unit test that didn't contain a number of test anti-patterns. To the extent where I would argue that they are creating a self-fulfilling prophecy. You said you think that unit tests are a waste of time. I would argue that they CAN be even worse than that.
The benefit of unit tests is that, at their best, they give you a safety net for refactoring existing code. If you change the implementation of a system under test, and the tests pass, you know you didn't introduce a breaking change.
But a lot of tests couple themselves tightly to the implementation details. Every single LLM-generated unit test I have observed in the wild introduces this anti-pattern. If you have a system under test, and changing the implementation of that system without breaking its behaviour causes a test to fail... that's called the "Fragile Test" problem. Now your unit test is not only failing to encourage you to refactor code... it's actively DISCOURAGING you from refactoring code. In this case, the unit test is providing DISVALUE rather than value.
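To make the Fragile Test problem concrete, here is a minimal sketch; the class and its private helper are hypothetical. The first test pins the internal helper, so a behaviour-preserving refactor breaks it; the second only checks observable behaviour, so it survives the refactor.

    # Hypothetical example of a fragile test versus a behavioural test.
    from unittest.mock import patch

    class PriceCalculator:
        def _lookup_discount(self, customer_type: str) -> float:
            return {"member": 0.25}.get(customer_type, 0.0)

        def total(self, price: float, customer_type: str) -> float:
            return price * (1 - self._lookup_discount(customer_type))

    # Fragile: pinned to an internal helper. Inlining _lookup_discount breaks
    # this test even though the observable behaviour is unchanged.
    def test_total_fragile():
        calc = PriceCalculator()
        with patch.object(PriceCalculator, "_lookup_discount", return_value=0.25) as spy:
            calc.total(100.0, "member")
        spy.assert_called_once_with("member")

    # Behavioural: only checks inputs and outputs, so it survives refactoring.
    def test_total_behavioural():
        assert PriceCalculator().total(100.0, "member") == 75.0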
So the fact that a) you think unit tests are a waste of time and b) you look at AI as a way to save you from a chore... tells me that you have no business ever writing unit tests, with or without AI. Please stop. You are making the world worse by leveraging an LLM to do these things for you.
I have NEVER looked at writing a unit test by hand as a "chore" or as a "waste of time." I often write my tests before even writing my implementation code, because doing so helps me think through both the design and requirements of my code... and gives me a little mini-sandbox context where I can make sure that the brand new code that I am writing is doing what I want it to. It's a problem solving tool. Not something to be done after the fact as a chore.
"Write Documentation" is not writing code. And if you don't read the documentation you're generating, no one else will. So what's the point of having it in the first place if no one reads it? Food for thought.
"Find problems" I see as being akin to a spell check, or the red squigglies when you have a syntax error. I do see the value of having ML tech within the internal IDE implementation. We've been using ML for email SPAM filters for decades. There are interesting problems that can be solved there. But this is an area where I want my IDE to flag something for me. It's not me actively prompting an LLM to generate code for me.
2 replies →
What about the privacy aspect and the other security risks, though? So far all the praise I hear on productivity is from people using cloud-hosted models.
Claude, Gemini, Copilot and and ChatGPT are non-starters for privacy-minded folks.
So far, local experiments with agents have left me underwhelmed. I've tried everything on Ollama that can run on my dedicated Ryzen 8700G with 96GB of DDR5. I'm ready to blow ~10-15k USD on a better rig if I see value in it, but extrapolating current results, I believe it'll be another CPU generation before I can expect a positive productivity return from properly and securely run local models, once you factor in the setup and meta-work.
Almost all of the cloud vendors have policies saying that they will not train on your input if you are a paying customer.
The single biggest productivity boost you can get in LLM world is believing them when they make those promises to you!
> The single biggest productivity boost you can get in LLM world is believing them when they make those promises to you!
I'm having a hard time interpreting what you mean here. It sounds like something straight out of a cult.
5 replies →
> ...have policies saying that they will not train on your input if you are a paying customer.
Those policies are worth the paper they're printed on.
I also note that if you're a USian, you've almost certainly been required to surrender your right to air grievances in court and submit to mandatory binding arbitration for any conflict resolution that one would have used the courts for.
3 replies →
This is probably the biggest danger. Everyone is assuming optimization work reduces cost faster than these companies burn through capital. I'm half inclined to assume optimization work will do it, but it's far from as obvious as they want to portray it.
> So far, local experiements with agents have left me underwhelmed.
Devstral (Mistral Small fine-tuned for agentic coding use) with Cline has been above expectations for me.
MacStudio with 512GB RAM starts at around 10k and quantized DeepSeek R1 671B needs around 400GB RAM, making it usable for your needs. It produced some outstanding code on many tasks I tried (some not so outstanding as well).
Am I right in assuming that running Linux (or anything else than macOS) on the MacStudio is experimental at best?
I'd be looking for something that can run offline and receive system updates from an internal mirror on the airgapped network. Needing to tie an Apple ID to the machine and allow it internet access for OS updates is a hard sell. Am I wrong in thinking that keeping an airgapped macOS installation up to date would require additional infrastructure and some enterprise contract with Apple?
1 reply →
Privacy is not binary, and it would make it easier if you outlined specific scenarios.
Most providers promise not to train on inputs if used via an API (and otherwise have a retention timeline for other reasons).
I'm not sure the privacy concern is greater than using pretty much any cloud provider for anything. Storing your data on AWS: Privacy concern?
> Storing your data on AWS: Privacy concern?
Unencrypted? You bet.
My 5 cents would be that LLMs have replaced all those random (e.g. CSS, regex etc) generators, emmet-like IDE code completion/generator tools, as well as having to google for arbitrary code snippets which you'd just copy and paste in.
In no way can AI be used for anything larger than generating singular functions or anything that would require writing to or modifying multiple files.
Technically you might be able to pull off having AI change multiple files for you in one go, but you'll quickly run into a sort of "Adobe Dreamweaver" type of issue, where your codebase is dominated by generated code which only the AI that generated it is able to properly extend and modify.
I remember when Dreamweaver was a thing: you essentially had to make a choice between sticking with it forever for the project or not using it at all, because it would basically convert your source code into its own proprietary format, it became so horribly messy and unreadable.
Regardless, AI is absolutely incredible and speeds up development by a great deal, (even) if you only use it to generate small snippets at the time.
AI is also an absolute godsend for formatting and converting stuff from anything and to anything - you could e.g. dump your whole database structure to Gemini and ask it to generate an API against it; big task, but since it is basically just a conversion task, it will work very well.
Here are two routine problems I have to solve at the moment. Can any of the current LLM systems do either?
1. Input is an 256x256 pixel elevation map stored as a greyscale .png file, and a minimum and maximum elevation. A pixel value of 0 corresponds to the minimum elevation, and a pixel value of 255 corresponds to the maximum elevation. Read in the .png file and the elevation limits. Then construct a 256x256 floating point array of heights. From that array, construct a triangle mesh with X and Y dimensions 0..255. Perform a mesh reduction operation on the triangle mesh to reduce the number of triangles. Mesh reduction must not generate holes in the mesh. From the reduced mesh, generate a glTF file where the UV parameters run from 0.0 to 1.0 along the X and Y axes.
2. Given four glTF files constructed as above, corresponding to four quadrants of a larger square, construct a single 511x511 mesh which combines all four input meshes to cover a larger area. Because the input meshes are 0..255, not 0..256, there will be gaps where the four quadrants meet. Fill those gaps with reasonable triangles. Perform a mesh reduction as above. From the reduced mesh, generate a glTF file where the UV parameters run from 0.0 to 1.0 along the X and Y axes.
Rust code is preferred; Python code is acceptable.
So, what service should I sign up for?
Tackling 1: No current LLM will be able to do that. Create your Rust project, figure out the dependencies that could be useful on your own, then start by asking which types can be expressed. Then after you have revised the types, figure out (with or without LLM) which functions have to be implemented. Then ask LLMs to implement these functions with tests. With some modifications, you should be able to have a final working solution.
Edit: took a while but here is the code generated by DeepSeek: https://gist.github.com/omarabid/b30195011abf570fde2a084cd11... If you have an example file I could give it a test run.
> Tackling 1: No current LLM will be able to do that.
Github Copilot seems to be able to do it. Mostly because it knows how to call Meshlab. The documentation for Meshlab is hard to search, and having an LLM find the correct function is itself helpful.
Also, it looks like quadric decimation has improved since the last time I looked at it. That algorithm, if pushed too hard, will generate meshes with holes. Now there are options ("preservetopology") which supposedly prevent that. I last looked at that a few years ago, and the only mesh decimator which handled that correctly was the one inside Unreal Engine Editor.
Now I have to test all this stuff, of course.
How have you found LLMs' Rust capabilities compared to Python or Typescript or other more common languages with more publicly available data?
2 replies →
Why don't you paste that into Claude, ChatGPT — report back what you find?
I tried Github Copilot, free mode, on #1. The Python code outsourced all the hard work to numpy and pymeshlab, which is fine. Copilot wrote code to generate triangles. It's a reasonable job. Copilot's knowledge of what to call is better than mine. I have to try it now, but it did OK.
On #2, it set up the problem, but bailed on the gap filling part with
That's not surprising, since I didn't tell it how to solve the problem. Can any of the premium systems do #2?
5 replies →
This is a great example.
Supposedly, you know something about this problem. If you use an LLM to craft a solution, that's probably fine since you'd know how well you did.
I know nothing about this problem. I could use an LLM to delude myself into thinking that I do. Should my boss be very happy about that? They probably shouldn't.
try claude code, in python
Weird to claim, as a selling point, that the LLM does all the boring learning and boilerplate for you, but then also insist we still need to responsibly read all the output, and that if you can't understand it, it's a "skill issue".
Also the emphasis on greenfield projects? Starting is by FAR the easiest part. That's not impressive to me. When do we get to code greenfield for important systems? Reminds me of the equally absurd example of language choice. You think you get to choose? What?
Imagine all the code these agents are going to pump out that can never be reviewed in a reasonable time frame. The noise generated at the whim of bike-shedding vibe coders is going to drown all the senior reviewers soon enough. I'll call that Cowboy Coders on Steroids. Anyone with skills will be buried in reviews, won't have time for anything else, and I predict stricter code gen policies to compensate.
I don't have a huge dog in this fight apart from AI advocates being annoying... but I would say that for greenfield projects the interesting thing is that I can get a skeleton of a working iOS app for something simple in like an hour of some copy/pasting stuff from ChatGPT. Instead of spending a good amount of time trying to get through learning material to do it.
It's nice to build throwaway things _so fast_, especially in the sort of fuzzy stuff like frontend where it's fine for it to be completely wrong. And then I can just use my own sense of how computers work to fix up what I care about, delete a bunch of what I don't care about... It's pretty amazing.
For existing projects I have only witnessed garbage output. I know people have success. I haven't seen it.
I have witnessed PMs taking a bullet pointed list of requirements and then use ChatGPT to generate paragraphs of text for some reason. You had the list!
And (rigorously) reviewing code is easily 10x harder than writing it.
But this won't stop it from happening.
This is just obviously not true. I had a full-time job of reviewing code for roughly 15 years and it was never true, but it's also just intuitively not true that engineers spend 10 hours reviewing their peers code to every 1 they spend writing it.
What you mean to claim here is that verification is 10x harder than authorship. That's true, but unhelpful to skeptics, because LLMs are extremely useful for verification.
13 replies →
I really don't think that's true at all. Anyone here can read and sign off on 1000 lines of code faster than they can write the same 1000 lines of code, pretty much no matter what the code is.
2 replies →
You have misunderstood this "common wisdom".
The idea is that people will spend 10x more time reading your code in all future time stacked together. Not that reading and understanding your code once takes 10x the effort of writing it, which is obviously untrue.
Here is the quote from Clean Code, where this idea seems to originate from:
> Indeed, the ratio of time spent reading versus writing code is well over 10 to 1.
I think the point is that the person orchestrating the agent(s) reviews the code. It doesn't make sense to have 5 Juniors using agents to submit PRs and a senior or two reviewing it all. You just have the senior(s) orchestrating agents and reviewing it themselves. Maybe one or two juniors because we still need to train new devs, but maybe the junior doesn't even use an LLM. Maybe the junior writes code manually so they can learn programming properly before they start being an "AI lead".
Everyone is still responsible for the code they produce. I review my own PRs before I expect others to, and I don't even use AI. I think what the article describes seems interesting though.
Yeah, I remember five years ago (before being sold this latest VC crock of bull) when reviewing shitty thoughtless code was the tedious and soul-sucking part of programming.
This happened with the introduction of smartphones too. Every slashdot post had a haughty and upvoted ‘why would i want such a thing!!!’.
It was obviously huge. You could see it taking off. Yet a lot of people proudly displayed ignorance and backed each other up on it to the point that discussion around the topic was often drowned out by the opposition to change. Now today it takes minutes of playing with ai coding agents to realise that it’s extremely useful and going to be similarly huge.
Resistance to change is not a virtue!
I still think smartphones are a huge negative to humanity. They improve a narrow case: having access to ephemeral knowledge. Nobody writes articles or does deep knowledge work with smartphones.
My position on AI is almost the same. It is overall a net negative for people's cognitive abilities. Moreover, I do think all AI companies need to pay fair licensing costs to all authors and train their models to accurately cite their sources. If they want more data for free, they need to propose copyright changes retroactively invalidating everything older than 50 years, and also do the legwork for limiting software IP to 5 to 10 years.
Agreed. Smartphones are portable, mobile computers that suck at every single aspect of being, and working as, a computer, except for mobility. They are not powerful, they are not general purpose, they are not ergonomic, and they do not respect user freedom or privacy. Use such a mobile device only when you can't avoid it (i.e., when you are on the road -- when mobility is the single most important priority), and at no other time.
"Nobody writes articles or does deep knowledge work with smartphones."
I don't think that's true.
I do most of my reading on a smart phone - including wading through academic papers, or reading full books in the kindle app and jotting down notes in the digital margins.
A sizable number of my short form blog entries are written on my phone, and my long form writing almost always starts out in Apple Notes on my phone before transferring to a laptop.
Predictive text and voice dictation has got good enough now that I suspect there have been entire books written on mobile devices.
Whether you want to consider it "deep knowledge work" or not is up to you, but apparently a lot of Fifty Shades of Grey was written on a BlackBerry! https://www.huffpost.com/archive/ca/entry/fifty-shades-of-gr...
> It is overall a net negative for cognitive abilities of people.
I agree. A bunch of us here might use it to scaffold applications we already understand, use it as a rubber duck to help understand and solve new problems, research more effectively, or otherwise magnify skills and knowledge we already have in a manner that's directed towards improving and growing.
That's cool. That's also not what most people will do with it. A bunch of us are total nerds, but most of the world really isn't like that. They want more entertainment, they want problems solved for them, they want ease. AI could allow a lot of people to use their brains less and lose function far more. For the minority among us who use it to do more and learn more, great. That group is a tiny minority from what I can tell.
Take for example that a huge use case for generative AI is just... More sophisticated meme images. I see so much of that, and I'm really not looking for it. It's such an insane waste of cycles. But it's what the average person wants.
Just smartphones? I'd start at agriculture. Before agriculture, human society had little hierarchy. We were free the way we were meant to be.
> today it takes minutes of playing with ai coding agents to realise that it’s extremely useful
Yet some of us spent hours over the past three years playing with LLMs, and remain completely unimpressed by what we see.
Well don't you realise you need to try Jean 4.31 or Cocaptain 8.3E and then you'll see what the models are capable of!
1 reply →
Smartphones were financially viable from day 1, though. I think LLMs will be used a lot and in a lot of places but the current level of investment they're getting right now feels out of line to me. Kind of like what I expect them to get in 10 years from now, when they're mature.
> Smartphones were financially viable from day 1, though.
Also, they were nothing more than the combination of two things which already existed, and which were already successful and financially viable: cellular phones and PDAs. In fact, I (and plenty of others I presume) already used that combination before smartphones: a PDA and a cellular phone, connected through Bluetooth to share the network connection.
Some days I still don't understand why anyone would want a smartphone. I think being connected all the time has a significant negative impact on mental health (I say, as I type this from a smartphone).
Sometimes it works that way.
Sometimes it’s more like NFTs.
Sometimes. Most people with an opinion of NFTs thought they were a joke. Hardly anyone thinks LLMs are a joke.
1 reply →
Slashdot then and HN now, predicted 100 out of the last 10 recessions.
> This happened with the introduction of smartphones too. Every slashdot post had a haughty and upvoted ‘why would i want such a thing!!!’.
I am in my 40s; have never owned a smartphone and still can't imagine wanting one; am constantly surrounded by others who have them and thus am completely aware of their functionality; AMA.
And today everybody has a smartphone, pretty much. So what difference did it make, the opinion you had, whatever it was? In the end, none at all.
It makes a huge different. Not exactly for the better, I have to say.
Unfortunately, they were right. Smartphones are a menace, especially in terms of traffic violence.
> This happened with the introduction of smartphones too. Every slashdot post had a haughty and upvoted ‘why would i want such a thing!!!’.
This is a funny example because the devastating effects of smartphone addiction on society are now blatantly obvious. There are in fact very good reasons for not 'wanting such a thing'.
Don't get me wrong, LLMs can be incredibly useful and I think they deserve some of the current hype. Claiming that LLMs are useless is indeed silly and can be rightfully ignored. But there are serious concerns about potentional (or actual) negative effects on society, and these should be taken seriously.
To be frank, I do think smartphones have made my life worse. I'd happily forego them if it were not for 2FA and how too many businesses expect I can receive texts.
> I'd happily forego them if it were not for 2FA and how too many businesses expect I can receive texts.
Vertical flip phones from forever ago can handle both of these just fine.
2 replies →
The one main claim the article makes: Senior developers should not ignore the productivity gains from LLMs.
Best use of evidence is deductive: Lots of code is tedious and uninteresting -> LLMs are fast at generating lots of tedious code -> LLMs help productivity.
Weakest part of the argument: The list of rebuttals doesn't have an obvious organization to it. What exactly is the main argument they're arguing against?
It's not stated outright but because the post is bookended by references to 'those smarter than me', I think this is an argument against the shaming of developers using (and loving) LLM tools.
Which I think is fair.
Overall, the post did not add anything to the general discussion. But the popularity of the author (and fly.io posts) may make it a beacon for some.
> An LLM can be instructed to just figure all that shit out. Often, it will drop you precisely at that golden moment where shit almost works, and development means tweaking code and immediately seeing things work better.
Well, except that in order to fix that 1% you'd need to read and understand whatever the LLM did and then look for that 1%. I get the chills just thinking about this, whether the original programmer was human or not. I'd rather just write everything myself to begin with.
Reading code is vastly quicker than writing it from scratch.
This is a lie, unless you are just speaking of the mechanics of cursorily skimming it. Reading code to understand it enough to commit it in my name is harder and slower than writing it myself.
22 replies →
But creating a mental model and then interacting with it is slower than interacting with a mental model you already have.
If you are resource constrained this article will make you sad.
I do not have unlimited funds to plug in some token and burn a bunch of money when writing code.
I am gpu poor. I'm lucky that 8gb vram can run the smallest models. But the output is so poor that I lose out to anyone using a hosted service.
If anything this article shows that building great programs is less democratized than it once was.
ChatGPT is $20 / month?
And Gemini is free. https://aistudio.google.com/ gets you free access to their best models.
OpenAI and Anthropic both have free plans as well.
3 replies →
A couple of thoughts:
On 'just read the code' - all well and good. Of course this implies insisting on team members who can read the code, will read the code and are empowered to read the code.
Otherwise orgs will try to hire pure vibe coders who aren't interested in that, and will only allow time for literally instructing agents to generate code, because it sounds cheaper and execs don't understand the nuance so long as it looks like product is shipping - until it all blows up and the one remaining senior developer on hand is supposed to fix a prod issue buried in millions of lines of vibe-reviewed code ASAP.
On 'but it's cheaper than a junior': cloud hosted LLM systems are currently massively subsidised to an absurd degree. The cost side of things is all smoke and mirrors geared towards accelerated market adoption at all costs. It's not a profitable enterprise at the model development level.
At some point that AI economy is going to expect to make that money back, and future (especially near-future) hardware advancements don't explain where all of that is going to come from.
Let's not conflate LLMs with AI. LLMs are a kind of AI that can be a software engineer's assistant, at best. But the degree of hype is undermining belief in AI among many of the professionals who the hypesters claim are going to be replaced. No, this iteration is not going to replace doctors or engineers. But the degree to which the hype train wants it to is alarming.
I'm curious what walls you've hit? I literally can't find the boundaries of what is possible to achieve.
Search engineering forums online. If you haven’t found the walls, you haven’t been trying.
Have you tried creating your own programming language? How about solving unsolved frontier problems in mathematics? Ever written a book that won a Pulitzer prize? How many languages do you know?
As someone who was born ambitious I find this technology tepid at best.
The only question for me is whether or not the walls are moving in the direction of greater actualized capability.
I can't prove it but I *think* the answer there is yes.
> LLMs can write a large fraction of all the tedious code you’ll ever need to write. And most code on most projects is tedious.
You still need to read this tedious code to verify that it actually does what you want it to do. Given this, I'd much rather prefer to write the tedious code myself than having to make sense of someone else's tedious code.
I've reproduced this type of sentiment over several of my hobby projects and it usually hits hard toward the latter half of the project and its usually related to bugs. In the beginning it's all great and fast with LLMs, but then there's a kind of steep drop in motivation as one realizes one has to dig as deep anyway into the tedious stuff, because there are always bugs.
I'm working with Python and it does not work very well: LLMs usually generate at least an order of magnitude more code than what I would write. That code often uses outdated practices, makes poor design choices, and does not understand hints like writing code in a way that reduces cognitive load, even when we explain what that means, i.e. keep the number of classes and functions small.
It's the complete opposite of OP's main language GO, as he says:
> Go has just enough type safety, an extensive standard library, and a culture that prizes (often repetitive) idiom. LLMs kick ass generating it.
Python is an interpreted, dynamically typed language and the static type checkers are not there yet (most popular third-party libraries have no type hints, for example). It also allows for many different programming styles that the LLMs struggle to choose from.
1. Every extra line of code is much more risky.
2. It's much harder to verify the LLM's code.
On the other hand I think rust will be in a good place in regards to LLMs in the next few years thanks to the robustness of the language and the quality of its diagnostic messages. Those 2 attributes should compound very well.
I use Cursor with Python and it works great. Things you should try:
I'm on the side of AI is a huge productivity booster (by my guess... 10x)
But I don't want to make the claim lightly, so I did an experiment.
I signed up for copilot pro, and have been using their 'edit' feature. This is more than just their auto complete.
I set myself a goal to create a playable web game of classic frogger.
It took 4 hours with copilot "edit" and my full attention.
I didn't write a single line of code, but I did ask it to refactor and gave it a project description.
I suspect this would have taken me 4 days full time to get to this level.
Try it out:
https://dev.zebrar.com/jd/frogger/
So less than 10x already.
Question: how familiar are you with the technologies used? My experience of where AI has been useful so far is on things I don't have a good understanding of; when I do, it's a different ball game, mostly because coding it directly seems faster, since I know exactly the behaviour I am looking for and I am not having to deal with unintended consequences.
I see it as the Alice in Wonderland cat thing: when you don't know where you're going, any road will take you there. So it's been great for exploratory work and prototyping.
Yeah, I'm very familiar with the tech, I've been interested in games dev and web dev for a few decades now. So you could be right, that the models aren't ready to "play on their own" yet.
I tried doing a warcraft 1 clone, but that felt too complex for the model being used (openai 4.1). That model was just the default setting in copilot.
I dug a little deeper this morning, and it turns out I hadn't actually enabled my copilot 'pro' mode, which has granted access to some more current or dev focused models. So I'll take them for a spin to see what they're capable of.
My goal here is to roughly get a sense for when a task is too complex for an "agent" to handle.
I also want to try adding custom tools to suit certain project needs. For example, Unreal Engine has a python editor interface, so I'd like to have an agent drive the editor to build something. I have my doubts.
Once I have a feeling for what level of complexity can be handled, I'll see if I can manage the tools better using this understanding, by breaking large and complex projects into appropriate chunks of work / complexity.
/endBrainDump
* It's not mobile friendly, and you use arrow keys to move around
> a playable web game of classic frogger
You could create a Tetris in even less time. Try to create something more novel and less hot on the heatmap of paths walked in gamedev.
I will. I'm just trying to evaluate the tools on a medium size task. I'm trying copilot's agent mode with the goal of creating a warcraft 1 clone. Or something resembling it.
I think the key premise here is that one can effectively and efficiently audit code that the LLM is producing.
I doubt that. First, human attention and speed is very limited. Second, when I see something, I am already predisposed to assume that it is right (or at the very least, my subsequent inquiries are extremely narrow and anchored around the solution I have seen presented to me.)
Yeah. Reviewing code thoroughly is extremely time consuming. When you review code from a human, you can look at the choices they made fairly quickly: say they used framework X and language feature Y. Most importantly, you assume that they've checked that certain things actually work. This way, code review can be fast, but it still isn't thorough on its own. Most of it is trust and bureaucracy (big companies also do this to prevent malicious employees from smuggling in backdoors, etc.).
Code from LLMs that looks right, clean and even clever poses as competence, but it is prone to hallucinations and business logic errors. In the short term, these changes will pass through because of their appearance while containing more issues than a human would have introduced in the same code. In the medium term, we just lose that signal: the assumptions we can make about the author's state of mind and comprehension. It's already incredibly hard to distinguish solid points from nonsense when the nonsense is laundered by an LLM.
Yeah the problem is the fatigue of repeatedly reviewing code.
You do a few iterations until code runs, review carefully but notice a bug. So you do another iteration and 40% of code changes. Now you need to review again but you need to understand how the changes fit in.
Repeat this a few times and it becomes very tiring.
Ultimately you can't trust them not to do stupid shit. Your tests fail and you tell it to stop that? Sure, we can just catch those exceptions and the tests pass, etc. You get pissed off and tell it to FIX the CODE so the tests pass, and the cycle continues.
It's like working with a potentially gifted moron.
Exactly. And even if LLMs become more accurate and qualitatively better, that just raises the cost of validation. Errors and bad design decisions become harder to spot.
A lot of people who are wary of LLMs aren’t against the technology itself, but rather the narrative surrounding it. You can take advantage of the tool while turning a blind eye to the discourse.
This 16-minute, expletive-filled, edgy-old-man-trying-too-hard-to-be-cool article could easily be dismissed as yet another AI creed that somehow found its way to the top of the HN front page.
I will create an account just to point out that the response to
> But it is an inherently plagiarist technology
Was
> Developers frequently engage in copyright infringement, and so will I, so unless you're a lawyer, shove it up your ass
"I am a bad person so I get to continue being bad" is not the gotcha you think it is, Patrick.
When I made a mid-career change to Cobol programming in 1981, my first manager (a "suit", as programmers referred to them then) pointed to a book on his desk; the title, as best I can recall, was Programming Without Programmers. "You got in too late," he said. I retired from programming in 2010. The hype has a long history. I hope I'm around long enough to see how this plays out.
"Kids today don’t just use agents; they use asynchronous agents. They wake up, free-associate 13 different things for their LLMs to work on, make coffee, fill out a TPS report, drive to the Mars Cheese Castle, and then check their notifications. They’ve got 13 PRs to review. Three get tossed and re-prompted. Five of them get the same feedback a junior dev gets. And five get merged."
I would jump off a bridge before I accepted that as my full-time job.
I've been programming for 20+ years and I've never wanted to move into management. I got into programming because I like programming, not because I like asking others to write code on my behalf and review what they come up with. I've been in a lead role, and I certainly do lots of code review and enjoy helping teammates grow. But the last fucking thing I want to do is delegate all the code writing to someone or something else.
I like writing code. Yes, sometimes writing code is tedious, or frustrating. Sometimes it's yak-shaving. Sometimes it's Googling. Very often, it's debugging. I'm happy to have AI help me with some of that drudgery, but if I ever get to the point that I feel like I spend my entire day in virtual meetings with AI agents, then I'm changing careers.
I get up in the morning to make things, not to watch others make things.
Maybe the kind of software engineering role I love is going to disappear, like stevedores and lamplighters. I will miss it dearly, but at least I guess I got a couple of good decades out of it. If this is what the job turns into, I'll have to find something else to do with my remaining years.
I feel this so much.
I am always amazed how so many software engineers seem to dislike coding, which seems to be a major underlying theme in the AI-coding cheerleading.
Coding never feels tedious to me. Talking to a chatbot, now that’s tedious.
God. Thank you.
> But the last fucking thing I want to do is delegate all the code writing to someone or something else
I talked about this a few days ago with coworkers and I phrased it almost exactly this way. I want to be an IC. I want to do the work. Becoming a manager-of-one to funnel all my work through would deprive me of genuinely my greatest joy in life.
I'm envious you managed a few good decades. I'm almost eight years into what has been my dream profession and to see indications it's going to morph from writing code into telling a robot to write code is just so demoralizing. I could have done this for decades. I wanted to do this for decades.
Keep it up. There'll be some rough turbulence until this transhumanist trainwreck dust settles. The human spirit shall prevail.
That snippet you quoted sounds like something an aristocratic fop who has never worked a day in their life would creative-write while LARPing what they think the poor's ideal workday sounds like, in an AI sales pitch meeting.
It rings about as authentic as "everybody stood up and clapped"
You say that, but I hear AI folks talk about that user experience all the time.
And at some level, it makes sense. There are a lot of extroverts out there for whom the ideal job really is feeling like an orchestra conductor, delegating to and coordinating an army of others. There is a great feeling of empowerment in watching a group build something bigger than you could have made on your own. And, if you're not someone who really likes to get their hands dirty and do the thing yourself, why not aspire towards a style of working that always feels that high level and powerful?
And, obviously, people who are wired that way are exactly the kind of people to move into management and executive roles and end up being in charge of and evangelizing AI to the world. So it's no surprise you hear people talking about this style of working all the time.
I'm 47. I've had the opportunity to move into management many times. I've been in a lead role enough to know the positive sides of that job and I deeply enjoy the soft skills part of working with a team.
I just know who I am and know that on days where I don't feel like I really made something concrete myself, I feel like I didn't live a real day.
Same, thank you.
I have a friend and coworker who is currently struggling with this thing and similar "not making anything" issues from having worked a "maintenance/support" job consisting of little day-to-day make-creation. He took a leave for two months. I really don't think he's coming back.
It is a loss for people he works with and the things he touches. Not to mention a loss for me, because he was the one person I found at my current job with the "make stuff, be useful" attitude. Already people who see that writing on the wall are dropping out.
Sooner or later, me too, if that is in fact the way it goes. The only thing that really keeps me going is that I don't have anything else to turn to, and I do have some cloudy optimism about getting a lot of money and satisfaction in a couple years to help clean up or rework the garbage-deluge.
Your other comment about extroversion makes me think of Gifts Differing (not the best book in the universe, but some good bits especially around intro/extroversion). I just want to hit flow state and smash my head into a problem that costs me sleep and come out on top bloodied but happy and feeling like I've done something. And this is/was a really great career for that.
> This faculty of concentration is likely to characterize the introverts' careers. Whereas extraverts tend to broaden the sphere of their work, to present their products early (and often) to the world, to make themselves known to a wide circle, and to multiply relationships and activities, the introvert takes the opposite approach. Going more deeply into their work, introverts are reluctant to call it finished and publish it, and when they do, they tend to give only their conclusions, without the details of what they did. This impersonal brevity of communication narrows their audience and fame, but saves them from overwhelming external demands and allows them to return to another uninterrupted stretch of work. As Jung is reported to have said, the introverts' activity thereby gains in depth and their labor has lasting value.
Thank you so much for writing this. I started feeling like an oddball after reading most of the previous comments here.
Thank you, that's what I feel too.
For me, switching careers after spending more than 20 years in this (as well) is very hard. I spent all my career outside of high-pay places like SV telling myself "I have all the time in the world, I don't need to grab as much money as possible as soon as possible", so retiring is not an option.
So, switch to what? Any well-paid profession is going to be under pressure to be LLMized as much as possible.
> I got into programming because I like programming, not because I like asking others to write code on my behalf and review what they come up with
oh finally someone else who didn't enter programming because, as a 7-10 year old child, they were into SOLVING PRACTICAL PROBLEMS FOR PEOPLE.
> But the last fucking thing I want to do is delegate all the code writing to someone or something else
Thank God there is at least one other person that understands that the ratio between creative and reactive work is crucial for wellbeing at the job.
For crying out loud.
> but if I ever get to the point that I feel like I spend my entire day in virtual meetings with AI agents, then I'm changing careers
so am I.
> but at least I guess I got a couple of good decades out of it
Thanks for this perspective. Yes, at least we've got our memories, and the code locations and commits we recall from memory, from a distance of 10 or more years.
> If this is what the job turns into, I'll have to find something else to do with my remaining years
Me too.
> as 7-10 year old child, they were into SOLVING PRACTICAL PROBLEMS FOR PEOPLE.
Some of my fondest childhood memories are sitting in my school's resource center in front of a TRS-80, laboriously typing in some mimeographed BASIC code while wondering, "Is this the most efficient way I can increase shareholder value for the corporation?"
I'm with you two. I can work on boring problems in boring domains and enjoy the design and implementation aspect because it's inherently creative. Take away those and now my only job is the guy in Office Space who takes the requirements from the customers to the developers, if I'm lucky enough to have one at that point.
I don't want to have to change careers, as this is one that I've been working towards to some degree since I was a child. Including some intense work in college and some brutal first couple jobs where I worked hard to pick up marketable skills. Obviously the market doesn't care what I want, but I find the author of this piece to be a bit too flippant in his "but they take-rr jerbs" section. Working hard to get a well paying job only to have to start (likely near the bottom) in another career for much less pay is not something to treat lightly.
Tell me when this bus leaves because I’ll get on it with you.
> I get up in the morning to make things, not to watch others make things.
If you ever followed down the rabbit hole of children content on YouTube, the one that infants and toddlers end up after an hour of autoplay with some innocent cartoon at the beginning, you'd find among the piles of morbid psychotic trash a variety of videos of children watching someone else playing some games, a whole class of videos in itself.
I can't explain this with anything else but a coordinated effort to flash the norm of watching someone/something do something, as opposed to being engaged in immediate action yourself, into the firmware of future people. This lines up with many other current trends pretty well. Talk about incrementing Baudrillard's level of simulation a notch up.
There are many aspects to AI push back.
- all creatives are flatly against it because it's destroying their income streams and outright stealing their intellectual property
- some technical leaders are skeptical because early returns were very bad and they have not updated their investigations to the latest tools and models, which are already significantly ahead of even six months ago
- a tech concern is how do we mentor new developers if they don’t know how to code or develop logic. LLMs are great IF you already know what you’re doing
- talent is deeply concerned that they will be reduced and replaced, going from high paying careers to fast food salaries
We have a lot of work to balance productivity with the benefits to society. “Let them eat cake,” is not going to work this time either.
Yes, this is a solid succinct summary.
I’d also extend the last point:
> talent is deeply concerned that they will be reduced and replaced, going from high paying careers to fast food salaries
The delta in that money goes to the investor/tech exec instead. Yet another wealth transfer from the middle class to the 0.01%.
I tend to agree with the gist of this, namely that the change is here and "AI" presents some huge potential to save me from having to do the tedious stuff. But we do need a bit of a sanity check. I'm in the middle of trying to convince any of the AIs that I have access to to write me some simple Lua. Specifically, I'm trying to write a query that returns a certain subset of pages from the Silverbullet V2 note tool. This isn't particularly difficult, but it's become this hilarious journey to try to get the AI to figure this problem out. Silverbullet is a niche use case, v2 even more so, but a reasonably skilled developer could peruse the API documentation and come up with a pretty good starting point. AI tools? Absolutely insane wrong answers. I finally specifically asked one or two of them not to guess with their answers and they just straight up said "nah, we don't know how to do this."
Point being: there's some real power in these tools, but if the ground is not already well-trodden, they risk creating a huge time sink that could be better served just learning to do the thing yourself.
Don't you think the AI might too, if you passed it the API docs?
Not the ones I’ve tried. They make a go of it but their results have been generally inferior, often wildly so. Because they can’t make intuitive connections as easily, I assume they need both the rules and examples of implementations to produce workable output.
Seeing everything these days being about vibe coding, I feel a little old with my VIM setup and my LSP servers, which I already thought were a nice productivity increase.
The problem I have with the MCP-related stuff is that the tech around it is developing so fast that it's hard for outsiders to catch up with what the best working setup is, for example.
What would you do, for example, if you want to selfhost this?
- which models (qwen ai coder?)
- which api (with ollama? Bolt? Aider? Etc)
- how to integrate PRs with a local gitlab/gogs/forgejo instance? Do you need another MCP agent for git that does that?
- which hardware dependencies to run it?
I am currently trying to figure out how to implement a practical workflow for this. So far I'm still using a synchronous MCP agent setup that basically runs on another machine on the network, because my laptop is too underpowered to work with.
But how would I get to the point of async MCP agents that can work on multiple things in my Go codebases in parallel? With the mentioned PR workflows, so that I can modify/edit/rework before the merges?
The author makes a lot of claims and keeps saying that his opponents in the argument are not talking about the same thing. But what exactly is "the same thing", and is it reproducible locally for everyone?
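For the "which API" piece of that self-hosting question, a minimal sketch in Python. It only assumes Ollama's standard HTTP chat endpoint; the host, port, model name and prompt are illustrative, and the PR/MCP orchestration around it is exactly the part that has no canonical answer yet.

    # Minimal sketch: talk to a locally hosted coding model through Ollama's HTTP API,
    # e.g. running on another machine on the LAN. Host, port and model are illustrative.
    import requests

    OLLAMA_URL = "http://192.168.1.50:11434/api/chat"   # 11434 is Ollama's default port

    resp = requests.post(OLLAMA_URL, json={
        "model": "qwen2.5-coder",      # any local coding model you have pulled
        "stream": False,               # return one JSON object instead of a stream
        "messages": [
            {"role": "user", "content": "Write a Go function that parses RFC3339 timestamps."},
        ],
    }, timeout=300)
    resp.raise_for_status()
    print(resp.json()["message"]["content"])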
I haven't even gotten around to integrating LSPs into my vim workflow -- I feel positively ancient.
I can recommend govim, it's using LSP behind the scenes. I'd also recommend tagbar because it uses the newer ctags fork which supports many more languages.
[1] https://github.com/govim/govim
[2] https://github.com/preservim/tagbar
I have https://github.com/prabirshrestha/vim-lsp in my minimal config and pretty much the default configuration for it. I only use LSPs for completion, auto imports, rename, and goto definition (less useful when the project is small) and that has been enough for me.
The self-hosted and vim / nvim / emacs integrated LLM / agentic / MCP situation is being implemented in real time as we speak. There's no good canonical solution right now, especially for self-hosting. If you are willing to make API calls to OpenAI or Microsoft, your options expand dramatically.
I think most people are using off the shelf paid solutions like cursor, windsurf, warp terminal, Claude code.
I'd love to get a set up with a locally hosted full version of deepseek.
I use vim as well, and one editor-agnostic coding assistant that is pretty decent is Aider. It is a little buggy sometimes, but it's been a blast to experiment with. I run it in a separate split with Tmux. Its whole schtick is to use it as a pair programming device. So I'll edit code in one split and use Aider in another, alternating at key points. This has worked pretty well for personal projects. I typically don't do this at $dayjob, though.
Why do you want to selfhost it? I just use https://github.com/Exafunction/windsurf.vim in NeoVim like a super-autocomplete.
Multiple reasons:
- Germany, so I have crappy and unreliable internet (read as: 300kB/s or less pretty often due to overbooked cable internet connections).
- Can't share my code with the US (as a European) because I work in cyber defense.
- I want a reproducible work setup, and something that will survive 2 years and/or the startup exit of the proposed tools.
You should add Claude code to your setup and be amazed. Tmux with vim, Claude code and git work trees is amazing - now the choke point is… me.
I was not aware you could self-host Claude Code.
Can someone comment on the cost of running agentic models? Not for a company but for an individual. I tried "vibe coding" a personal project I was struggling with and left even more frustrated because I kept running into token rate limits with Claude (used inside of Zed if it matters). Did I pick the wrong model, the wrong editor, or do I just need to not be so tight with my money?
I tried Zed's agent support with Copilot. I was able to get it to implement a working feature with tests, but the code quality was poor and it took longer to do it than if I had written it myself.
I am wondering if maybe the average programmer is way slower and worse than I thought.
I haven't used Zed specifically, but were you using your own API key for Claude? If so, you were probably running into the Anthropic API rate limits [1]. You can either a) deposit more funds to move your way up the tier list, or b) access Claude via something like OpenRouter, which will give you much higher limits.
[1] https://docs.anthropic.com/en/api/rate-limits
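For what it's worth, the OpenRouter route is a small change if you already use the openai Python client, since the endpoint is OpenAI-compatible. A minimal sketch (the environment variable name and model ID are illustrative; check OpenRouter's model list for current names):

    # Minimal sketch: point the standard OpenAI client at OpenRouter to get higher
    # effective limits than a low-tier direct Anthropic key.
    import os
    from openai import OpenAI

    client = OpenAI(
        base_url="https://openrouter.ai/api/v1",
        api_key=os.environ["OPENROUTER_API_KEY"],   # illustrative env var name
    )

    resp = client.chat.completions.create(
        model="anthropic/claude-3.5-sonnet",        # illustrative model ID
        messages=[{"role": "user", "content": "Explain this traceback: ..."}],
    )
    print(resp.choices[0].message.content)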
Try Cursor.ai or Windsurf. They have free trials for the good models, and are pretty cheap to use if you decide you like them.
It’s not you. The emperor does not have any clothes
While i think i know what you're getting at, for the sake of discussion, could you elaborate?
Someone running out of tokens is not relevant to the article or its argument.
I had good luck with Zed w/Claude. Did you try max mode?
Machine translation and speech recognition. The state of the art for these is a multi-modal language model. I'm hearing impaired veering on deaf, and I use this technology all day every day. I wanted to watch an old TV series from the 1980s. There are no subtitles available. So I fed the show into a language model (Whisper) and now I have passable subtitles that allow me to watch the show.
Am I the only one who remembers when that was the stuff of science fiction? It was not so long ago an open question if machines would ever be able to transcribe speech in a useful way. How quickly we become numb to the magic.
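For anyone who wants to replicate that subtitle workflow, a minimal sketch using the open-source whisper package (requires ffmpeg; the file names are illustrative, and larger models trade speed for accuracy):

    # Minimal sketch: transcribe an episode's audio with Whisper and write an .srt file.
    import whisper

    def srt_time(t: float) -> str:
        # SRT timestamps look like 00:01:02,345
        h, rem = divmod(t, 3600)
        m, s = divmod(rem, 60)
        return f"{int(h):02d}:{int(m):02d}:{int(s):02d},{int((s % 1) * 1000):03d}"

    model = whisper.load_model("small")           # "medium"/"large" are slower but more accurate
    result = model.transcribe("episode01.mp3")    # returns text plus timestamped segments

    with open("episode01.srt", "w", encoding="utf-8") as f:
        for i, seg in enumerate(result["segments"], start=1):
            f.write(f"{i}\n{srt_time(seg['start'])} --> {srt_time(seg['end'])}\n{seg['text'].strip()}\n\n")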
That's not quite true. State of the art in both speech recognition and translation is still a dedicated model trained for that task alone. Although the gap is getting smaller and smaller, and it also heavily depends on who invests how much training budget.
For example, for automatic speech recognition (ASR), see: https://huggingface.co/spaces/hf-audio/open_asr_leaderboard
The current best ASR model has 600M params (tiny compared to LLMs, and way faster than any LLM: 3386.02 RTFx vs 62.12 RTFx, so much cheaper) and was trained on 120,000h of speech. In comparison, the next best speech LLM (quite close in WER, but slightly worse) has 5.6B params and was trained on 5T tokens and 2.3M speech hours. It has always been like this: with a fraction of the cost, you get a pure ASR model which still beats every speech LLM.
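(To put those RTFx numbers in rough perspective, assuming RTFx here is the usual throughput factor, i.e. seconds of audio processed per second of compute: at ~3386x an hour of audio transcribes in about one second, while at ~62x it takes close to a minute.)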
The same is true for translation models, at least when you have enough training data, so for popular translation pairs.
However, LLMs are obviously more powerful in what they can do beyond just speech recognition or translation.
What translation models are better than LLMs?
The problem with Google-Translate-type models is the interface is completely wrong. Translation is not sentence->translation, it's (sentence,context)->translation (or even (sentence,context)->(translation,commentary)). You absolutely have to be able to input contextual information, instructions about how certain terms are to be translated, etc. This is trivial with an LLM.
It's not the speech recognition model alone that's fantastic. It's coupling it to an LLM for cleanup that makes all the difference.
See https://blog.nawaz.org/posts/2023/Dec/cleaning-up-speech-rec...
(This is not the best example as I gave it free rein to modify the text - I should post a followup that has an example closer to a typical use of speech recognition).
Without that extra cleanup, Whisper is simply not good enough.
> However, LLMs are obviously more powerful in what they can do beyond just speech recognition
Unfortunately, one of those powerful features is "make up new things that fit well but nobody actually said", and... well, there's no way to disable it. :p
That leaderboard omits the current SOTA which is GPT-4o-transcribe (an LLM)
> Am I the only one who remembers when that was the stuff of science fiction?
Would you go to a foreign country and sign a work contract based on the LLM translation?
Would you answer a police procedure based on the speech recognition alone?
That to me was the promise of the science fiction. Going to another planet and doing inter-species negotiations based on machine translation. We're definitely not there IMHO, and I wouldn't be surprised if we don't quite get there in our lifetime.
Otherwise, if we're lowering the bar, speech-to-text has been here for decades, albeit clunky and power-hungry. So improvements have been made, but watching old movies is way too low-stakes a situation IMHO.
This is very dismissive and binary, and that's exactly what this whole article is about. AI skeptics expect either AGI, or perfection across all use cases, otherwise it's useless. STT, translation and TTS have come a really long way in the last 2 years. My mother, who doesn't speak English, finds it very useful when she visits my sister in the US. I find it super useful while travelling in Asia. Definitely much more useful than what we had with Google Translate.
We have the tools to do this, and will have commercial products for everything you listed in the next couple years.
> Machine translation and speech recognition.
Yes, yes and yes!
I tried speech recognition many times over the years (Dragon, etc). Initially they all were "Wow!", but they simply were not good enough to use. 95% accuracy is not good enough.
Now I use Whisper to record my voice, and have it get passed to an LLM for cleanup. The LLM contribution is what finally made this feasible.
It's not perfect. I still have to correct things. But only about a tenth of the time I used to. When I'm transcribing notes for myself, I'm at the point I don't even bother verifying the output. Small errors are OK for my own notes.
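A minimal sketch of that cleanup step; the `complete` argument is a stand-in for whatever text-in/text-out LLM call you already have (local or hosted), not a specific vendor API:

    # Minimal sketch: constrain the LLM to cleanup only, so it fixes punctuation and
    # obvious mis-hearings without rewriting the content.
    def cleanup_transcript(raw: str, complete) -> str:
        prompt = (
            "Below is a raw speech-to-text transcript of personal notes. "
            "Fix punctuation, casing and obvious mis-heard words. "
            "Do not add, remove or rephrase content.\n\n" + raw
        )
        return complete(prompt)

    # Usage: cleaned = cleanup_transcript(whisper_output_text, complete=my_llm_call)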
Have they solved the problem of Whisper making up plausible sounding junk (e.g. such that reading it you would have no idea it was completely hallucinated) when there is any silence or pause in the audio?
I completely agree that technology in the last couple years has genuinely been fulfilling the promise established in my childhood sci-fi.
The other day, alone in a city I'd never been to before, I snapped a photo of a bistro's daily specials hand-written on a blackboard in Chinese, copied the text right out of the photo, translated it into English, learned how to pronounce the menu item I wanted, and ordered some dinner.
Two years ago this story would have been: notice the special board, realize I don't quite understand all the characters well enough to choose or order, and turn wistfully to the menu to hopefully find something familiar instead. Or skip the bistro and grab a pre-packaged sandwich at a convenience store.
> I snapped a photo of a bistro's daily specials hand-written on a blackboard in Chinese, copied the text right out of the photo, translated it into English, learned how to pronounce the menu item I wanted, and ordered some dinner.
> Two years ago
This functionality was available in 2014, on either an iPhone or android. I ordered specials in Taipei way before Covid. Here's the blog post celebrating it:
https://blog.google/products/translate/one-billion-installs/
This is all a post about AI, hype, and skepticism. In my childhood sci-fi, the idea of people working multiple jobs and still not being able to afford rent was written as shocking, or seen as dystopian. All this incredible technology is a double-edged sword, but it doesn't solve the problems of the day, only the problems of business efficiency, which exacerbates the problems of the day.
>The other day, alone in a city I'd never been to before, I snapped a photo of a bistro's daily specials hand-written on a blackboard in Chinese, copied the text right out of the photo, translated it into English, learned how to pronounce the menu item I wanted, and ordered some dinner.
To be fair, dedicated apps like Pleco have supported things like this for 6+ years, but the spread of modern language models has made it more accessible.
Definitely not. I took this same basic idea of feeding videos into Whisper to get SRT subtitles and took it a step further, making automatic Anki flashcards for listening practice in foreign languages [1]. I literally feel like I'm living in the future every time a card from whatever silly Finnish video I found on YouTube pops up in my queue.
These models have made it possible to robustly practice all 4 quadrants of language learning for most common languages using nothing but a computer, not just passive reading. Whisper is directly responsible for 2 of those quadrants, listening and speaking. LLMs are responsible for writing [2]. We absolutely live in the future.
[1]: https://github.com/hiandrewquinn/audio2anki
[2]: https://hiandrewquinn.github.io/til-site/posts/llm-tutored-w...
Hi Andrew, I've been trying to get a similar audio language support app hacked together in a podcast player format (I started with Anytime Player) using some of the same principles in your project (transcript generation, chunking, level & obscurity aware timestamped hints and translations).
I really think support for native content is the ideal way to learn for someone like me, especially with listening.
Thanks for posting and good luck.
Translation seems like the ideal application. It seems as though an LLM would truly have no issues integrating societal concepts, obscure references, pop culture, and more, and be able to compare it across culture to find a most-perfect translation. Even if it has to spit out three versions to perfectly communicate, it’s still leaps and bounds ahead of traditional translators already.
> it’s still leaps and bounds ahead of traditional translators already
Traditional machine translators, perhaps. Human translation is still miles ahead when you actually care about the quality of the output. But for getting a general overview of a foreign-language website, translating a menu in a restaurant, or communicating with a taxi driver? Sure, LLMs would be a great fit!
> It seems as though an LLM would truly have no issues integrating societal concepts, obscure references, pop culture, and more, and be able to compare it across culture to find a most-perfect translation.
Somehow LLMs can't do that for structured code with well defined semantics, but sure, they will be able to extract "obscure references" from speech/text
I started watching Leverage, the TV show, on Amazon, and the subtitles in the early series are clearly AI generated or just bad by default.
I use subtitles because I don't want to micromanage the volume on my TV when adverts are forced on me and they are 100x louder than what I was watching.
Old TV series should have closed captions available (which are apparently different from subtitles), however where to obtain them, aside from VHS copies, might be the difficult part.
And of course, a lot of modern "dvd players" do not properly transmit closed captions as subtitles over HDMI, so that sure isn't helping
A slightly off topic but interesting video about this https://www.youtube.com/watch?v=OSCOQ6vnLwU
Many DVDs of old movies and TV shows may contain the closed captions, but they are not visible through HDMI. You have to connect your DVD player to your TV via the composite video analogue outputs.
This video explains all about it: https://youtu.be/OSCOQ6vnLwU
Last time I used Whisper with a foreign language (Chinese) video, I’m pretty sure it just made some stuff up.
The captions looked like they would be correct in context, but I could not cross-reference them with snippets of manually checked audio, to the best of my ability.
I tried whisper with a movie from the 60's and it was a disaster.
Not sure if it was due to the poor quality of the sound, the fact people used to speak a bit differently 60 years ago or that 3 different languages were used (plot took place in France during WW2).
I feel you. In the late 00s/early 10s, downloading American movies was fairly easy, but getting the subtitles was a challenge. It was even worse with movies from other regions. Even now I know people who record conversations to be replayed through Whisper so they can get 100% of the info from them.
Disclaimer: I'm not praising piracy, but outside of US borders it's a free-for-all.
Using AI to generate subtitles is inventive. Is it smart enough to insert the time codes such that the subtitle is well enough synchronised to the spoken line?
As someone who has started losing the higher frequencies and thus clarity, I have subtitles on all the time just so I don't miss dialogue. The only pain point is when the subtitles (of the same language) are not word-for-word with the spoken line. The discordance between what you are reading and hearing is really distracting.
This is my major peeve with my The West Wing DVDs, where the subtitles are often an abridgement of the spoken line.
> Is it smart enough to insert the time codes such that the subtitle is well enough synchronised to the spoken line?
Yes, Whisper has been able to do this since the first release. At work we use it to live-transcribe-and-translate all-hands meetings and it works very well.
I don't think you are also including having AI lie to us, or "hallucinate", which is an important point even if the article is only about having AI write code for an organization.
What is the relevance of this comment? The post is about LLMs in programming. Not about translation or NLP, two things transformers do quite well and that hardly anyone contests.
It would be interesting to see whether court transcriptions can be handled by these models.
Everything about that is true, but, and that's a big BUT, the code I write with an LLM I can only iterate on with an LLM.
My mind doesn't develop a mental model of that code: I don't know where the relevant parts are, I can't quickly navigate through it, and I have to reach for the LLM for every small change.
Which is why I like Copilot-style editing more than agents as a working model, but agents are just so much more powerful and smarter thanks to everything available to them.
This is in line with "Programming As Theory Building"[0] (Peter Naur, 1985) that puts forward the idea that when one is building a system one builds a "theory" (in a sense a mapping of what, where and why) of that system that is essential for continued work on it. Once I internalized this idea it helped me make sense of why rewrites are not "never a good idea" and simultaneously why they are so fraught with peril, as well as what kind of onboarding actually makes sense for work on a system. It's also why I can never get on board with the passive "review code an LLM spits out" type of development; it just does not build theory as reliably, effectively and efficiently.
It's a good, very digestible paper, in my opinion; well worth the read.
[0] - https://gwern.net/doc/cs/algorithm/1985-naur.pdf
This is exactly right. This is what's happened with every vibe coded codebase I've made. As a result, I only selectively use LLMs now with sections of the codebase. Or I'll just relinquish control to the LLM. I think the latter will happen en masse, which is scary.
You can ask the LLM to write code the way you think about it. I usually write a little spec document as my first prompt, and in there I'll add some rules on how I want code structured, separated, etc.
If you use cursor, cursorrules can be a good way of doing this.
This still doesn't work for me. I have to actively think about the problem, find a solution, and write the code myself. Outsourcing these efforts makes my learning ineffective.
I will keep using LLMs for 1-week greenfield hobby projects that I don't plan to revisit. But no matter how good LLMs get, I will never use them in my day job; otherwise, I risk losing touch with the codebase.
> This was the craftsman’s ‘Golden Age’ and much time and trouble was taken over the design of tools. Craftsmen were being called upon to do more skilful and exacting work and the use of tools and the interest in development had become very widespread.
Above pulled from A Brief History of the Woodworking Plane [0]. A woodworking tool that has evolved over 2,000 years. Now there are electric planers, handheld electric planers and lots of heavy machinery that do the same thing in a very automated way. If a company is mass producing kitchen cabinets, they aren't hand planing edges on boards, a machine is doing all that work.
I feel like with AI we are on the cusp of moving beyond a "Golden age" and into an "industrial age" for coding, where it will become more important to have code that AI understands vs. something that is carefully crafted. Simple business pressure will demand it (whether we like it or not).
^ A comment I made just yesterday on a different thread.
For software developers AI is like the cabinet maker that gets a machine to properly mill and produce cabinet panels, sure you can use a hand plane to do that but you're producing a very different product and likely one that not many people will care about, possibly not even your employer when they see all the other wood shops pumping out cabinetry and taking their market share.
[0] https://www.handplane.com/879/a-brief-history-of-the-woodwor...
The difference is that programming is creating the design for the furniture and the process to assemble it using machines. Compilation and deployment is the factory in this case. So you're trying to automate away the design aspect where we already have the other aspects automated.
I think the hardest part is not spending the next 3 months of my life in a cave finishing all the hobby/side projects I didn't quite get across the line.
It really does feel like I've gone from being 1 senior engineer to a team that has a 0.8 Sr. Eng, 5 Jrs. and one dude that spends all his time on digging through poorly documented open source projects and documenting them for the team.
Sure I can't spend quite as much time working on hard problems as I used to, but no one knows that I haven't talked to a PM in months, no one knows I haven't written a commit summary in months, it's just been my AI doppelgangers. Compared to myself a year ago I think I now PERSONALLY write 150% more HARD code than I did before. So maybe, my first statement about being 0.8 is false.
I think of it like electric bikes: there seems to be some indication that people with electric-assist bikes actually burn more calories/spend more time/go farther than those who have manual bikes https://www.sciencedirect.com/science/article/abs/pii/S22141....
> I haven't written a commit summary in months
I don't know what you're posting, but if it's anything like what I see being done by GitHub copilot, your commit messages are junk. They're equivalent to this and you're wasting everyone's time:
Yeah I tried Copilot's automatic commit messages and they're trash, but the agent-based ones are much better.
Try Cubic, which is a Github add-on. Really good at writing GH commit messages and also surfaces bugs fairly reliably (adds PR comments). Not affiliated, just a user.
this behaviour is literally removable with proper prompting.
this is a straw man argument... of whatever you are arguing
If you've ever run or been part of a team that does thorough, multi-party, pull request reviews you know what I am talking about.
The only part I don't automate is the pull request review (or patch review, pre-commit review, etc. before git.), thats always been the line to hold for protecting codebases with many contributors of varying capability, this is explicitly addressed in the article as well.
You can fight whatever straw man you want. Shadowbox the hypotheticals in your head, etc. I don't get all these recent and brand new accounts just straight up insulting and insinuating all this crap all over HN today.
For me, the electric bike analogy works differently: it enables people to ride, regularly, who would not be able to do that with traditional bikes. That's totally fine. But electric bikes don't threaten to take away our normal bikes.
By the reply count it appears this post was excellent at trolling HN's core audience :-). The author may be right, but his preface "look I'm one of you" was in a very different tone to the rest of his "now shut up and get to work" rant.
I do use the AI tools to some extent, if for no other reason than that they are currently the path of least resistance, and google+friends have lately played themselves out of the game.
He is probably right that we should get acquainted with using agents before dismissing them :-).
The article is chefs_kiss.jpg. the author is core-hn, in spirit and in karma.
imo that tone is exactly what's needed to set right the entitlement that we SWEs have developed over the last decade. The key line in the article gets to the heart of the actual concern developers have imo.
As someone who has spent the better part of today fixing the utter garbage produced by repeated iteration with these supposedly magical coding agents, I'm neither in the camp of the "AI skeptic" (at least as defined by the author), nor am I in the camp of people who thinks these things can "write a large fraction of all the tedious code you’ll ever need to write."
Maybe I'm doing it wrong, but I seem to have settled on the following general algorithm:
* ask the agent to green-field a new major feature.
* watch the agent spin until it is satisfied with its work.
* run the feature. Find that it does not work, or at least has major deficiencies [1]
* cycle through multiple independent iterations with the agent, doing something resembling "code review", fixing deficiencies one at a time [2]
* eventually get to a point where I have to re-write major pieces of the code to extract the agent from some major ditch it has driven into, leading to a failure to make forward progress.
Repeat.
It's not that the things are useless or "a fad" -- they're clearly very useful. But the people who are claiming that programmers are going to be put out of business by bots are either a) talking their book, or b) extrapolating wildly into the unknown future. And while I am open to the argument that (b) might be true, what I am observing in practice is that the rate of improvement is slowing rapidly, and/or the remaining problems are getting much harder to solve.
[1] I will freely grant that at least some of these major deficiencies typically result from my inability / unwillingness to write a detailed enough spec for the robot to follow, or anticipate every possible problem with the spec I did bother to write. T'was ever thus...
[2] This problem is fractal. However, it's at least fun, in that I get to yell at the robot in a way that I never could with a real junior engineer. One Weird Fact about working with today's agents is that if you threaten them, they seem to do better work.
Even on stuff it has no chance of doing on its own, I find it useful to basically git reset repeatedly and start with more and more specific instructions. At the very least it helps me think through my plan better.
Yeah...I've toyed with that, but there's still a productivity maximum where throwing it all away and starting from scratch is a worse idea, probabilistically, than just fixing whatever thing is clearly wrong.
Just to make it concrete, today I spent a few hours going through a bunch of HTML + embedded styles and removing gobs and gobs of random styles the LLMs glommed on that "worked", but was brittle and failed completely as soon as I wanted to do something slightly different than the original spec. The cycle I described above led to a lot of completely unnecessary markup, paired with unnecessary styles to compensate for the crappiness of the original DOM. I was able to refactor to a much saner overall structure, but it took some time and thinking. Was I net ahead? I don't really know.
Given that LLMs almost always write this kind of "assembled from StackOverflow" code, I have precisely 0% confidence that I'd end up in a better place if I just reset the working branch and started from scratch.
It kind of reminds me of human biology -- given billions of years of random evolution you can end up with incredible sophistication, but the end result will be incomprehensible and nearly impossible to alter.
> eventually get to a point where I have to re-write major pieces of the code to extract the agent from some major ditch it has driven into, leading to a failure to make forward progress.
As it stands AI can't even get out of Lt Surge's gym in Pokemon Red. When an AI manages to beat Lance I'll start to think about using it for writing my code :-)
My workflow is similar. While the agent is running, I browse the web or day dream. If I'm lucky, the agent produced correct code (after possibly several cycles). If I'm not, I need to rewrite everything myself. I'm also not in any camp and I genuinely don't know if I'm more or less productive overall. But I think that a disciplined use of a well-integrated agent will make people more productive.
Results can vary significantly, and in my experience, both the choice of tools and models makes a big difference.
It’s a good idea to periodically revisit and re-evaluate AI and tooling. I’ve noticed that many programmers tried AI when, for example, GPT-3.5 was first released, became frustrated, and never gave it another chance—even though newer models like o4-mini are now capable of much more, especially in programming tasks.
AI is advancing rapidly. With the latest models and the right tools, what’s possible today far exceeds what was possible even just a short time ago (3-12 months ago even).
Take a look at Cursor or Windsurf or Roo Code or aider to "feed" AI with code, and take a look at models like Google Gemini 2.5 Pro, Claude Sonnet 4, and OpenAI o4-mini. Also educate yourself about agents and MCP. Soon that will be standard for many/every programmer.
I am using all of the models you're talking about, and I'm using agents, as I mentioned.
There is no magic bullet.
Which model? Are you having it write unit tests first? How large of a change at a time are you asking for? How specific are your prompts?
Absurd take. Speed is not the issue! Optimising for speed of production is what got us into the utter quagmire that is modern software.
Lack of correctness, lack of understanding and ability to reason about behaviour, and poor design that builds up from commercial pressure to move quickly are the problems we need to be solving. We’re accelerating the rate at which we add levels to a building with utterly rotten foundations.
God damn it, I’m growing to loathe this industry.
Why is this line of thinking so common with AI folk? Is it just inconceivable to you that other people have different experiences with a technology that has only become widespread in the past couple years and that by its very nature is non deterministic?
As if going faster is the only goal of a programmer.
A simulation I worked on for 2 months was 400 lines of code in total. Typing it out was never the bottleneck. I need to understand the code so that, when I'm studying it for the next 1 1/2 months, I can figure out whether the problem is a bug in my code or the underlying model is wrong.
Or they work with languages, libraries, systems or problem areas where the LLMs fail to perform anywhere near as well as they do for you and me.
The thing is, the AI tools are so easy to use that they can be picked up in a day or two by an experienced programmer without any productivity loss.
I don't get why people push this LLM FOMO. The tools are evolving so fast anyway.
I simply do not trust accounts made after 2020, let alone last year.
I've not yet been in a position where reading + cleaning up the LLM's bad code was faster and/or produced better code than if I wrote it by hand. I've tried. Every time someone comes up and says "yeah of course you're not using GPT4.7-turbo-plus-pro" I go and give the newfangled thing a spin. Nope, hasn't happened yet.
I admit my line of work may not be exactly generic crud work, but then again if it's not useful for anything just one step above implementing a user login for a website or something, then is it really gonna take over the world and put me out of a job in 6 months?
It is absolutely hilarious to read the responses from people who can't use AI attempting to justify their ability to code better than AI. These are the people who will be replaced. They are fighting so hard against it instead of learning how to use it.
“I wrote 400 lines of code I don't understand and need months to understand it, because AI obviously can't understand it or break it down and help me document it”
“Speed is what caused problems! Because I don't know how to structure code and get AI to structure it the same way, it's obviously going rogue and doing random things I cannot control, so it's wrong and causing a mess!!!”
“I haven't been able to use it properly and don't know how to rein it in to do specific tasks, so it produces a lot of stuff that takes me ages to read! I could have written it faster!!!”
I would love to see what these people are doing 1-2 years from now, whether it eventually clicks or whether they are unemployed complaining AI took their jobs.
I suspect a large proportion of claims made for productivity increases are skewed by the fact that the speed at which code is produced by AI makes you _feel_ productive, but these gains are largely offset by the effort to understand, refactor, review and clean up the code. The high that you get when something "works" tends to stick more in your memory than the time when you had to spend a day cleaning up dead code, refactoring 2k-line modules into a more readable project structure, etc.
I'm not saying that AI can't make you productive, it's just that these claims are really hard to verify. Even the recently posted Cloudflare OAuth worker codebase took ~3 months to release (8 Mar - 20 May), producing a single file with >2k lines. Is that going to be harder to maintain than a codebase with a proper project structure that's easily parseable by a human?
> Even the recently posted Cloudflare OAuth worker codebase took ~3 months to release (8 Mar - 20 May)
This is incorrect. The library was part of the MCP framework we launched on March 25 -- the same month development began:
https://blog.cloudflare.com/remote-model-context-protocol-se...
Indeed the speed with which we were able to turn this around was critical to us, as it allowed us to have our Remote MCP framework ready immediately when the spec was finalized, which led to quite a few companies building MCP servers on Cloudflare: https://blog.cloudflare.com/mcp-demo-day/
I'm not an AI maximalist. I still write lots of code by hand, because there's a lot AI isn't good at. It's good at boilerplate and straightforward code, it's bad at refactoring deep systems. But AI assistance was undeniably a huge win for the OAuth project. There's no way I could have written that library by hand so quickly. (Maybe when I was 25 and had no responsibilities, but these days I have like 1 solid day a week to actually write code...)
First commit: Feb 27th 2025
commit 3b2ae809e9256d292079bb15ea9fe49439a0779c
Author: Kenton Varda <kenton@cloudflare.com>
Date: Thu Feb 27 17:04:12 2025 -0600
Apologies, I didn't mean to misrepresent your work. Big fan of your work by the way, I was a happy user of sandstorm.io back in the day.
OK, sorry to get abstract, but to me what you are talking about is differentiating between understanding and correctness. We as humans, for now, need to understand the code, and that's not easily transmitted from the output of some AI. In fact, that's a hard problem. But I don't think it's impossible for AI to assist humans with that. The AI could walk humans through the code so they quickly understand what's going on. Maybe ultimately the issue here is trust. Do we trust the AI to write code? Maybe for now we spend more time trying to verify it. I think that shows we place a lot of trust in humans to write code. Maybe that changes.
This is cope. I know my own experience, and I know the backgrounds and problem domains of the friends I'm talking to that do this stuff better than I do. The productivity gains are real. They're also intuitive: if you can't look at a work week and spot huge fractions of work that you're doing that isn't fundamentally discerning or creative, but rather muscle-memory rote repetition of best practices you've honed over your career, you're not trying (or you haven't built that muscle memory yet). What's happening is skeptics can't believe that an LLM plus a couple hundred lines of Python agent code can capture and replicate most of the rote work, freeing all that time back up.
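To make the "couple hundred lines of Python agent code" concrete, here is a deliberately stripped-down sketch of the shape such a loop can take. It is illustrative only, not the author's actual setup: `llm` and `apply_patch` are placeholder callables, and it assumes a pytest test suite as the agent's only feedback tool.

    # Minimal sketch of an edit/test/retry agent loop. The model proposes a diff,
    # the harness applies it, runs the tests, and feeds the failures back.
    import subprocess

    def run_tests() -> tuple[bool, str]:
        p = subprocess.run(["pytest", "-q"], capture_output=True, text=True)
        return p.returncode == 0, (p.stdout + p.stderr)[-4000:]   # keep only the tail of the log

    def agent(task: str, llm, apply_patch, max_steps: int = 10) -> bool:
        history = [{"role": "user", "content": f"Task: {task}. Reply with a unified diff."}]
        for _ in range(max_steps):
            patch = llm(history)        # model proposes a change
            apply_patch(patch)          # apply it to the working tree
            ok, log = run_tests()       # the only ground truth the loop trusts
            if ok:
                return True
            history += [{"role": "assistant", "content": patch},
                        {"role": "user", "content": f"Tests failed:\n{log}\nSend a corrected diff."}]
        return False                    # cheap failure: reset the branch and re-prompt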
Another thing I think people are missing is that serious LLM-using coders aren't expecting 100% success on prompts, or anything close to it. One of the skills you (rapidly) develop is the intuition for when to stop a runaway agent.
If an intern spun off hopelessly on a task, it'd be somewhat problematic, because there are finite intern hours and they're expensive. But failed agent prompts are nickel-denominated.
We had a post on the front page last week about someone doing vulnerability research with an LLM. They isolated some target code and wrote a prompt. Then they ran it one hundred times (preemptively!) and sifted the output. That approach finds new kernel vulnerabilities!
Ordinary developers won't do anything like that, but they will get used to the idea of only 2/3 of prompts ending up with something they merge.
Another problem I think a lot of skeptics are running into: stop sitting there staring at the chain of thought logs.
> if you can't look at a work week and spot huge fractions of work that you're doing that isn't fundamentally discerning or creative, but rather muscle-memory rote repetition of best practices you've honed over your career, you're not trying (or you haven't built that muscle memory yet). What's happening is skeptics can't believe that an LLM plus a couple hundred lines of Python agent code can capture and replicate most of the rote work, freeing all that time back up.
No senior-level engineer worth their salt, and in any kind of minimally effective organization, is spending any meaningful amount of their time doing the rote repetition stuff you're describing here. If this is your experience of work then let me say to you very clearly: your experience is pathological and non-representative and you need to seek better employment :)
The thing I don't understand is that you keep bringing up your friends' experience, in all your responses and in the blog post itself. What about your own experience, your success rate, and the productivity gain you observed with AI agents? It feels like you yourself aren't confident in your gains and must bring up second-hand experience from your friends to prop up your arguments.
Regardless of what people say to you about this, most (all?) undergraduates in CS programs are using LLMs. It's extremely pervasive. Even people with no formal training are using AI and vercel and churning out apps over the weekend. Even if people find reasons to dislike AI code writing, culturally, it's the future. I don't see that changing. So either a huge percent of people writing code are doing it all wrong or times are changing.
Just a data point:
I think it has a lot to do with the type of work you are doing. I am a couple of years into a very small startup that has some actual technology built (as opposed to a really simple CRUD app or something).
When I am working on the front-end, where things are pretty simple, AI is a huge speed-up. What it does VERY well is latch on to patterns and then apply those patterns to other things. If it has a couple of examples you can point it to and say "ok, build that but over here", the newest revisions of Claude and Gemini are perfectly capable of building the whole thing end to end. Because it's a fairly repetitive task, I don't have to spend much time untangling it. I can review it, pattern-match against things that don't look right, and then dive into those.
For a real example, I needed a page for a user to manually add a vendor in our platform. A simple prompt asking Claude to add a button to the page sent it into a mode where it added the button, built the backend handler, added the security checks, defined a form, built another handler to handle the submitted data, and added it to the database. It even wrote the ACL correctly. The errors it introduced were largely around using vanilla HTML in place of our standard components and some small issues with how it attempted to write to the DB using our DB library. This saved me a couple of hours of typing.
Additionally, if I need to refactor something, AI is a godsend. Just today an underlying query builder completely changed its API and broke... everything. Once I identified how I wanted to handle the changes and wrote some utilities, I was able to have Claude just find everything everywhere and make those same changes. It did it with like 90% accuracy. Once again that saved me a couple of hours.
Where it fails, usually spectacularly, is when we get to the stuff that is new or really complex. If it doesn't have patterns to latch onto it tries to invent them itself and the code is garbage. Rarely does it work. Attempting to vibe code it with increasingly more pointed prompts will often result in compiling code but almost never will it do the thing I actually wanted.
In these contexts its usefulness is mostly things like "write a SQL query to do X", which occasionally surfaces a technique I hadn't thought about.
So my experience is pretty mixed. I am definitely saving time. Most of it is typing time not thinking time. Which is like 1/3 of my average day. If I had to guess I am somewhere in the neighborhood of 30-40% faster today than I was in 2019. Notably that speed up has allowed me to really stretch this funding round as we are well past the phase where we would have typically hired people in my past companies. Usually someone relatively mid-level to take over those repetitive tasks.
Instead it's just me and a non-technical founder going along super quickly. We will likely be at a seed round before anyone new comes in.
> But all day, every day, a sizable chunk of the front page of HN is allocated to LLMs: incremental model updates, startups doing things with LLMs, LLM tutorials, screeds against LLMs. It’s annoying!
You forgot the screeds against the screeds (like this one)
Let's just imagine we're critiquing cars or planes in about 1910. They're so flawed. No one can say with certainty whether or how they will improve.
Side note of interest, from Wikipedia: "Flying Machines Which Do Not Fly" is an editorial published in the New York Times on October 9, 1903. The article incorrectly predicted it would take one to ten million years for humanity to develop an operating flying machine.
Conversely, imagine someone in 1969 talking about colonising the moon.
I think we have heated debate because most people don't explain their `cost function`. For some, talk about AI turns into binary reasoning: if something is not perfect today, it will never be perfect and will never improve. For others, it's simply that they see something useful today, know it will get better next year, and have no expectation of getting AGI.
Your reply is the equivalent of someone expecting AGI in the next decade. The same goes when people ask whether AI will take software dev jobs. Some just see all the flaws in AI and conclude their job is secure. Others see that they are 2x as productive and that a teammate is potentially no longer needed. If AI can eliminate 50% of IT jobs in ~10-20 years, that's still job replacement. Replacing horses with cars doesn't mean we have no horses today, or that nobody rides them.
If we had funded that as much as we're currently funding AI I think it would have been a plausible goal. Keep in mind we cut more than half of NASA's budget after we first landed.
It's a good idea to think of a converse situation, but this is a bad example. The constraint was not about technology but about budget, perceived benefits and political will.
I'm mostly skeptical about AI capabilities but I also think it will never be a profitable business. Let's not forget AI companies need to recoup a trillion dollars (so far) just to break even [1].
VCs are already doubting whether the billions invested in data centers are going to generate a profit [1][2].
AI companies will need to generate profits at some point. Would people still be optimistic about Claude etc if they had to pay say $500 per month to use it given its current capabilities? Probably not.
So far the only company generating real profits out of AI is Nvidia.
[1] https://www.goldmansachs.com/insights/articles/will-the-1-tr...
[2] https://www.nytimes.com/2025/06/02/business/ai-data-centers-...
Some of the revenues are very real. A few million subscriptions at tens to hundreds of dollars per month add up to non-trivial revenue pretty quickly. I think most software engineers will have such subscriptions fairly soon. We're talking about a market where companies drop $5-20K/month on software engineers and hire whole teams of them. Of course they are going to spend on this. Maybe not $500. But $20-100 is a lot less controversial.
And this is quickly spreading beyond software engineering. Software engineers are just being guinea pigs for agentic AIs eventually popping up in all sectors. Basically, while security and quality issues are being sorted out, it helps having users that are a bit more clued in about what they are doing.
That's why AI investments are so hot right now. Of course there are a lot of AI companies that will fall short. There always are. And companies like Nvidia that will make a lot of money selling GPUs.
But there is some non trivial amount of revenue potential there. Anybody still in denial about that is probably penny pinching.
I refuse to pay for any product where I'm the product. Run it locally, maybe I'll pay for it but never online.
My guess would also be that at the $100 price point only one company can be profitable, but that is just a very wild guess.
From what I've seen, SEs are some of the last to adopt it. My marketing colleague has been drowning in generic AI crap produced by external writers for over a year now.
Right now Google, OpenAI, Anthropic, etc are playing a game of chicken. But even after that it's not clear if the remaining players will be able to generate a profit.
Even if AI companies can recoup the billions or trillions of dollars invested, how long will that take and what will be their margin after that?
Hence the flurry of articles, controversial interviews, aggressive marketing, and the pitch to replace all white-collar jobs.
What about the free open weights models then? And the open source tooling to go with them?
Sure, they are perhaps 6 months behind the closed-source models, and the hardware to run the biggest and best models isn't really consumer-grade yet (how many years could it be before regular people have GPUs with 200+ gigabytes of VRAM? That's merely one order of magnitude away).
But they're already out there. They will only ever get better. And they will never disappear due to the company going out of business or investors raising prices.
I personally only care about the closed-source proprietary models insofar as they let me get a glimpse of what I'll soon have access to freely and privately on my own machine. Even if all of them went out of business today, LLMs would still have a permanent effect on our future and how I'd be working.
How do you plan on running those open-weight models then? You need a bunch of very expensive hardware to get even slightly "good" performance.
Will you be buying/renting the hardware needed to run these models with a decent performance?
I can guarantee +99.999% of users won't be doing that.
> but the plagiarism
This entire section reads like, oddly, the reverse of the "special pleading" argument that I usually see from artists. Instead of "Oh, it's fine for other fields, but for my field it's a horrible plagiarism machine", it's the reverse: "Oh, it's a problem for those other fields, but for my field get over it, you shouldn't care about copyright anyway".
I'm all for eliminating copyright. The day I can ignore the license on every single piece of proprietary software as I see fit, I'll be all for saying that AIs should be able to do the same. What I will continue to complain about is the asymmetry: individual developers don't get to violate individual licenses, but oh, if we have an AI slurp up millions of codebases and ignore their licenses, that's fine.
No. No, it isn't. If you want to ignore copyright, abolish it for everyone. If it still applies to everyone else, it should still apply to AIs. No special exceptions for mass-scale Open Source license violations.
I think where tptacek is right, though, is that if we're going to hold this position without hypocrisy, then we need to respect copyright as long as it exists. He's right that many of us have not done that; it's been very common to violate copyright for mere entertainment. If we want the licenses of our own work to be respected, then we need to extend that respect to others as well, regardless of the size of the copyright holder.
There are things that "modulate" this. Violating copyright is never right, of course, but scale and purpose matter. Taking others' creative output, unlicensed, for large-scale commercial gain is about the worst case.
The whataboutism of that section was odd. The only non-handwavy argument presented is that, due to the scale of LLM training, models' output should be treated like typeface forms under US copyright, i.e. not protected.
It's interesting as typeface plagiarism became rampant beginning in the 70s when more accurate photo reproductions made it trivial. This was problematic for designers wanting to make a livelihood, which is something ITC sought to mitigate by better up-front payments (IIRC from U&lc's coverage) to incentivize quality typeface creation.
There's a distinction though between literal plagiarism and just inspiration from elements. US copyright law doesn't protect either for typeface forms but ironically it does allow copyright for the code used in font files.
I've seen OpenAI's o3-mini (their reasoning model) suggest verbatim code and comments that I found on GitHub predating LLMs by years. It seems the more times the same code and comment appear online, the more likely this is to occur. I'd imagine there will be studies looking into the scope and frequency with which this occurs and how much of it is considered fair use.
> Almost nothing it spits out for me merges without edits. I’m sure there’s a skill to getting a SOTA model to one-shot a feature-plus-merge!
How does this section fit in with the agent section just after? In an agentic model, isn't the merge getting done by either the model or a tool, and the retry-loops on failures would be mostly invisible?
E.g. when using Aider + Gemini Flash 2.5, probably 90% of the changes apply cleanly from my perspective (maybe half actually apply cleanly, the other half after a couple of roundtrips of Aider telling the model that the patch didn't apply). The 10% that only apply partially I usually throw away and redo the prompt, it's really rare that I start merging the code manually.
As a junior, I feel most complexity in software is manufactured. LLMs simplify that mess for me, making it easier to get things done. But I’m constantly hit with imposter syndrome, like I’m less skilled because I rely on AI to handle the tricky stuff. And Gemini is better than me!
A big problem is that you're either hearing breathless over the top insane hype (or doomerism, which is breathless over the top hype taken to a dark place) or skepticism that considers AI/LLMs to be in the same league as NFTs.
Neither of these is accurate, but I guess nuanced thinking or considering anything below surface vibes is out these days.
So far after playing with them I'm using them as:
1. A junior intern that can google really really fast and has memorized a large chunk of the Internet and the library, and can do rough first-pass research and dig for things.
2. Autocomplete 2.0 that can now generate things like boilerplate or fairly pedestrian unit tests.
3. Rubber duck debugging where the rubber duck talks back.
4. A helper to explain code, at least for a first pass. I can highlight a huge piece of code and ask it to summarize and then explain and walk me through it and it does a passable job. It doesn't get everything right but as long as you know that, it's a good way to break things down and get into it.
For those things it's pretty good, and it's definitely a lot of fun to play with.
I expect that it will get better. I don't expect it to replace programmers for anything but the most boring mindless tasks (the ones I hate doing), but I expect it to continue to become more and more useful as super-autocomplete and all the other things I listed.
Fully agree with @tptacek here:
> But AI is also incredibly — a word I use advisedly — important. It’s getting the same kind of attention that smart phones got in 2008, and not as much as the Internet got. That seems about right.
However, I just don't think the AI coding part is that interesting or future-thinking. We're seeing so much more progress in semantic search, tool calling, general purpose uses, robotics, I mean, DeepMind just won a Nobel for goodness' sake.
Don't get me wrong, I use ChatGPT to write all kinds of annoying boilerplate, and it's not too bad at recalling weird quirks I don't remember (yes, even for Rust). But hard problems? Real problems? Zero shot. Novel problems? No way.
> But I’ve been first responder on an incident and fed 4o — not o4-mini, 4o — log transcripts, and watched it in seconds spot LVM metadata corruption issues on a host we’ve been complaining about for months.
I'm going to go ahead and press (X) to doubt on this anecdote. You've had an issue for months and the logs were somehow so arcane, so dense, so unparseable, no one spotted these "metadata corruption issues?" I'm not going to accuse anyone of blatant fabrication, but this is very hard to swallow.
Listen, I also think we're on the precipice of reinventing how we talk to our machines; how we automate tasks; how we find and distribute small nuggets of data. But, imo, coding just ain't it. Donald Knuth calls computer programming an art, and to rob humanity of the making of not just code but of any art, I'd argue, would be the most cardinal of sins.
> I'm going to go ahead and press (X) to doubt on this anecdote. You've had an issue for months and the logs were somehow so arcane, so dense, so unparseable, no one spotted these "metadata corruption issues?" I'm not going to accuse anyone of blatant fabrication, but this is very hard to swallow.
I have this problem with a lot of LLM miracle anecdotes. There’s an implication that the LLM did something that was eluding people for months, but when you read more closely they don’t actually say they were working on the problem for months. Just that they were complaining about it for months.
>There’s an implication that the LLM did something that was eluding people for months, but when you read more closely they don’t actually say they were working on the problem for months. Just that they were complaining about it for months.
On the other hand, we've all probably had the experience of putting out a fire and wanting time to track down an issue only to be told to not bother since "everything is working now". Sometimes you spend months complaining about something because the people you're complaining to don't have the time to dive into an issue. Even if it would have taken mere hours for a human to hunt down the issue, someone still has to be given those hours to work on it. By contrast, copying a bunch of logs into an AI is nearly free.
> You've had an issue for months and the logs were somehow so arcane, so dense, so unparseable, no one spotted these "metadata corruption issues?" I'm not going to accuse anyone of blatant fabrication, but this is very hard to swallow.
Eh, I've worked on projects where, because of un-revisited logging decisions made in the past, 1-10k error logs PER MINUTE were normal to see. Finding the root cause of an issue often boiled down to multiple attempts at cleaning up logs to remove noise, cleaning up tangentially related issues, and waiting for it to happen again. More than one root cause was discovered by the sheer happenstance of looking at the right subset of the logs at the right moment in time. I can absolutely buy that a system built for parsing large amounts of text and teasing patterns out of it found in minutes what humans could not track down over months.
One point is missing for me: you get lazy. People are less and less equipped to think about complex problems the more they RELY on this for coding. Also, this article is just about coding.
Not sure why more people aren't mentioning this. But that's the exact reason I've stopped using any LLM stuff. Your thinking just gets lazier and lazier the more you use it. You stop thinking about problems deeply and thoroughly when the "answer" is just around the corner. Then one day you snap out of that mirage and see what a boatload of junk you've just created with this stuff.
I also dread what's going to happen to the younger generations who will have this in place of a real education, but that's a bit off-topic here.
> My AI skeptic friends are all nuts
My CEO friends that overstate the efficacy of LLMs are all nuts.
(I say this as someone who has been working in "AI" as part of Digital transformation roadmaps for SEVEN years and as someone who has used LLMs daily in my workflow for 3 years).
I think both of those statements are true at the same time.
>but it’s bad at rust
I have to say, my ability to learn Rust was massively accelerated via LLMs. I highly recommend them for learning a new skill. I feel I'm roughly at the point (largely sans LLMs) now where I can be nearly as productive in Rust as in Python. +1 to RustRover as well, which I strongly prefer to any other IDE.
The interesting question is: is it really that bad at Rust, or does Rust's strict compiler just catch more errors that remain hidden in, let's say, Go? The usual hand-waving response is that developers should write more tests instead, as if a boring and tedious task like writing tests won't just be passed to the LLM.
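A toy sketch of what I mean (the hallucinated method name here is invented for illustration, not taken from any real transcript):

    fn main() {
        let users: Vec<String> = vec!["ada".into(), "grace".into()];

        // If a model hallucinates a helper, rustc rejects it before anything runs:
        // error[E0599]: no method named `to_sorted` found for struct `Vec<String>`
        // let sorted = users.to_sorted();

        // The compiler's feedback quickly steers you (or the agent) to the real thing:
        let mut sorted = users.clone();
        sorted.sort();
        println!("{sorted:?}");
    }

In a looser language the same hallucination would only surface at runtime, if that code path is ever exercised at all.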
> I have to say, my ability to learn Rust was massively accelerated via LLMs.
How would you know?
If you didn't know Rust already, how would you know the LLM was teaching you the right things and the best way to do things?
Just because it compiles doesn't mean it works. The world is full of bad, buggy, insecure, poor code that compiles.
I've been writing Rust code in production for 4+ years, and I can write Rust pretty well, and I've learned a lot from using ChatGPT and Copilot/Cursor.
In particular, it helped me write my first generic functions and macros, two things that were pretty intimidating to try and get into.
How does anyone self-learning know they're learning the "right things and the best way to do things"? By putting the stuff they've learned into practice and putting it up against the real world. How many Rust tutorials are out there that teach things incorrectly, non-idiomatically, or just inefficiently? How does anyone not already an expert know, except by trying it out?
Not only this, but I would challenge the OP to turn off the LLM and see "how much you truly know" about Rust.
This is akin to being in tutorial hell and claiming you "know the language".
It is not bad at Rust. I don't think I could even function well as a Rust programmer without ChatGPT and now Cursor. It removes a lot of the burden of remembering how to write generic code and fixing borrow-checker stuff. I can just write a generic function with tons of syntax errors and then tell Cursor to fix it.
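For instance, the kind of small generic helper I mean, once the tool has cleaned it up (a toy example; the function name is invented):

    use std::fmt::Display;

    // Trivial once it compiles, but easy to sketch full of syntax and borrow
    // errors and then let the tool iterate until the trait bounds are satisfied.
    fn join_items<T: Display>(items: &[T], sep: &str) -> String {
        items
            .iter()
            .map(|item| item.to_string())
            .collect::<Vec<_>>()
            .join(sep)
    }

    fn main() {
        println!("{}", join_items(&[1, 2, 3], ", "));
        println!("{}", join_items(&["a", "b", "c"], " -> "));
    }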
Me too -- actually, I'd say that the LLMs I use these days (Sonnet 4 and GPT-4.1, o4, etc.) are pretty good at Rust.
How much of that proficiency remains once you switch it off?
Quite a lot, but hey, feel free to put me to the test
Not to derail, but NFT mania (part of the opening salvo in the article) was the giant shitshow that it was -not- because the concept of unique digital bits in the possession of a single owner was a bad idea (or, the concept of unique verification of membership in a club was a bad idea) -- it was a diarrhea-slicked nightmare because it was implemented via blockchains and their related tokens, which inherently peg fluctuating fiat value to the underlying mechanisms of assigning and verifying said ownership or membership, and encourages a reseller's market .. not to mention the perverse, built-in economic incentives required to get nodes to participate in that network to make the whole thing go.
Had NFTs simply been deployed as some kind of protocol that could be leveraged for utility rather than speculation, I think the story would be a complete 180. No clue personally how to achieve that, but it feels like it could be done.. except that, too, would have been completely perverted and abused by centralized behemoths, leading to a different but terrible outcome. Can you imagine if all data became non-fungible? Convince all the big identity vendors (Google, Apple, etc) to issue key pairs to users that then get used by media companies to deliver audio and video keyed only to you that's embedded with maybe some kind of temporal steganographic signature that's hard to strip and can be traced back to your key? It's not just cracking AACS once and copying the bytes. It becomes this giant mess of you literally can't access anything without going through centralized authorities anymore. Then build more anti-patterns on top of that lol. Prolly better that it was mostly just monkey JPEGs and rug pulls.
Anyways, I'm so far off topic from what's actually being discussed -- just couldn't help myself from veering into left field.
Hmm a bit of a tone shift from: https://fly.io/docs/about/open-source/
> Fly.io builds on the work of a tremendous open source infrastructure community. We want open source authors to benefit from their work because we think a healthy, thriving open source ecosystem will help us build better products.
To the fine article:
> Meanwhile, software developers spot code fragments seemingly lifted from public repositories on Github and lose their shit. What about the licensing? If you’re a lawyer, I defer. But if you’re a software developer playing this card? Cut me a little slack as I ask you to shove this concern up your ass.
Imho, the fly.io people are largely performative HN attention seekers that practice writing posts to produce engagement. It's almost charming in the way it resembles the early-on blogosphere, and you should read them as token summaries of "this is a popular sentiment right now, framed to appear a bit wise and edgy". They're often wordy versions of "that one trick they don't want you to know" or "am I really the only one that [actually quite popular thing]".
In the sense that the posts distill whatever HN at the time would accept as thought leadership it's a good pulse check on the community.
It's rather shocking for a company to admit to flouting software licensing in public communications.
But this is not new https://fly.io/blog/wrong-about-gpu/
> At one point, we hex-edited the [NVIDIA] closed-source drivers to trick them into thinking our hypervisor was QEMU.
There's a huge caveat I don't see mentioned often, which is that it depends on the language you're programming in, e.g. AI is really good at writing Next.js/TypeScript apps, but not so much Ruby on Rails. YMMV.
I agree with this. People who are writing Python, Javascript, or Typescript tell me that they get great results. I've had good results using LLMs to flesh out complex SQL queries, but when I write Elixir code, what I get out of the LLM often doesn't even compile even when given function and type specs in the prompt. As the writer says, maybe I should be using an agent, but I'd rather understand the limits of the lower-level tools before adding other layers that I may not have access to.
My hunch is that to exploit LLMs one should lean on data-driven code more. LLMs seem to have a very easy time generating data literals, and then it's far less of an issue to write in a niche language.
Not familiar with Elixir but I assume it's really good at expressing data driven code, since it's functional and has pattern matching.
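A minimal sketch of the idea (in Rust rather than Elixir, purely for illustration; the rule names are invented): the business rules live in a data literal the model can generate or extend, and the only handwritten logic is a tiny interpreter over that data.

    struct DiscountRule {
        min_quantity: u32,
        percent_off: u32,
    }

    // The part an LLM is good at producing: a plain data literal.
    const DISCOUNTS: &[DiscountRule] = &[
        DiscountRule { min_quantity: 100, percent_off: 15 },
        DiscountRule { min_quantity: 50, percent_off: 10 },
        DiscountRule { min_quantity: 10, percent_off: 5 },
    ];

    // The part you keep small and stable: a tiny interpreter over the data.
    fn discount_for(quantity: u32) -> u32 {
        DISCOUNTS
            .iter()
            .find(|rule| quantity >= rule.min_quantity)
            .map(|rule| rule.percent_off)
            .unwrap_or(0)
    }

    fn main() {
        assert_eq!(discount_for(7), 0);
        assert_eq!(discount_for(60), 10);
        assert_eq!(discount_for(250), 15);
    }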
I think for some languages like Clojure and Elixir, it's just so easy to get to the level of abstraction you need for your business logic that everyone does so. So one codebase doesn't have much in common with another, even when using the same framework/library.
But for Python, JS, etc., it's the same down-to-earth abstractions everyone deals with: the same open-a-file, parse-a-CSV, connect-to-the-database patterns.
I am finding the most destructive aspect of LLM assistance to be the loss of flow state.
Most of the time I can go faster than these tools if I have confidence in myself and allow the momentum to build up over the course of 20-30 minutes. Every time I tab out to an LLM is like a 5 minute penalty over what I could have done unaided on a good day.
Getting the model prepared to help you in a realistic domain often takes a few minutes of arranging code & comments so that it is forced toward something remotely sane. I'll scaffold out entire BS type hierarchies just so I can throw a //TODO: .... line in the middle somewhere. Without this kind of structure, I would be handling unfiltered garbage most of the time.
It's not that these tools are bad, it's that we need to recognize the true cost of engaging with them. ChatGPT is like a jackhammer. It will absolutely get you through that concrete slab. However, it tends to be quite obnoxious & distracting in terms of its operational principles.
> I am finding the most destructive aspect of LLM assistance to be the loss of flow state.
Finally someone else mentions Flow!
I find the opposite: I can stay in flow state much longer than I could before, because tabbing out to an LLM is a tighter loop than going out to Google and searching through sites for smaller tidbits that I then have to assemble. The LLM gives me a full solution that might not be 100% correct, but most of the time it gets me to a working solution quicker.
I like Thomas, but I find his arguments include the same fundamental mistake I see made elsewhere. He acknowledged that the tools need an expert to use properly, and as he illustrated, he refined his expertise over many years. He is of the first and last generation of experienced programmers who learned without LLM assistance. How is someone just coming out of school going to get the encouragement and space to independently develop the experience they need to break out of the "vibe coding" phase? I can almost anticipate an interjection along the lines of "well we used to build everything with our hands and now we have tools etc, it's just different" but this is an order of magnitude different. This is asking a robot to design and assemble a shed for you, and you never even see the saw, nails, and hammer being used, let alone understand enough about how the different materials interact to get much more than a "vibe" for how much weight the roof might support.
I think the main difference between shortcuts like "compilers" and shortcuts like "LLMs" is determinism. I don't need to know assembly because I use a compiler that is very well specified, often mathematically proven to introduce no errors, and errs on the side of caution unless specifically told otherwise.
On the other hand, LLMs are highly nondeterministic. They often produce correct output for simple things, but that's because those things are simple enough that we trust the probability of it being incorrect is implausibly low. But there's no guarantee that they won't get them wrong. For more complicated things, LLMs are terrible and need very well specified guardrails. They will bounce around inside those guardrails until they make something correct, but that's more of a happy accident than a mathematical guarantee.
LLMs aren't a level of abstraction, they are an independent entity. They're the equivalent of a junior coder who has no long term memory and thus needs to write everything down and you just have to hope that they don't forget to write something down and hope that some deterministic automated test will catch them if they do forget.
If you could hire an unpaid intern with long term memory loss, would you?
Determinism is only one part of it: predictability and the ability to model what it’s doing is perhaps more important.
The physics engine in the game Trackmania is deterministic: this means that you can replay the same inputs and get the same output; but it doesn’t mean the output always makes sense: if you drive into a wall in a particular way, you can trigger what’s called an uberbug, where your car gets flung in a somewhat random direction at implausibly high speed. (This sort of thing can lead to fun tool-assisted speedruns that are utterly unviable for humans.)
The abstractions you mention: there's the key. Good abstractions make things predictable. Turn the steering wheel to the left, head left. There are still odd occasions when I will mispredict what some code in a language like Rust, Python or JavaScript will do, but they're rare. By contrast, LLMs are very unpredictable, and you will fundamentally never be able to mentally model what they achieve.
Exactly. Compilers etc. are like well-proven algebraic properties, you can build on them and reason with them and do higher level math with confidence. That's a very different type of "advancement" than what we're seeing with LLMs.
> If you could hire an unpaid intern with long term memory loss, would you?
It's clearly a deficiency. And that's why one of the next generations of AIs will have long term memory and online learning. Although even the current generation of the models shows signs of self-correction that somewhat mitigate the "random walk" you've mentioned.
It's not just one unpaid intern with long term memory loss, it's several of them. And they don't need breaks.
If you could hire an army of unpaid interns with long term memory loss who work 24/7, would you?
What do you think the "mistake" is here?
It seems like you're pointing out a consequence, not a counter argument.
There’s a really common cognitive fallacy of “the consequences of that are something I don’t like, therefore it’s wrong”.
It’s like reductio ad absurdum, but without the logical consequence of the argument being incorrect, just bad.
You see it all the time, especially when it comes to predictions. The whole point of this article is coding agents are powerful and the arguments against this are generally weak and ill-informed. Coding agents having a negative impact on skill growth of new developers isn’t a “fundamental mistake” at all.
Exactly.
What I’ve been saying to my friends for the last couple of months has been, that we’re not going to see coding jobs go away, but we’re going to run into a situation where it’s harder to grow junior engineers into senior engineers because the LLMs will be doing all the work of figuring out why it isn’t working.
This will IMO lead to a "COBOL problem" where there is a shortage of people with truly deep understanding of how it all fits together, who can figure out the line of code to tweak to fix that ops problem that's causing your production outage.
I’m not arguing for or against LLMs, just trying to look down the road to consequences. Agentic coding is going to become a daily part of every developer’s workflow; by next year it will be table stakes - as the article said, if you’re not already doing it, you’re standing still: if you’re a 10x developer now, you’ll be a 0.8x developer next year, and if you’re a 1x developer now, without agentic coding you’ll be a 0.1x developer.
It’s not hype; it’s just recognition of the dramatic increase in productivity that is happening right now.
> How is someone just coming out of school going to get the encouragement and space to independently develop the experience they need to break out of the "vibe coding" phase?
LLMs are so-so coders but incredible teachers. Today's students get the benefit of copying and pasting a piece of code into an LLM and asking, "How does this work?"
There are a lot of young people who will use LLMs to be lazy. There are also a lot who will use them to feed their intellectual curiosity.
Many of the curious ones will be adversely affected.
When you're a college student, the stakes feel so high. You have to pass this class or else you'll have to delay graduation and spend thousands of dollars. You have to get this grade or else you lose your grant or scholarship. You want to absorb knowledge from this project (honestly! you really do) but you really need to spend that time studying for a different class's exam.
"I'm not lazy, I'm just overwhelmed!" says the student, and they're not wrong. But it's very easy for "I'm gonna slog through this project" to become "I'm gonna give it a try, then use AI to check my answer" and then "I'm gonna automate the tedious bits that aren't that valuable anyway" and then "Well I'll ask ChatGPT and then read its answer thoroughly and make sure I understand it" and then "I'll copy/paste the output but I get the general idea of what it's doing."
Is that what students will do, though? Or will they see the cynical pump and dump and take the shortcuts to get the piece of paper and pass the humiliation ritual of the interview process?
I'm hearing this fear more frequently, but I do not understand it. Curriculum will adapt. We are a curious and intelligent species. There will be more laypeople building things that used to require deep expertise. A lot of those things will be garbage. Specialists will remain valuable and in demand. The kids will still learn to write loops, use variables, about OOP and functional programming, how to write "hello world," to add styles, to accept input, etc. And they'll probably ask a model for help when they get stuck, and the teacher won't let them use that during a test. The models will be used in many ways, and for many things, but not all things; it will be normal and fine. Developing will be more productive and more fun, with less toil.
>How is someone just coming out of school going to get the encouragement and space to independently develop the experience they need to break out of the "vibe coding" phase?
Dunno. Money is probably going to be a huge incentive.
I see the same argument everywhere. Like animators getting their start tweening other people's content. AI is great at tweening and likely to replace farms of juniors. But companies will still need seniors to direct animation, so they will either have to pay a lot of money to find them or pay a lot of money to train them.
Well this is actually happening in Japanese Animation and the result is that no young talents are getting trained in the workforce. [1]
But unlike animation, where demand for the art can just disappear, I don't think demand for software engineers will disappear. Same thing with musicians. Young engineers might just be jobless, or in training mode for a much longer period of time, before they can make an actual living.
Good thing is, as far as I know, Kyoto Animation managed to avoid this issue by having in-house training, growing their own talent pools.
[1]: https://blog.sakugabooru.com/2023/03/31/the-long-quest-to-fi...
Expecting commercial entities to engage in long-term thinking, when they could instead skip it and reduce costs in the next financial quarter, is a fool's game.
I think what you've said is largely true, but not without a long period of mess in between.
> get the encouragement and space to independently develop the experience they need to break out of the "vibe coding" phase?
I wonder this too. As someone who is entirely self-taught, escaping "tutorial hell" was the hardest part of the journey when I started, and it took quite a bit of both encouragement and sheer willpower. Not sure I would ever have gotten beyond that if I'd had LLMs.
I worry for Juniors, and either we’ll need to find a way to mentor them past the vibe coding phase, or we hope that AI gets good enough before we all retire.
There will always be people that manage to get into the guts of something.
All AI is going to do is create a new class of programmer, such that the people who know the details will end up being more valuable.
I wonder if that will make this the great generation of human coders. Some of our best writers came from the generation that spanned oral education and the mass production of books; later generations read and wrote rather than memorized and spoke. I think that was Shakespeare's genius. Maybe our best coders will be supercharged with AI, and subsequent ones enfeebled by it.
Shakespeare was also popular because he was published as books became popular. Others copied him.
I suppose the counterargument is, how many experienced programmers today have seen a register or a JMP instruction being used?
Quite a lot of the good programmers I have worked with may never have needed to write assembly, but are also not at all confused or daunted by it. They are curious about their abstractions, and have a strong grasp of what is going on beneath the curtain even if they don't have to lift it all that often.
Most of the people I work with, however, just understand the framework they are writing and display very little understanding or even curiosity as to what is going on beneath the first layer of abstraction. Typically this leaves them high and dry when debugging errors.
Anecdotally I see a lot more people with a shallow expertise believing the AI hype.
The difference is that the abstraction provided by compilers is much more robust. Not perfect: sometimes programmers legitimately need to drop into assembly to do various things. But those instances have been rare for decades and to a first approximation do not exist for the vast majority of enterprise code.
If AI gets to that level we will indeed have a sea change. But I think the current models, at least as far as I've seen, leave open to question whether they'll ever get there or not.
It's pretty common for CS programs to include at least one course with assembly programming. I did a whole class programming controllers in MIPS.
I would assume at least the ones that did a formal CS degree would know JMP exists.
Your compiler does not hallucinate registers or JMP instructions
Agree. We'll get a new breed of programmer — not shitty ones — just different. And I am quite sure, at some point in their career, they'll drop down to some lower level and try to do things manually.... Or step through the code and figure out a clever way to tighten it up....
Or if I'm wrong about the last bit, maybe it never was important.
Counter-counterargument; You don't need to understand metalworking to use a hammer or nails, that's a different trade, though an important trade that someone else does need to understand in order for you to do your job.
If all of mankind lost all understanding of registers overnight, it'd still affect modern programming (eventually)
Anyone that's gotten a CS degree or looked at godbolt output.
Not really a counter-argument.
The abstraction over assembly language is solid; compilers very rarely (if at all) fail to translate high level code into the correct assembly code.
LLMs are nowhere near the level where you can have almost 100% assurance that they do what you want and expect, even with a lot of hand-holding. They are not even a leaky abstraction; they are an "abstraction" with gaping holes.
Registers: All the time for embedded. JMP instruction? No idea what that is!
Probably more than you might think.
As a teen I used to play around with Core Wars, and my high school taught 8086 assembly. I think I got a decent grasp of it, enough to implement quicksort in 8086 while sitting through a very boring class, and test it in the simulator later.
I mean, probably few people ever need to use it for something serious, but that doesn't mean they don't understand it.
Feels like coding with and without autocomplete to me. At some point you are still going to need to understand what you are doing, even if your IDE gives you hints about what all the functions do.
Sure, it's a different level, but it's still more or less the same thing. I don't think you can expect to learn how to code by only ever using LLMs, just like you can't learn how to code by only ever using intellisense.
> I like Thomas, but I find his arguments include the same fundamental mistake I see made elsewhere
Some of the arguments in the article are so bizarre that I can’t believe they’re anything other than engagement bait.
Claiming that IP rights shouldn’t matter because some developers pirate TV shows? Blaming LLM hallucinations on the programming language?
I agree with the general sentiment of the article, but it feels like the author decided to go full ragebait/engagement bait mode with the article instead of trying to have a real discussion. It’s weird to see this language on a company blog.
I think he knows that he’s ignoring the more complex and nuanced debates about LLMs because that’s not what the article is about. It’s written in inflammatory style that sets up straw man talking points and then sort of knocks them down while giving weird excuses for why certain arguments should be ignored.
They are not engagement bait. That argument, in particular, survived multiple rounds of reviews with friends outside my team who do not fully agree with me about this stuff. It's a deeply sincere, and, I would say for myself, earned take on this.
A lot of people are misunderstanding the goal of the post, which is not necessarily to persuade them, but rather to disrupt a static, unproductive equilibrium of uninformed arguments about how this stuff works. The commentary I've read today has to my mind vindicated that premise.
>> Blaming LLM hallucinations on the programming language?
My favorite was suggesting that people select their programming language based on which ones LLMs are best at. People who need an LLM to write code might do that, but no experienced developer would. There are too many other legitimate considerations.
I see no straw men in his arguments: what I see are pretty much daily direct quotes pasted in from HN comments.
I am squarely in the bucket of AI skeptic—an old-school, code-craftsman type of personality, exactly the type of persona this article is framed against, and yet my read is nothing like yours. I believe he's hitting these talking points to be comprehensive, but with nothing approaching the importance and weightiness you are implying. For example:
> Claiming that IP rights shouldn’t matter because some developers pirate TV shows?
I didn't see him claiming that IP rights shouldn't matter, but rather that IP rights don't matter in the face of this type of progress, they never have since the industrial revolution. It's hypocritical (and ultimately ineffectual) for software people to get up on a high horse about that now just to protect their own jobs.
And lest you think he is an amoral capitalist, note the opening statement of the section: "Artificial intelligence is profoundly — and probably unfairly — threatening to visual artists in ways that might be hard to appreciate if you don’t work in the arts.", indicating that he does understand and empathize with the most material of harms that the AI revolution is bringing. Software engineers aren't on that same spectrum, because the vast majority of programming is not artisanal creative work; it's about precise automation of something as cheaply as possible.
Or this one:
> Blaming LLM hallucinations on the programming language?
Was he "blaming"? Or was he just pointing out that LLMs are better at some languages than others? He even says:
> People say “LLMs can’t code” when what they really mean is “LLMs can’t write Rust”. Fair enough!
Which seems very truthy and in no way is blaming LLMs. Your interpretation is taking a some kind of logical / ethical leap that is not present in the text (as far as I can tell).
Not a programmer now, but my real takeaway from working with developers and enterprises at various stages of adoption: AI/LLMs are not magic. The hype still exists, but basically AI/LLMs are tools, and how we use those tools is what makes enterprises different. Unfortunately the hype is amplified by the big boys like MS and Google, adding to the noise. Personally, I'd love to see the quality and, most importantly, some real examples going beyond coding assistance, be it automating QA/QC or even some of the infra.
One other thing that might feed into skepticism is that, for the average person who does not actively choose to use LLMs, the most regular LLM interaction is likely to be with AI Overviews in Google search, which are generally a pretty unfortunate experience. For example my last search with an AI Overview, for interpretation of blood pressure reading, resulted in the output: "Systolic pressure (top number): 123 mmHg, which is within the normal range of less than 120 mmHg".
The fact that LLMs can generate this kind of output is not incompatible with the OP's argument about how useful they can be when combined with agents and code review. But search is the lowest-friction way for anyone to interact with LLMs currently, and right now it's a pretty poor advertisement.
I find the Konwinski Prize to be very interesting in this context. 1 million dollars to whoever's open source LLM solves >90% of a set of novel Github issues.
https://www.kaggle.com/competitions/konwinski-prize/
Currently, the #1 spot sits at a score of 0.09, not 0.9. A far cry from being useful. I know that open source models are not as good as closed source, but still, we're a long way from LLMs being good for code on their own.
And that supports OP's point - these tools aren't AGI, they produce trash that needs evaluation, but they're still useful.
Am I misunderstanding or are the models also limited to those that can be run with less than 96 gigs of VRAM?
The models that are both open source and quantized so that they can fit within that much memory are going to be significantly less capable than full scale frontier closed source models, I wonder how the latter would perform.
> Currently, the #1 spot sits at a score of 0.09, not 0.9. A far cry from being useful.
The best intellisense and code completion tools would solve 0.00. Those were the only tools we were using just a couple of years ago. 0.09 is a tremendous jump and the improvements will accelerate!
Assuming acceleration or even continued improvement is pretty naive.
They’re tab-completion with extra cognitive-load steps.
I mean, if you can solve 9% of Github issues automatically that's a fairly huge load of work you can automate. Then again you'd have to manually identify which 9% of issues.
I tried to take this article seriously, but it seems more like HN ragebait than an actual attempt at discussion. The engagement bait appears to be working, given all of the comments.
It’s like half of the arguments are designed as engagement bait with logical consistency being a distant concern:
> If hallucination matters to you, your programming language has let you down.
This doesn’t even make sense. LLMs hallucinate things beyond simple programming language constructs. I commonly deal with allusions to functions or library methods that would be great if they existed, but the LLM made them up on the spot.
The thing is, the author clearly must know this. Anyone who uses LLMs knows this. So why put such a bizarre claim in the article other than as engagement bait to make readers angry?
There are numerous other bizarre claims throughout the article, like waving away the IP rights argument because some programmers pirate TV shows? It’s all so bizarre.
I guess I shouldn’t be surprised to scroll to the bottom and see that the author is a HN comment section veteran, because this entire article feels like it started as a reasonable discussion point and then got twisted into Hacker News engagement bait for the company blog. And it’s working well, judging by the engagement counts.
> This doesn’t even make sense. LLMs hallucinate things beyond simple programming language constructs. I commonly deal with allusions to functions or library methods that would be great if they existed, but the LLM made it up on the spot.
I think the author's point is your language (and more generally the tooling around it) should make this obvious, and almost all the AI agents these days will minimally run linting tools and clean up lints (which would include methods and library imports that don't exist) if they don't actively attempt to compile and test the code they've written. So you as the end user should (almost) never be seeing these made up functions.
Agree: it is written like clickbait or, worse, like a sponsored piece.
> But “hallucination” is the first thing developers bring up when someone suggests using LLMs, despite it being (more or less) a solved problem.
Really? What is the author smoking to consider it a solved problem? This statement alone invalidates the entire article with its casual irreverence for the truth.
I use Copilot every day, and I know where it shines. Please don't try to sell it to me with false advertising.
The article specifically says it's not talking about copilot, but talking about agents that verify the code compiles before they show it to you.
If it uses a function, then you can be sure that function is real.
Was this not clear? The explanation I'm paraphrasing is right in between the line Aurornis quoted and the line you quoted. Except for the crack at copilot that's up at the top.
> Often, it will drop you precisely at that golden moment where shit almost works, and development means tweaking code and immediately seeing things work better. That dopamine hit is why I code.
Only if you are familiar with the project/code. If not, you're thrown into a foreign codebase and have no idea how to tweak it.
And potentially make incredibly risky mistakes while the AI assures you it’s fine.
This is one of the most obnoxious things I’ve ever read. “I’m right and every vague straw man I can think of is stupid and naive.” Great. Why did you write this?
I was skeptical at first, but after trying to use it several times for work (Cursor/Claude) I concluded that it's worse than I thought. Lot more time wasted for little to no results. Maybe for boilerplate frontend code or TODO app generation it's good or something?
Anyway in a couple of months I'll try again. Until that time, I guess the search result summaries are nice?
> "LLMs can’t write Rust"
This really doesn't accord with my own experience. Using claude-code (esp. with opus 4) and codex (with o3) I've written lots of good Rust code. I've actually found Rust helps the AI-pair-programming experience because the agent gets such good, detailed feedback from the compiler that it can iterate very quickly and effectively.
Can it set up great architecture for a large, complex project from scratch? No, not yet. It can't do that in Ruby or TypeScript either (though it might trick you by quickly getting something that kinda works in those languages). I think that will be a higher bar because of how Rust front-loads a lot of hard work, but I expect continuing improvement.
I agree, the latest models are not bad at Rust. Most issues I have when writing LLM-assisted Rust code are related to dependencies.
There are a lot of v0.x.x crates in Rust with frequent updates and breaking changes. As LLMs "learn" from code in the wild, they are exposed to different way to use the same crate and thus fail to produce working code "from memory".
But as soon as you explain how to use it with the latest syntax, they are able to fix the code.
These systems do work, quite well, especially the latest release of Claude. Palantir is using AI to process your IRS information without your consent, https://archive.is/QAY9E. This should concern everybody. Not only can this information be used to determine specific account details; it can be used to target individuals who don't align with the current administration's agenda, as well as to exclude hires based on political alignment. They could also use this data to exclude and discriminate against those opposed to Peter Thiel's network-state agenda, aka "Freedom Cities", but it seems everybody is complacent with that.
You know what, I can use the word “fuck” too, just like this pro coder does: I’m fucking tired of these fucking nuts who act as if there is no problem at all having AI mess with my code. It harms my relationship with my code! It reduces my agency. It threatens my accountability.
He implies that if I am talking about some other flavor of AI than the one he personally uses, then I am just not serious. But that's what they said six months ago, and two years ago. These fucking boosters have already blown their credibility.
If you want to prove that AI agents writing code for you is safe then perform a large scale test and show your results. We’re not crazy for doubting you. You are crazy for not doubting yourself.
> pull in arbitrary code from the tree, or from other trees online, into their context windows,
I guess this presupposes that it is ok for 3rd parties to slurp up your codebase? And possibly (I guess it ostensibly depends on what plan you are on?) using that source code for further training (and generating that same code for others)?
I imagine in some domains this would not be ok, but in others is not an issue.
> As a mid-late career coder, I’ve come to appreciate mediocrity. You should be so lucky as to have it flowing almost effortlessly from a tap.
If the kind of work this person is doing requires no thought, then of course an LLM will be perfect for it.
> I’ve come to appreciate mediocrity
I don’t care how long I can be in my career, I just can’t do BE like this. It feels out of character for me to be mediocre.
It's so all-or-nothing this debate. If you're drawing a benefit from using AI tools, great. If you aren't, then maybe don't use them, or try some other approach to using them.
Personally I find AI coding tools situationally useful. I certainly wouldn't use them to write all my code, but I also think I'd be a fool not to leverage them at all.
Sometimes I feel like people who really like AI have a very different experience of programming than I do.
They are constantly talking about AI doing all the tedious boilerplate bullshit. Don't get me wrong, some of my code is that too, and it's not fun. However, the pro-AI people talk as if 80% of your day is dealing with that. For me it's simply a rare enough occurrence that the value proposition isn't that big. If that is the killer app of AI, it just doesn't sound that exciting to me.
When I see someone talk about the reams of boilerplate they’re getting the LLM to write for them, I really do wonder what godawful sounding tools and tech-stack they’re being subjected to.
Exactly. Back in the day, people talked about "design patterns". It took a while for (some of) the industry to recognize that "design patterns" are a sign that your libraries and tools aren't good enough, because you're having to write the same patterns repeatedly.
Anything where you’re doing basic CRUD apps. Yes there are generators, but not for everything. For me that’s where LLMs have been the most useful.
Unpopular opinion, boilerplate is good for you. It's a warmup before a marathon. Writing it can be contemplative and zen-like and allows you to consider the shape of the future.
Strongly agreed. And, your wording is excellent.
> Meanwhile, software developers spot code fragments seemingly lifted from public repositories on Github and lose their shit. What about the licensing? If you’re a lawyer, I defer. But if you’re a software developer playing this card? Cut me a little slack as I ask you to shove this concern up your ass. No profession has demonstrated more contempt for intellectual property.
This kind of guilt-by-association play might be the most common fallacy in internet discourse. None of us are allowed to express outrage at the bulk export of GitHub repos with zero regard for their copyleft status because some members of the software engineering community are large-scale pirates? How is that a reasonable argument to make?
The most obvious problem with this is it's a faulty generalization. Many of us aren't building large-scale piracy sites of any sort. Many of us aren't bulk downloading media of any kind. The author has no clue whether the individual humans making the IP argument against AI are engaged in piracy, so this is an extremely weak way to reject that line of argument.
The second huge problem with this argument is that it assumes that support for IP rights is a blanket yes/no question, which it's obviously not. I can believe fervently that SciHub is a public good and Elsevier is evil and at the same time believe that copyleft licenses placed by a collective of developers on their work should be respected and GitHub was evil to steal their code. Indeed, these two ideas will probably occur together more often than not because they're both founded in the idea that IP law should be used to protect individuals from corporations rather than the other way around.
The author has some valid points, but dismissing this entire class of arguments so flippantly is intellectually lazy.
> The author has some valid points, but dismissing this entire class of arguments so flippantly is intellectually lazy.
Agree 100%. And generally programmers have a poor understanding of the law, especially common law as it applies in America (the country whose legal system most software licenses have been written to integrate with, especially copyleft principles).
American Common Law is an institution and continuity of practice dating back centuries. Everything written by jurists within that tradition, while highly technical, is nonetheless targeted at human readers who are expected to apply common sense and good faith in reading. Where programmers declare something in law insufficiently specified or technically a loophole, the answer is largely: this was written for humans to interpret using human reason, not for computers to compile using limited, literal algorithms.
Codes of law are not computer code and do not behave like computer code.
And following the latest AI boom, here is what the bust will look like:
1. Corporations and the state use AI models and tools in a collective attempt to obfuscate, diffuse, and avoid accountability. This responsibility two-step is happening now.
2. When bad things happen (e.g. a self-driving car kills someone, predictive algorithms result in discriminatory policy, vibe coding results in data leaks and/or cyberattacks), there will be litigation that follows the bad things.
3. The judges overseeing the litigation will not accept that AI has somehow magically diffused and obfuscated all liability out of existence. They will look at the parties at hand, look at relevant precedents, pick out accountable humans, and fine them or---if the bad is bad enough---throw them in cages.
4. Other companies will then look at the fines and the caged humans, and will roll back their AI tools in a panic while they re-discover the humans they need to make accountable, and in so doing fill those humans back in on all the details they pawned off on AI tools.
The AI tools will survive, but in a role that is circumscribed by human accountability. This is how common law has worked for centuries. Most of the strange technicalities of our legal system are in fact immune reactions to attempts made by humans across the centuries to avoid accountability or exploit the system. The law may not be fast, but it will grow an immune response to AI tools and life will go on.
I agreed with this comment until the second half which is just one scenario - one that is contingent on many things happening in specific ways.
In other words: this will probably all end in tears.
It's not just "guilt-by-association". It is a much worse reactionary general argument. It can be applied to any kind of moral problem to preserve the status quo.
If this was a legitimate moral argument, we'd never make any social progress.
That whole section seems so out of place. I don't know why he thinks "The median dev thinks Star Wars and Daft Punk are a public commons" either. I don't know why he thinks the entire software engineering profession is about enabling piracy. I suspect Netflix has more software engineers doing the opposite than every piracy service employs combined.
It's not just lazy, it's nonsense. The author is conflating piracy with plagiarism, even though the two are completely different issues.
Plagiarism is taking somebody else's work and claiming that you yourself created it. It is a form of deception, depriving another of credit while selling their accomplishments as your own.
Piracy on the other hand is the violation of a person's monopoly rights on distributing certain works. This may damage said person's livelihood, but the authorship remains clear.
I’m a free software developer and have been for over 25 years. I’ve worked at many of the usual places too and I enjoy and appreciate the different licenses used for software.
I’m also a filmmaker and married to a visual artist.
I don’t touch this stuff at all. It’s all AI slop to me. I don’t want to see it, I don’t want to work with it or use it.
Some people make these kinds of claims for ethical reasons, I get it. But be careful to not confuse one’s ethics with the current state of capability, which changes rapidly. Most people have a tendency to rationalize, and we have to constantly battle it.
Without knowing the commenter above, I’ll say this: don’t assume an individual boycott is necessarily effective. If one is motivated by ethics, I think it is morally required to find effective ways to engage to shape and nudge the future. It is important to know what you’re fighting for (and against). IP protection? Human dignity through work? Agency to affect one’s life? Other aspects? All are important.
> and at the same time believe that copyleft licenses placed by a collective of developers on their work should be respected and GitHub was evil to steal their code.
I think I missed a story? Is GitHub somehow stealing my code if I publish it there under GPL or similar? Or did they steal some specific bit of code in the past?
Copilot was trained on all public code on GitHub and in the early days it could be made to actually vomit code that was identical to its training data. They've added some safeguards to protect against the latter, but a lot of people are still sore at the idea that Copilot trained on the data in the first place.
if your code is on Github it was/is being used as training data
> None of us are allowed to express outrage at the bulk export of GitHub repos with zero regard for their copyleft status because some members of the software engineering community are large-scale pirates?
I don't think that is an accurate representation of the tech community. On the other hand, I do think TFA is making a reasonable statistical representation of the tech community (rather than a "guilt-by-association" play) which could be rephrased as:
The overriding ethos in HN and tech communities has clearly been on the "information wants to be free" side. See: the widespread support of open source and, as your comment itself mentions, copyleft. Copyleft, in particular, is famously based on a subversion of intellectual property (cf "judo throw") to achieve an "information wants to be free" philosophy.
Unsurprisingly, this has also manifested countless times as condoning media piracy. Even today a very common sentiment is, "oh there are too many streaming services, where's my pirate hat yarrrr!"
Conversely, comments opposing media piracy are a vanishingly tiny, often downvoted, minority. As such, statistically speaking, TFA's evaluation of our communities seems to be spot on.
And, now the same communities are in an uproar when their information "wants to be free". The irony is definitely rich.
First, I don't agree that what you just said is at all reflective of what TFA actually wrote. Yours makes it about statistics not individuals. Statistical groups don't have an ass to shove anything up, so TFA pretty clearly was imagining specific people who hold a conflicting belief.
And for that reason, I think your version exposes the flaw even more thoroughly: you can't reasonably merge a data set of stats on people's opinions on AI with a data set of stats on people's opinions on IP in the way that you're proposing.
To throw out random numbers as an example of the flaw: If 55% of people on HN believe that IP protection for media should not exist and 55% believe that GitHub stole code, it's entirely possible that TFA's condemnation only applies to 10% of the total HN population that holds the supposedly conflicting belief even though HN "statistically" believes both things.
And that's before we get into the question of whether there's actually a conflict (there's not) and the question of whether anyone is accurately measuring the sentiment of the median HN user by dropping into various threads populated by what are often totally disjoint sets of users.
This guy may be right about a lot of things said here, but he's smug and it's off-putting. He's preaching to the choir.
I keep talking to people who've had a good bit of success using Gemini or Claude to build quick prototype front ends for some applications. I have some questions in my head about how well the process scales when you want to keep adding features, but according to them it hasn't been hard getting it to make the needed changes.
My issue with it is that it gates software development behind paid services with various levels of context supported. Absolutely not the dream I have of how more software should be open source and everyone should be empowered to make the changes they need.
> Think of anything you wanted to build but didn’t. You tried to home in on some first steps. If you’d been in the limerent phase of a new programming language, you’d have started writing. But you weren’t, so you put it off, for a day, a year, or your whole career.
> I can feel my blood pressure rising thinking of all the bookkeeping and Googling and dependency drama of a new project. An LLM can be instructed to just figure all that shit out. Often, it will drop you precisely at that golden moment where shit almost works, and development means tweaking code and immediately seeing things work better. That dopamine hit is why I code.
This is me. Just yesterday I finished something I had been putting off for 2 years. I maintain 2 mobile phones, one for work, and I needed a way to see SMS messages from the other phone. I wanted to write a simple Android app that reads incoming messages and saves them somewhere for me to use for OTPs and other stuff. For the longest time I wanted to build something like this for myself, but I kept putting it off just because of all the weird issues I had been getting with mobile development. A couple of days ago, since I had some time, I fired up Gemini and asked it to write me a simple Flutter app to save the messages to Supabase, and voila, it was amazing. With some help and some tweaking I have a fully functioning mobile app on Android and a small Tauri application on desktop to read the messages, nicely formatted in a data table, without much coding on my part at all. Truly remarkable.
I was a bit skeptical at first too, but once I let AI dive into my code, run tests, refactor functions, and even catch its own bugs, it felt like I suddenly had a really competent teammate. AI has been genuinely helpful for me , it saves time and lets me focus on the things that actually matter.
> lets me focus on the things that actually matter.
This reads like it's straight out of a press release for these AI companies and their agentic coding solutions.
The author posits that people don't like using LLMs with Rust because LLMs aren't good with Rust, and that people would then migrate towards languages that do well with LLMs. However, if that were true, then Julia would be more popular, since LLMs do very well with it: https://www.stochasticlifestyle.com/chatgpt-performs-better-...
Does the linked study actually check that the LLM solves the task correctly, or just that the code runs and terminates without errors? I'm bad at reading, but the paper feels like it's saying the latter, which doesn't seem that useful.
I mean, just to steelman the argument, the "market" hasn't had time to react to what LLMs are good at, so your rebuttal falls flat. I think the original statement is more a prediction than a statement of current affairs.
Also, the author didn't say that "ease of use with LLMs" is the _only_ factor that matters. Julia could have other things wrong with it that prevent it from being adopted.
> but the plagiarism […] Cut me a little slack as I ask you to shove this concern up your ass. No profession has demonstrated more contempt for intellectual property.
Speeding is quite common too, yet if you get caught — especially overdoing it — you'll have a problem.
Also, in this case, presumably everything produced with AI is fair game too? The argument being made here isn't even "it's not plagiarism", rather "it's plagiarism but I don't care" — why would anyone else respect such an author's copyrights?
> Meanwhile, software developers spot code fragments seemingly lifted from public repositories on Github and lose their shit. What about the licensing? If you’re a lawyer, I defer. But if you’re a software developer playing this card? Cut me a little slack as I ask you to shove this concern up your ass. No profession has demonstrated more contempt for intellectual property.
Loved this style of writing in 2005 from Maddox on the best site in the universe or whatever.
Sorry if I don't want google and openAI stealing my or anyone else's work.
The problem is not the quality of the code; it simply almost never works. I just tried some models still in beta on the code of one of my C projects and, while some pieces were okay and could be used as is, some were complete garbage and wouldn't even compile. Maybe in a few years this will no longer be a problem, but right now they are useless. AI-assisted research and internet search are a different thing and already work pretty well.
I’m happy to see evidence to the contrary. Does anyone have a video of someone interacting with a LLM and it producing half decent code?
I’ve yet to witness this myself but that’s with my interactions. Which are limited to free offerings.
"If hallucination matters to you, your programming language has let you down."
But every time I try this, this is what happens. I've tried the same thing on multiple agents and they all make up some magic API that doesn't exist, even when I've used the right call elsewhere in the code to do the same job, with that function commented as such. Asking them to use it always fails.
This seems a massive flaw to me that is just hand waved away here.
“I work mostly in Go.”
That should have been the start of the essay.
New title: LLMs make Go programming feel good
I have been using agentic AI to help me get started writing an OpenGL-targeted game from scratch (no engine). I have almost no background experience with computer graphics code, but I understand most of the fundamentals pretty well and I have almost 13 years of software experience. It's just that the exact syntax as well as the various techniques used to address common problems are not in my arsenal yet.
My experience has been decent. I don't know that it has truly saved me much time but I can understand how it FEELS like it has. Because it's writing so much code (sometimes), it's hard to vet all of it and it can introduce subtle bugs based on faulty assumptions it made about different things. So, it will dump a lot of code at once, which will get me 90% of the way there, but I could spend an hour or two trying to nudge it to fix it to get it to 100%. And then I will probably still need to go back and reorganize it, or have it go back and reorganize it. And then sometimes it will make small adjustments to existing, committed code that will subtly break other things.
Something that has surprised me (in hindsight, it isn't surprising) is that sometimes when I feel like it misunderstood something or made a faulty assumption, it was actually me that had the misunderstanding or ignorance which is humbling at times and a good learning experience. It is also pretty good at bug hunting and DEFINITELY very good at writing unit tests.
I count myself as pretty lucky that this domain seems to be very well covered in training. Given the law of averages, most people's domains will probably be covered. I'm not sure how it would fare with a niche domain.
> which will get me 90% of the way there
This is roughly my experience as well. The AI is great at the first 90% of the work and actively counterproductive for the remaining 90%
And wait until there is 500 million of generated loc no one read and the product needs to evolve every week.
This article feels incredibly defensive. If you have really have a technique that makes you 100x, 50x, or even just 2x more productive, you don’t need to write an article calling people who don’t agree with you nuts.
You keep using that tool, to your advantage. If you’re really altruistic, you post some videos of how productive you can be, like DHH did with his blog in 15-minute videos.
If you’re really that much more productive, the skeptics won’t be able to keep up, and it should only take 6 months or so for that to become self-evident.
I felt this article was a lot of strawman-ing.
Yes, there are people who think LLMs are just a fad, just like NFTs, and I agree these people are not really serious and that they are wrong. I think anyone who has used an AI coding agent recently knows that they are highly capable and can enhance productivity in the right hands.
But, as someone who gets a lot of value in AI coding agents, my issue is not with gen AI as a productivity enhancing tool - it's with the absolute torrent of BS about how AI is soon going to make coders obsolete, and the way AI has been shoved onto many engineering teams is like yet another incarnation of the latest management fad. My specific arguments:
1. As the author pretty much acknowledges, AI agents still basically suck at large, system-wide "thinking" and changes. And the way they work with their general "guess and check" method means they can churn out code that is kinda sorta right, but often leaves huge holes or outright laughable bugs.
2. Hallucinations are the worst possible failure modes - they look correct, which makes it all the more difficult to determine they're actually bullshit. I shudder to think about who will need to maintain the mountains of "vibe code" that is now being generated. Certainly not fucking me; I had a good career but I think now is definitely the right time to peace out.
3. Even if I could totally agree that there is a strong business case for AI, I can still, as an individual, think it makes my job generally shittier, and there is nothing wrong with having that opinion.
I don't think I'd be so anti-AI if I saw a rational, cautious debate about how it can enhance productivity. But all I see are folks with a vested interest overselling its capabilities and minimizing its downsides, and it just feels really tiresome.
I use LLMs as a super Stack Overflow. So far it has been helpful. For code gen, in 8 out of 10 cases the code won't compile and needs fixing. LLMs did help a lot in learning new topics. Gemini helped tremendously recently by pointing out a new approach to a problem I was stumped on. Though the code it generated was garbage, it did provide insight into a new approach, which I adopted to make it work eventually.
I can’t even get Copilot to autocomplete 5 working lines consistently. I spend hours every day arguing with ChatGPT about things it’s hallucinating. And agents? It took me a year to convince anyone to buy me a Copilot subscription. It’s not good enough now? But it was the bee's knees just a year or two ago? See, I hate it when the JS-framework churn tempo hits the part of the software world I’m in.
His arguments are mostly right I would say. I tend to use a bit of LLMs for coding - just Copilot though - and I do like how it quite often saves me time with autocomplete spanning entire lines.
ChatGPT is also pretty useful - especially for naming things.
However, on a global scale, I would compare the shift towards 'vibe coding' and massive LLM use to the shift we got when the people doing the coding no longer even knew what Assembly was. The current state of software is that it's incredibly bloated and slow (and I do love Python). It doesn't matter that much, since we've got massive improvements in hardware, but still. In 5 years, instead of not really realizing that we're writing slow and bloated code, we will not really realize we're writing incorrect code.
I don't know anything about web development, but in the environments I've seen during my career - where most code is quite critical and a stupid mistake, one that an LLM could easily hallucinate, can cost a lot - I don't see it replacing humans anytime soon. It's a pretty good and useful tool though.
I like LLMs; it's legitimately impressive to see people vibe code solutions (sometimes even to novel ideas). Agents are a much better use of LLMs than autocomplete (which is terrible).
However I've tried coding agents, and well they suck... for me. Nearly all of the code I write these days is performance sensitive, as well as design sensitive (making abstractions for others to use), and not close to what could be considered boilerplate. The boilerplate I do write usually takes a trivial amount of time to write by hand. I'll keep trying whatever is new in AI every few months, I certainly don't want to be left behind, but so far that strategy hasn't left me optimistic.
> but the code is shitty, like that of a junior developer
It's worse than a junior. Juniors learn and grow, if they're a good junior it's often fairly rapid and self directed. Juniors can also ask for guidance or help, this is something maybe agents will be capable of in the future, but I could see it going pretty poorly for reasons tied to the first point about not learning. While LLMs contain a ton of information, more than a junior could, the junior is a better reasoning machine; I don't consider LLM "reasoning" to be anything close to the meatbag version.
> but the craft
Craft matters, it's important to enjoy and take pride in your work. If you enjoy managing AI agents that's fine, lots of programmers end up on the manager career path, but it's not for everyone.
I simply do not get this argument about LLMs writing tedious code or scaffolding. You don't need or want LLMs for that, you want libraries and frameworks.
I barely write any scaffolding code, because I use tools that set up the scaffolding for me.
If you're lucky to work in such an environment, more power to you. A lot of people have to deal with React where you need so much glue for basic tasks, and React isn't even the worst offender. Some boilerplate you can't wrap.
I use React at work, there is barely any boilerplate. I actually started a brand new project based on React recently and the initial setup before working on actual components was minutes.
> People coding with LLMs today use agents. Agents get to poke around your codebase on their own. They author files directly. They run tools.
I'll be damned if I give up control of my machine to a tool that hallucinates actions to take using hastily put together and likely AI-generated "agents". I still want to be the primary user of my machine, and if that means not using cutting edge tools invented in the last 6 months, so be it. I don't trust the vast majority of tools in this space anyway.
> I’m sure there are still environments where hallucination matters.
Still? The output being correct matters in _most_ environments, except maybe art and entertainment. It especially matters in programming, where a 99% correct program probably won't compile.
> But “hallucination” is the first thing developers bring up when someone suggests using LLMs, despite it being (more or less) a solved problem.
No, it's not. It's _the_ problem that's yet to be solved. And yet every AI company prefers chasing benchmarks, agents, or whatever the trend du jour is.
> I work mostly in Go. [...] LLMs kick ass generating it.
I also work mostly in Go. LLMs do an awful job generating it, just as with any other language. I've had the same shitty experience generating Go, as I've had generating JavaScript or HTML. I've heard this excuse that the language matters, and IME it's just not the case.
Sure, if you're working with an obscure and niche language for which there is less training data, I suppose that could be the case. But you're telling me that there is no good training data for Rust, the trendiest systems language of the past ~decade? C'mon. Comparing Rust to Brainfuck is comical.
I won't bother responding to all points in this article. I will say this: just as AI doomsayers and detractors deserve criticism, so does this over-the-top praising. Yes, LLMs are a great technology. But it is also part of a wildly overhyped market that will inevitably crash as we approach the trough of disillusionment. Their real value is somewhere in the middle.
Use of LLMs to program really doesn't make much sense. The very nature of computer software is that it's a small effort from a small number of people that is multiplied out to massive productivity gains for a large number of people. Unless the LLM is producing better output than the people would, you are saving time for a small number of people in exchange for degrading the experience of a much larger number of people. If your AI-coded app goes down for a day and you have to vibe-debug it, that is probably gonna put you in the red from a time perspective.
With the amount of supervision you need to give this stuff, I can't see it being useful for much more than auto-completing the next few lines of code as you type. You've got to read and go over the whole output carefully anyway, so you might as well have finer grained control over it.
What an arrogant jerk.
It sounds like someone taking a condescending tone to say that "Fast fashion is great! The clothes are really fine, and being able to wear a t-shirt once and then throwing it away instead of washing it is more convenient. This is most definitely progress. Sustainability? Hmm I can't say, that's not my field of expertise, therefore it's probably not important".
I just don’t like using AI tools. It’s absolutely boring. If you like that, fine.
Me? I like programming.
I have a very simple counter argument: I've tried it and it's not useful. Maybe it is useful for you. Maybe even the things you're using it for are not trivial or better served by a different tool. That's fine, I don't mind you using a tool far away from my codebase and dependency tree. It has not been useful for me, and it's very unlikely it's ever going to be.
Except that's not the argument people are making. They are arguing it will replace humans. They are arguing it will do research level mathematics. They are arguing this is the start of AGI. So if you want to put your head in the sand and ignore the greater message that is plastered everywhere then perhaps some self reflection is warranted.
You have to learn to filter out the people who say "it's going to replace human experts" and listen to the people who say "I'm a human expert and this stuff is useful to me in these ways".
> I have a very simple counter argument: I've tried it and it's not useful. Maybe it is useful for you.
Indeed but the tedious naysaying that this is arguing against is that AI isn't good full stop. They aren't saying "I tried it and it's not for me but I can see why other people would like it".
I've only just realized how bad AI is for me, in a manner I didn't at all expect. I don't use it. I don't want to use it. I hate it. As another commenter here nicely put it, I want to do, not tell. I like writing code. I like solving problems. I like doing the work myself. I like struggling and then, the next day, finding that my capabilities have grown.
And yet I'm overwhelmingly demoralized and depressed about AI's encroachment on software engineering, so much so that I am finding it difficult to marshal cognitive resources. I'm finding it harder and harder to engage with coding problems, even though I love programming.
The last thing I expected was that AI, even kept at a distance, would rob me of the joy I find in one of the things I love most.
It so devalues what I do and my basic capabilities as a human being that I kind of want to die. I'll stick around for my daughter, my partner, and any future children, but I'm not optimistic at all about the future.
It's enough that I consider peacing out. Not out of life, mind, but out to the woods on a permanent basis. I want zero proximity to the cynicism that runs rampant through the industry because it's so much more difficult to discern useful hype from slop than I could have ever imagined. The bastards are winning.
Stay strong. Someday we'll all meet at the opposite riverside, as in Fahrenheit 451.
I think another thing that comes out of not knowing the codebase is that you're mostly relegated to being a glorified tester.
Right now (for me) it's very frequent, depending on the type of project, and in the future it could be less frequent - but at some point you've gotta test what you're rolling out. I guess you can use another AI to do that, but I don't know...
Anyway, my current workflow is:
1. write detailed specs/prompt,
2. let agent loose,
3. pull down and test... usually something goes wrong.
3.1 converse with and ask agent to fix,
3.2 let agent loose again,
3.3 test again... if something goes wrong again:
3.3.1 ...
Sometimes the Agent gets lost in the fixes but now have a better idea what can go wrong and you can start over with a better initial prompt.
I haven't had a lot of success with pre-discussing (planning, PRDing) implementations - as in, it worked, but not much better than directly prompting what I want, and it takes a lot longer. But I'm not usually doing "normal" stuff, as this is purely fun/exploratory side-project stuff and my asks are usually complicated but not complex, if that makes sense.
I guess development is always a lot of testing, but this feels different. I click around but don't gain a lot of insight. It feels more shallow. I can write a new prompt and explain what's different but I haven't furthered my understanding much.
Also, not knowing the codebase, you might need a couple attempts at phrasing your ask just the right way. I probably had to ask my agent 5+ times, trying to explain in different ways how to translate phone IMU yaw/pitch/roll into translations of the screen projection. Sometimes it's surprisingly hard to explain what you want to happen when you don't know how it's implemented.
I think in a lot of cases the skeptics haven't really witnessed how good LLMs can be for a subset of software engineering tasks. I find for mid-complexity code they are far, far better than I am at coding. Even though I often tweak their solutions, they outweigh that by often identifying APIs and even whole libraries to call where I fully would have set about implementing it myself - either out of ignorance or an assumption that it would be too hard to implement it myself. They also put in much more error checking than I would normally do. Even if I do implement something myself, it's insane not to at least see what the LLM would do to get ideas. The thing is, once I do that, I nearly always come away with the conclusion that I might as well just use the code it made.
The primary skill a developer hones is the ability to know when something is already solved for you. This is where innate laziness is a gift. Why should I write a fetch function if api.fetch exists? This is a natural game time automatic decision, like a baseball player running for first base soon as the ball hits the bat. The natural thing that should be happening to every developer is the natural realization that the LLM is the ultimate api. I have literally sat here going “well why build anything, this damn thing does this and that for free”. For example, why build a todo app? Any of the LLMs can mostly do this. This is not self awareness, but game conditioning and fitness. You should automatically be feeling the sensation just like someone playing a sport for the millionth time.
Developers thinking AI is a fad is like a left fielder thinking about if they should stick their hand out to catch the ball or not. You can’t think about this.
Anyway, I find myself in stasis, trying to actually see what to build given the reality of the power of this API. The South Park episode “The Simpsons Did It” is apt here. We are very much at the point where the continuous question will be “Well, the LLM can do it”.
I read this article completely thru, twice, and I still have no idea what the author is trying to say. All I can read here is a bunch of sarcastic dismissals of stuff that I don't believe and have no way to evaluate.
> Does an intern cost $20/month? Because that’s what Cursor.ai costs.
So then where do the junior developers come from? And then where do the senior developers come from?
As we all know by that time AI will make all senior developers immortal, rendering junior developers obsolete.
> but it’s bad at rust
I gave ChatGPT (the free desktop web version) my Ruby solution to one of my favorite Advent of Code challenges and asked it to rewrite it in Rust. I've no experience in Rust except reading the first few chapters of the book, but with the help of the compiler I managed to fix it, and it gave the right responses 100x faster than Ruby did. So ChatGPT did a great job keeping the logic intact.
I would have done the same thing if LLMs didn't exist: write it in Ruby first and then in Rust. The flexibility and expressiveness of Ruby allows me to actually find a working solution. Experienced Rustaceans would do it directly in Rust, but writing it in Ruby is easier, and an LLM can transform it.
What worries me about this is that the benefit of well practiced developers using AI to boost their productivity is there, but what happens to the new developers that piggyback off of LLM tooling too much and end up with worse understanding as a whole?
I think this has always been a problem; eg the students who do all their assignments in groups and coast through with a cursory understanding of their degree. Or those who cheat.
If someone doesn't want to learn, there have always been ways to avoid it. Now it's easier with LLMs, that's true. But for people who do want to learn, I think it's also easier now with LLMs.
I'm not sure where the equilibrium will land, in terms of whether we'll end up with more people with a deep understanding, or more people with a shallow understanding. Or maybe the ratio will end up being the same, but there will just be more people in both groups! Ie the ease of learning will pull some of the previously shallow learners into developing a deep understanding, and the ease of coasting will pull up people who were never into development before into being shallow learners.
Where are the results? I keep hearing about how great these coding agents are. What have you built beyond the complexity of a disposable toy app? Where are the swarms of new apps built by these agents that are solving new problems and taking over markets? I’m just not seeing results. I guess I need to spend some time using cursor as an agent to see it for myself.
I am a believer that these tools will be (if they aren’t already) extremely useful. Game changing in many ways, and I worry about the future of knowledge fields. Definitely not a skeptic in the trajectory. I do wonder if that makes those of us who are on the cutting edge more valuable as AI can quickly reproduce things that are well understood.
Please stop this. It’s clear to me that you’re not a software engineer. You weren’t one before AI, and you are not one after. You can do whatever you want, but please don’t push this AI narrative down everyones throat. I’ve had enough of this.
Does no-one think ahead, for more than a moment?
> Professional software developers are in the business of solving practical problems for people with code.
Are we? Is that our job? You might think it is, and it might even feel like it. But it is not. Our job, in 99% of cases, is to make stock price go up. That means making software that people will buy, making people want to keep buying it, and doing whatever makes your investors happy.
So what happens as you use these tools? Your job changes from coder to "LLM manager", but let's assume you're happy with that. You're more "productive", which is an artificial metric and nothing to do with your real job as described above. That's fine for now, but what happens down the road? Now your company is paying more and more money to these LLM services, and paying you less and less.
> "but they take-rr jerbs" So does open source.
This is a gross misunderstanding of open source. Most open source developers are paid, full time, by companies. Why? Warm fuzzy feeling? Goodness of their heart? No. It's because it's good for business. These open source techs are a baseline; they're not worth competing on. Build the open base in the direction you want, and build the "business value" on top for profit.
> We used to pay good money for databases.
We still do. Have you looked at managed PostgreSQL prices lately? Also if you haven't noticed, Oracle hasn't gone anywhere.
> LLMs really might displace many software developers. That’s not a high horse we get to ride. Our jobs are just as much in tech’s line of fire as everybody else’s have been for the last 3 decades.
This isn't the same thing as replacing punch cards, or moving away from Java 6, or replacing artisans with factory machines. This is centralisation and rent-seeking. As companies become more and more dependent on a handful of companies with the upfront capital to build massive server farms... what do you think is going to happen? Once they've convinced enough people that LLMs are essential to their work, it's going to cost a lot more than $20/month. Once these companies are entrenched they will wring every last dollar out of every company that's grown to depend on them, and many will simply fold.
Maybe we are in a seismic shift in our industry. But the beneficiaries are not going to be whole-hearted LLM adopters, no amount of saying "you need to change how you work" is going to save any of us. The only people that will benefit are going to be the people that own the data centres.
> If you were trying and failing to use an LLM for code 6 months ago †, you’re not doing what most serious LLM-assisted coders are doing.
Translation: Drop your work and current tools every few months to pick up the shiny new thing or you aren't cool or even sane.
Reality: Your favorite cool new tool will be integrated into development work as it is proven, documented, and time allows. This idea that development must leap onto the latest shiny new thing every few months is extremely corrosive to reasonable business practice. The needs of the many outweigh the preferences of the lead developer.
Maybe it's only me, but I just don't write that much code. I try to change less than 100ish lines per day. I try to keep codebases small. I don't want to run a codebase with hundreds of thousands of lines of code in a production environment.
This post feels quite dishonest. Every concern is countered with some variation of ”you’re wrong and I am right”, zero nuance.
I use Claude for specific questions and it works great (though apparently I’m doing it wrong, according to the author). Code generation doesn’t: I don’t want my boilerplate code to have subtle differences in every file, that’s the worst kind of boilerplate, and that’s what I get if I automate it.
There are real-world implications of all the infrastructure being built for this too, not to mention the extremely short e-waste cycles during the boom we’re in right now. To me it also seems like LLMs are the wrong thing to solve for; it’s like a gasoline car: make the engine larger and more fuel-consuming to go faster, and no matter how much you perfect it, it will still be fossil fuel being burnt at a high rate. What’s the alternative? I don’t think we know yet, but if we were a smarter species we would see that this is not it.
I found this comment especially interesting: "The code in an agent that actually “does stuff” with code is not, itself, AI. This should reassure you. It’s surprisingly simple systems code, wired to ground truth about programming in the same way a Makefile is. You could write an effective coding agent in a weekend. Its strengths would have more to do with how you think about and structure builds and linting and test harnesses than with how advanced o3 or Sonnet have become."
I haven't used coding agents. Can someone who has (or knows about these things) elaborate, or point me to a good resource? Thanks!
It's a while loop. You collect a ton of context (file paths, function signatures, etc.), provide a bunch of functions to the LLM that it can ask to be called and an objective ("Implement X", "figure out X"). Then you send this to the LLM, perform the function calls it requests ("open file", "edit file", "call service X") and append their output to the payload and send it back to the LLM. Rinse and repeat until objective is complete (in which case the LLM may request to call a work_complete function/tool).
That's really it. Stuff the context, loop over the LLM until work is done.
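A deliberately minimal sketch of that loop, in Python. The call_llm function, the tool set, and the {"tool": ..., "args": ...} reply format are placeholders for whatever model API and protocol you actually use; this is an illustration of the shape, not any particular agent's real implementation.

    import pathlib
    import subprocess

    # Tools the model may request. Each takes a dict of arguments and
    # returns a string that gets appended back into the conversation.
    def read_file(args):
        return pathlib.Path(args["path"]).read_text()

    def write_file(args):
        pathlib.Path(args["path"]).write_text(args["content"])
        return "ok"

    def run_tests(args):
        proc = subprocess.run(["make", "test"], capture_output=True, text=True)
        return proc.stdout + proc.stderr  # ground truth from the build system

    TOOLS = {"read_file": read_file, "write_file": write_file, "run_tests": run_tests}

    def call_llm(messages):
        # Placeholder: send `messages` to your model of choice and parse its
        # reply into {"tool": "<name>", "args": {...}}; it answers
        # {"tool": "work_complete", "args": {}} when the objective is done.
        raise NotImplementedError

    def run_agent(objective):
        messages = [{"role": "user", "content": objective}]
        while True:  # "it's a while loop"
            reply = call_llm(messages)
            if reply["tool"] == "work_complete":
                return
            result = TOOLS[reply["tool"]](reply["args"])
            messages.append({"role": "tool", "content": result})

Everything that makes a real agent good lives outside this loop: what context gets stuffed into messages, and how much ground truth the tools report back (builds, linters, test harnesses).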
Thank you!
I’m gaining more confidence AI is unable to take any of our jobs if this level of marketing is needed to try to convince us it will.
I can sum it up like this: if I knew in advance the exact right thing to build, producing the physical code has not, for a long time, been the bottleneck. I've been vibe coding since long before it was cool. It's sometimes called model-driven development.
For those who think only procedurally, I can see how it helps, because procedural-first development has a lot of boilerplate logic.
For those who think model first, the AI may help them rubber duck, but ultimately the physical writing of the characters is minimal.
Most of my time is thinking about the data model. The AI writes almost all of my procedures against said data model. But that is about 20% speedup.
This: "Important caveat: I’m discussing only the implications of LLMs for software development. For art, music, and writing? I got nothing. I’m inclined to believe the skeptics in those fields. I just don’t believe them about mine". I want to emphasize this. AI seems like a great 'first draft', but not a 'finished product'. In the hands of a programmer, a first draft is a great start. But pushing AI to frontline workers or customer support fails, as Klarna learned, because a first draft is not good enough.
When I ask ChatGPT to generate the code for an M4F MCU that implements the VirtIO code for accessing the GPIO through RPMSG using FreeRTOS, it produces two answers that are both incomplete and incorrect.
This is very outside my domain :P I asked ChatGPT to explain the acronyms in your comment and still do not understand it. But I think a reason LLMs might struggle is that there are too many up front complex requirements.
Giving it only 1 or 2 complex requirements at a time, and then having it iterate, I've found to be more effective.
Most LLMs don't "think", so when asking an LLM something, I generally try to think "would I be able to do this without thinking, if I had all the knowledge, but just had to start typing and go?".
You could break down your prompt into separate prompts like this maybe: https://chatgpt.com/share/683eb7d7-e7ec-8012-8b3b-e34d523dc9...
I think it broke things down in a weird way, but I definitely can't analyse the correctness of anything it outputs in this domain :P
Coding specific agents like Copilot might be better able to handle a complex initial prompt, since they take the initial prompt, and use LLMs to break it down into smaller steps, which ChatGPT doesn't do. They can sort of "think". Deep research AIs also have a sort of thinking too, so might do better.
I did follow your suggestion to break it down into separate prompts, but that too did not result in any useful code. I did not try to compile it, but I get the impression that it does not even compile. There is nothing remotely related to the VirtIO protocol in the code. See Section 5.18 of https://docs.oasis-open.org/virtio/virtio/v1.3/csd01/virtio-...
When I added the prompt: 'Could you use the specification given in Section 5.18 of https://docs.oasis-open.org/virtio/virtio/v1.3/csd01/virtio-...' it produced almost the same code preceded with some babbling from the document, but not using anything from the specification, not even the code fragments mentioned in the section.
I was just giving it as an example of how there are domains where AI assistance has little or no benefit. Even if you use them for searching for resources or instructions on how to achieve something, they come up with nothing useful, or lead you in a direction you do not want to go. It is just like searching with Google: if you search for the thing that 90% of people are searching for, your answer will be on the first page. If you search for something outside that, you have to skip many pages, or may not be able to find it at all.
One could also conclude that a large portion of software engineering is mostly the implementation of things that have been implemented many times before, and that only a small portion consists of real software engineering, where you have to develop code for a problem that nobody ever wrote before or that requires a deep understanding of the problem domain.
I wonder if the "lack of data" problem is solvable within LLMs. Humans can apply general programming-language knowledge when using a new language, but LLMs have to have lots of concrete source code to work well.
Today these models fall apart very quickly once you step out of JS/Python/C land into Rust/Zig/OCaml.
With today's LLM tech, feeding the model language documentation does not work great.
Not a very bright move from fly to publish this kind of condescending post on the company blog.
Making fun of people who care about copyright (implying that fly abuses GPL for instance) is just icing on the cake.
What this boils down to is an argument for slop. Yeah, who cares about the quality, the mediocrity, the craft... get the slop, push it in, call it done. It mostly works in the golden path, it's about 6 or 7 orders of magnitude slower than hand-written software but that's ok, just buy more AWS resources, bill the client, whatever.
I can maybe even see that point in some niches, like outsourcing or contracting where you really can't be bothered to care about what you leave behind after the contract is done but holy shit, this is how we end up with slow and buggy crap that no one can maintain.
> Yeah, who cares about the quality, the mediocrity, the craft...
Just about no-one in the F100 unless they are on very special teams.
If you care about the craft, you're pushed out for someone who drops out 10x the LOC a day, because your management has no ability to measure what good software is. Extra bonus points for including 4GB of node_modules in your application.
It's not much different without the AI. Managers don't care about efficient code, they care about code that meets the business goals - whether that's good or bad is debatable. Agencies duct-taping together throwaway code isn't new. The classic "just buy more AWS resources" & such have been around for quite a while.
Even without AI, software has been getting worse every year. My question is: will AI resolve that problem, or will it make it worse?
My feeling is that AI will keep the software industry in the same trend: produce more software of lower quality while being more profitable.
People sometimes say "perfect is the enemy of good". But this implies that putting in less effort is "good enough". I believe it goes the other way: why would a company put effort into selling you something good, if it can make you pay the same for something worse? It's all about maximising profit, and AI is no exception there.
"If you’re making requests on a ChatGPT page and then pasting the resulting (broken) code into your editor, you’re not doing what the AI boosters are doing. No wonder you’re talking past each other."
This is in fact what I am doing (though I am not trying to convince anyone not to use LLMs). How can I get into this other mode? Any pointers? Do I need to pay up for a service? I do not want to carry out vibe coding, I want to use LLMs to solve physics, math and logic problems/puzzles. I'd appreciate any recommendations.
Just have a look at Cursor.ai or zed.dev. Cursor is based on VS Code and integrates different models right into your codebase. It has access to your codebase, knows your code (after indexing) and is able to directly apply changes to your code or run commands on your integrated terminal. You can directly accept/decline changes. You can reference existing code / files. You can link documentation, you can write a default prompt and default settings (previously .cursorrules).
It's faster than opening ChatGPT in the browser, writing your prompt, waiting for the reply, scanning and scrolling through the reply, copying it, inserting it manually, and then going back to say something doesn't work.
I wonder if it comes down to there being many kinds of programming and this stuff is good for some but not all.
The author describes a world where most code is boring grunt work and most collaborators are mediocre. This world is not familiar to me. I can see LLMs being useful in such a world, but it’s not where I work.
The comment about buying a table instead of building it really stuck with me. We’re the ones who design the table! Saying we should buy one from the store is like saying farmers should buy their food from the store instead of putting in all that work to grow it.
I've been using LLM's pretty regularly for just over a year (my PyCharm AI license just renewed). My experience before that was pretty mixed and fell into the "sort of worth it" category.
It's improved to the point where the summary that I would email a junior developer is a pretty good prompt that produces code pretty close to what I want.
I had a weird project recently where I needed a stand alone event handler that would send an email receipt to the customer, another email to customer service, log the event in an accounting system via an API. I wrote up a proposal for the rest of the management team and, once accepted, pasted the proposal into ChatGPT (and a few others, but ChatGPT gave the best results).
I basically got a full recipe for setting up AWS API Gateway, Lambda, and SES, the Python Lambda function with api calls to SES, suggestions on using AWS SAM, etc. etc. Tweaked the prompt a couple of times and then took over from there. I had this into the hands of the testing team in less than a day.
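To give a sense of the shape of it, here's a rough illustrative sketch of that kind of Lambda handler (not the actual code we shipped), assuming boto3's SES client behind an API Gateway proxy integration; the addresses and event fields are placeholders, and the accounting-API call is left out.

    import json
    import boto3

    ses = boto3.client("ses")

    def lambda_handler(event, context):
        # API Gateway proxy integration delivers the POST body as a JSON string.
        order = json.loads(event["body"])

        # Receipt to the customer (placeholder addresses and copy).
        ses.send_email(
            Source="receipts@example.com",
            Destination={"ToAddresses": [order["customer_email"]]},
            Message={
                "Subject": {"Data": "Your receipt"},
                "Body": {"Text": {"Data": f"Thanks for your order #{order['order_id']}."}},
            },
        )

        # Notification to customer service.
        ses.send_email(
            Source="receipts@example.com",
            Destination={"ToAddresses": ["support@example.com"]},
            Message={
                "Subject": {"Data": f"New order {order['order_id']}"},
                "Body": {"Text": {"Data": json.dumps(order, indent=2)}},
            },
        )

        return {"statusCode": 200, "body": json.dumps({"ok": True})}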
I'm pretty sure this would reduce my team size in ANY role I've had in my career. Maybe not halve the teams, but it would eliminate a lot of the junior or journeymen developers.
LLM fanatics drive me nuts. Instead of learned helplessness, at least try to understand what the token predictor outputs, and be honest when you've used an LLM to generate the code dump you're asking for help with. Otherwise you're just wasting time and acting as a middleman between a machine and a person who expected an interaction with an actual human being and who values their own time.
A well articulated blog, imo. Touches on all the points I see argued about on LinkedIn all the time.
I think leveling things out at the beginning is important. For instance, I recently talked to a senior engineer who said "using AI to write programming is so useless", but then said they'd never heard of Cursor. Which is fine - but I so often see strong vocal stances against using AI tools but then referring to early Copilot days or just ChatGPT as their experience, and the game has changed so much since then.
At 0 temperature an LLM is a Map<String,String> - a string input (key) will give you the same string output (value) every time. Hypothesis: there exists a key whose value is a complete, working, fully-tested application which meets your requirements 100% and fulfills your business need. This key is the smallest, most complete description of what your application does. It is written in natural language and represents a significantly compressed version of your application code.
My part-time obsession over the last few months has been trying to demonstrate this and come up with a method for finding these magic keys (I even tried to get the LLMs to search for me, lol). What I really want is to give the latest thinking models (200k input, 100k output) a 5-6 page design doc (4k words, 5k tokens) and have them produce a complete 5kloc (50k tokens) microservice, which would show a 10x compression. It's hard, but I haven't seen any reason to think it wouldn't work.
For better or worse, I think this will be close to what IC jobs will look like in a few years. Fundamentally, our job is to work with other functions to agree on some system that needs to exist, and then talk to the computers to actually implement it. If we swap Kotlin+compiler for design doc+LLM, it's still going to be somewhat the same, but far more productive. Agents and such are somewhat of a stop-gap measure: you don't want people giving tasks to machines; you want to accurately describe some idea and then let the computers make it work. You can change your description, and they can also figure out their own tasks to evolve the implementation.
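Mechanically, the experiment itself is tiny - roughly the sketch below, assuming the OpenAI Python client (any chat-completion API would do). The model name, file names, and prompt wording are placeholders, and even at temperature 0 outputs aren't perfectly deterministic in practice, just close.

    from pathlib import Path
    from openai import OpenAI

    client = OpenAI()  # reads OPENAI_API_KEY from the environment

    design_doc = Path("design.md").read_text()  # the ~5k-token "key"

    resp = client.chat.completions.create(
        model="gpt-4o",  # placeholder; a long-output reasoning model in practice
        temperature=0,
        messages=[
            {"role": "system",
             "content": "Emit the complete source of the service described "
                        "in the design doc. Code only, no commentary."},
            {"role": "user", "content": design_doc},
        ],
    )

    Path("service_out.txt").write_text(resp.choices[0].message.content)

The hard part, of course, is everything outside this snippet: iterating on the wording of the design doc until the generated service actually compiles and passes its tests.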
Damn. Well, I'll spend a few bucks trying it out, and I'll ask my employer if they're okay with me using agents on company time.
But I'm not thrilled about centralized, paid tools. I came into software during a huge FOSS boom. Like a huge do it yourself, host it yourself, Publish Own Site, Syndicate Elsewhere, all the power to all the people, borderline anarchist communist boom.
I don't want it to be like other industries where you have to buy a dog shit EMR and buy a dog shit CAD license and buy a dog shit tax prep license.
Maybe I lived through the whale fall and Moloch is catching us. I just don't like it. I rage against dying lights as a hobby.
Yeah, I'm ready to jump in, but I need an agent running on my hardware at home without internet access.
How far away are we from that? How many RTX 50s do I need?
This is a serious question btw.
It's unfortunate that AMD isn't in on the AI stuff, because they are releasing a 96GB card ($10k so it's pricey currently) which would drop the number you need.
I mean it depends on the model; some people running deepseek report they have better performance at home running on a CPU with lots of ram (think a few hundred gigabytes). Even when running locally vram is more relevant than the performance of the GPU. That said I'm really not the person to ask about this, as I don't have AI agents running amuck on my machine yet
You can self host an open-weights LLM. Some of the AI-powered IDEs are open source. It does take a little more work than just using VSCode + Copilot, but that's always been the case for FOSS.
An important note is that the models you can host at home (e.g. without buying ten(s of) thousand dollar rigs) won't be as effective as the proprietary models. A realistic size limit is around 32 billion parameters with quantisation, which will fit on a 24GB GPU or a sufficiently large MBP. These models are roughly on par with the original GPT-4 - that is, they will generate snippets, but they won't pull off the magic that Claude in an agentic IDE can do. (There's the recent Devstral model, but that requires a specific harness, so I haven't tested it.)
DeepSeek-R1 is on par with frontier proprietary models, but requires a 8xH100 node to run efficiently. You can use extreme quantisation and CPU offloading to run it on an enthusiast build, but it will be closer to seconds-per-token territory.
Everything works right until it doesn't. LLMs are trained on things that have worked. Let's revisit in 2027 when things are insanely faster, but not much better.
> Does an intern cost $20/month? Because that’s what Cursor.ai costs.
And who the fuck will train those interns, now they've been replaced by machines?
I think I figured out a big problem I have with AI, and that is that I just hate using other people's software. The vast majority of software is just garbage. It all tries to take way too much of my attention. It doesn't help that all these AI tools are new, so they are inextricably linked to some SV startup that wants to be the next unicorn, so they have some bullshit account-generation/data-retention/terms-of-service/acceptable-use-policy/remarketing-email-campaign song and dance that is all stuff I have to pay attention to. All I really want to do is sit down and think on my own. I'm constantly inundated with distractions at work, so when I get a few minutes to myself, I sure as hell don't want an AI bot messaging me about reviewing its PR. I don't even want to review the human-written PRs; just push your code already, if it breaks you get to fix it. I certainly don't want a 500th unread email in my inbox.
You know what doesn't constantly nag for my attention? Dead text files on disk. They just sit there, waiting for me to grace them with my presence. That's the way it should be. Computers serve me, not the other way around.
So the reason we should not use Rust (a memory-safe language that gets rid of entire categories of bugs) is that some new AI tools are not ready?
Having used AI on a daily basis for over a year now, I find most AI tools do not struggle with Rust; they struggle to provide value. If I do not instruct them to keep the output to the bare minimum, I usually end up with 10x the output of what would be a simple solution to my problem.
I think for most people who do not get value out of AI-assisted coding, it's a skill or learning issue. I don't care how good of a developer you consider yourself to be. If, in 2025, these tools do not provide you significant (to almost absurd) value, you just do not know how to use them, or are too stubborn to push past the initial learning curve. Alternatively, maybe... programming is changing so much with AI that you just might not see what's happening right in front of your eyes, because it's so foreign to all the other paradigms of the past.
In 2025 I am about 800-900 hours deep into using Cursor, blowing through 500 credits every 2 days (500 credits = $20). I also organize a local "vibe code" workshop and meetup, with people from all sorts of experience levels, from no coding experience to twice my seniority. And what I am seeing is that what separates those who are good at using AI from those who are not is generally a level of rigid stubbornness.
I can't wait for time to run its course here, and for the people who are now "all nuts" to slowly watch the programming of the past become an old relic.
The premise of “lol, I’m just a dumb developer and just wait until the smart developers catch on” falls extremely flat coming from Thomas.
It kinda invalidates the entire premise because it builds on this false foundation of “look how useful it is to a mere plebeian” and the skills actually required to sift through LLM garbage are what puts you in the top 5% of programmers.
Most senior developers cannot conduct useful code review and spot errors. In every company I've been at, only about 1 in 20 people would give the useful, in-depth reviews required to pick up LLM bullshit.
This style of engagement bait is brutal and it severely overshadows other nuggets of wisdom of how LLMs can be useful.
There’s a fine line between being an AGI skeptic and an AI skeptic.
While I don’t believe the “AI will take all jobs” VC tales, I also think not using LLMs in software development workflows is similar to not wanting to use Excel as an accountant in the 1990s.
AI won’t replace all jobs, but companies that cleverly integrate it into their processes will be faster and more efficient compared to those that don’t.
FWIW, with proper MDC/rules files I've found LLM programming agents excellent at Rust. There's a lot of complex and tedious minutiae in Rust that I know but forget to apply everywhere it's helpful, while a SOTA LLM agent applies it well, especially with proper rule guidance reminding it to.
Generally, though, I find LLMs have a pretty rapidly diminishing return on what you can expect out of them. They're like a programmer with 3-5 years of experience who has really learned their domain well, but doesn't have the judgement of a principal engineer. You get to a point where you need to reach in and set things right and really pay attention, and at that point the diminishing returns set in rapidly and you're better off just doing the rest yourself. Refactors and the like can be delegated, but that's about it.
I find this true regardless of the language. Nonetheless, I've been able to improve my overall velocity dramatically, completing several projects in the last few months in the span one would typically take. If tooling improves I hope to continue that, but I'm already getting close to the limit of how fast I can conceive of useful creative things.
It is going to be so much fun when all AI companies suddenly double prices, because what are you going to do, hire juniors?
Nah, I’m not nuts, you’re just terminally brain damaged by the amount of ketamine you’ve huffed thinking AI was such a great tool.
> but the craft - Do you like fine Japanese woodworking? All hand tools and sashimono joinery? Me too. Do it on your own time.
That is not my objection, though some seem to think it is.
I got into coding because I found it interesting and fun. I've avoided managing others because I hate that sort of thing. Working with an LLM, prompting it to do work then assessing that work and adjusting the prompts or cleaning/completing the output myself, is far too much like outsourcing work to a 3rd party junior, and I don't want to do that whether said junior is a human in a far off land (or just down the street, for all I know!) or some LLM run by MegaCorp-ᴙ-Us.
I was already seriously disliking much of the modern programming industry (remote teams, the shit-show that client-side web work continues to devolve into, and much else) anyway. If it gets to the point where I can't do my job (or can't do it competitively enough) without using LLMs on top of dealing with the other things I don't care for, I'll be retraining for something else entirely. Heck, I've got the mortgage paid, I don't have or plan to have a family beyond the cats, I could just about survive on shelf-stacking…
I know it isn't a passing fad, though I'm convinced the current and near-future iterations aren't going to be quite as significant in the longer term as many believe. I don't object to others using it (aside from the matter of “MegaCorp-ᴙ-Us”s of the world suddenly deciding that piracy is just fine because they stand to benefit from it, in fact are already benefitting from it, in this case). But I don't plan to be nagged into using it myself.
> Do it on your own time.
If/when that career shift happens, or I survive as I am long enough to retire, I will probably do exactly that. As it stands I spend more than enough of my life on tech in DayJob, my own time is increasingly filled with unrelated hobbies and people.
A lot of the doubters are basically saying: "If it's so great, show me proof." While that is usually the right stance, I think it is short-sighted in times of revolutionary change.
To be sure, you definitely should not blindly trust the people who have a stake in selling AI products. However, you probably should trust the people who have no stake in selling AI, but are using it to accelerate themselves. The problem is, these people are probably too busy building things to spare any time convincing you. And in fact, it is a competitive advantage for them if others don't figure it out.
Here's my take: Everyone needs to figure out how AI works for them. The thing with AI is that it is a technology that behaves like people. It has the speed, efficiency and scale of computers but the fallibility and quirks of humans. This is why it's so confusing to discuss, and why folks have such varying experiences. The right approach to working with AI is like working with people: understand what they're good at and where they are weak, and then work with them to achieve your goals.
This will require using it "in anger" on many non-trivial tasks over a significant period of time, which will take persistence and patience. Yes, the hype is so over-the-top that you might not want to invest your valuable time on it. But you owe it to yourself to ensure you can fully capitalize on the sea change that is happening.
If it helps, I (like the OP) have no stake in selling AI and I posted this comment about my experiences on the other thread about the AI coding -- https://news.ycombinator.com/item?id=44163543 -- The upshot is, AI has enabled me to produce innovative (probably cutting edge) work in domains I had little prior background in. And I've not even used agents yet!
> Everyone needs to figure out how AI works for them.
That is the real takeaway.
I use LLM’s for dozens of things. It took time to find how model strengths best connect with mine.
People who only evaluate models for one purpose over short timespans are going to have a hard time.
It’s like models come from a different cognitive & communication culture. Eliminating that barrier takes more experimentation and experience than logic.
I think their creativity is also often a function of the prompter's creativity.
The harder you (creatively) push them into a creative corner, the more genuinely creative they get.
Not unlike how a good teacher can get unexpected positive results from students when they are channeled and pushed.
I believe that AI is very useful in software development but I don't buy the narrative that AI is responsible for layoffs over the past few years (at least not most of them). I think that narrative is a convenient cover for systemic misallocation which created a need to contain inflation. I think big tech execs understood that, beyond increasing their company stock prices, they also need to work together to keep the monetary system itself under control. This is why they've been firing people whilst having record profits. They've reached such scale and the system has reached such fragility that they have to think and act like economists to keep the thing going. The economy itself has become the responsibility of big tech.
But who knows, maybe AI will accelerate so rapidly that it will fix the economy. Maybe we'll have robots everywhere doing all the work. But I worry about the lack of market incentives for people to adapt AI to real world use cases.
For example, I'm an open source developer who likes to tinker but I've been booted out of the opportunity economy. I can't afford to program robots. People like me are too busy using AI to parse spreadsheets and send targeted ads to even think about automating stuff. We work for companies and have no autonomy in the markets.
If things had worked out differently for me, I'd probably own a farm now and I'd be programming robots to do my harvest and selling the robots or licensing the schematics (or maybe I'd have made them open source, if open source had worked out so well for me). I don't have access to such opportunity unfortunately. The developers who worked for big tech are good at politics but often disconnected from value-creation. Few of them have the skills or interest to do the work that needs to be done now... They will just continue leveraging system flaws to make money, so long as those flaws exist.
If yesterday we were sued for singing the happy birthday song in public, and today we see corporations with market caps the size of countries pretending that IP ownership is not much of a thing, the alarm bells should be deafening. Control over intellectual property has given us fantastic things—Linux, Blender, etc.—and you can’t have GPL without an ability to defend copyleft. IP laws provide that ability.
One does not need to be a skeptic about machine learning and its potential as technology to refuse to engage with its practical applications when they are clearly ethically corrupt. The ends do not justify the means. Just like refusing to buy products of forced labour or disproportionate environmental impact, or to invest in war and bloodshed, everyone makes this decision for themselves.
I feel silly asking, but is this reality or hyperbole?:
> An agent can occupy itself for hours putzing with your tests in a VM and come back later with a PR.
> Kids today don’t just use agents; they use asynchronous agents. They wake up, free-associate 13 different things for their LLMs to work on, make coffee, fill out a TPS report, drive to the Mars Cheese Castle, and then check their notifications. They’ve got 13 PRs to review. Three get tossed and re-prompted. Five of them get the same feedback a junior dev gets. And five get merged.
I'm working better and harder with Roocode in Code mode (and sometimes Architect). I've only been doing it for a few weeks, and it reads to me like I'm already using an 'old approach'.
I so strongly agree with this.
I've never been a completionist. I'm distracted. I like novelty. I make half-formed experiments, but mostly provide deeply immersive conversations for the benefit of a small group of people around me, some of whom go on to make/do things that I proudly feel my influence on. The final 10% of work often defeated me.
Now, I just make it happen. The pain is gone. Something else can carry the last 10%, and my ideas get realized by me (and AI).
I can't even conceive how this will change the world en masse. Knowledge domains (not just tech, but science, art, etc) have been built of the things that the type-A completionists knew how to see. The others didn't finish, and what they saw of the world has largely been marginalized.
The mind of ADHD and creative types is about to get much more represented in the output products of civilisation. This is going to be a big recalibration, and I think a re-balancing toward something more healthy tbh
Can we get a video of a workday conducted by these people?
Unless there's a significant sense of what people are working on, and how LLMs are helping -- there's no point engaging -- there's no detail here.
Sure, if your job is to turn out tweaks to a WordPress theme, presumably that's now 10x faster. If it's to work on a new in-house electric motor in C for some machine, presumably that's almost entirely unaffected.
No doubt junior web programmers working on a task backlog, specifically designed for being easy for juniors, are loving LLMs.
I use LLMs all the time, but each non-trivial programming project that has to move out of draft-stage needs rewriting. In several cases, to such a degree that the LLM was a net impediment.
Not exactly what you're asking for, but https://news.ycombinator.com/item?id=44159166 from today is not a junior web programmer working through the backlog, and the commit history contains all the prompts.
Sure, thanks. I mean it's a typescript OAuth library, so perhaps we might say mid-level web programmer developing a library from scratch with excellent pre-existing references, and with a known good reference API to hand. I'd also count that as a good use case for an LLM.
I have done everything from architecture design for a DSP (Qualcomm), to training models that render photos on Pixel phones, to redoing Instagram's comment ranking system. I can't imagine doing anything without LLMs today; they would have made me much more productive at all of those things, whether it be Verilog, C++, Python, ML, etc. I use them constantly now.
I use LLMs frequently also. But my point is, with respect to the scepticism from some engineers -- that we need to know what people are working on.
You list what look like quite greenfield projects, very self-contained, and very data science oriented. These are quite significantly uncharacteristic of software engineering in the large. They have nothing to do with interacting systems each with 100,000s lines of code.
Software engineers working on large systems (e.g., many micro-services, data integration layers, etc.) are working on very different problems. Debugging a microservice system isn't something an LLM can do -- it has no ability, for example, to trace a request through various APIs from a front-end into a backend layer, into some db, to be transferred to some other db, etc.
This was all common enough stuff for software engineers 20 years ago, and was part of some of my first jobs.
A very large amount of this pollyanna-LLM view, which isn't coming from junior software engineers, is coming from data scientists who are extremely unfamiliar with software engineering.
I write embedded firmware for wireless mesh networks and satcom. Blend of Rust and C.
I spent ~4 months using Copilot last year for hobby projects, and it was a pretty disappointing experience. At its best, it was IntelliSense but slower. At its worst, it was trying to inject 30 lines of useless BS.
I only realized there was an "agent" in VS Code because they hijacked my ctrl+i shortcut in a recent update. You can't point it at a private API without doing some GitHub org-level nonsense. As far as my job is concerned, it's a non-feature until you can point it at your own API without jumping through hoops.
You used one AI tool that was never more than autocomplete a year ago, and you think you have a full hold of all that AI offers today? That's like reviewing Thai food when you've only had Chinese food.
Here's a 3+h video of the PSPDFKit (Nutrient) founder vibe-coding a Mac app. Can be watched at 2x: https://steipete.me/posts/2025/the-future-of-vibe-coding?utm...
> All this is to say: I write some Rust. I like it fine. If LLMs and Rust aren’t working for you, I feel you. But if that’s your whole thing, we’re not having the same argument.
Yes we are, because the kind of work you need to do in C++ or Rust is probably entirely different from the work this person manages to get the LLM to do in Go.
The most important thing in this article in my mind is in the level setting section - if you are basing your perspective on the state of AI from when you tested it 6mo+ ago, your perspective is likely not based on the current reality.
This is kind of a first though for any kind of technology. The speed of development and change here is unreal. Never before has a couple months of not being on top of things led to you being considered "out of date" on a tool. The problem is that this kind of speed requires not just context, but a cultural shift on the speed of updating that context. Humanity just isn't equipped to handle this rate of change.
Historically in tech, we'd often scoff at the lifecycle of other industries - Airlines haven't changed their software in 20 years?? Preposterous! For the vast majority of us though, we're the other industry now.
>pull in arbitrary code from the tree, or from other trees online, into their context windows, run standard Unix tools to navigate the tree and extract information, interact with Git, run existing tooling, like linters, formatters, and model checkers, and make essentially arbitrary tool calls (that you set up) through MCP.
...for the vast majority of my career, anyone who suggested doing this - much less letting code that no one in the world (much less the company) truly understands the logic flow of do this - would be fired.
> You’ve always been responsible for what you merge to main.
I've never merged anything with an unclear copyright to main, let alone if I positively suspected that it had been plagiarized.
> Cut me a little slack as I ask you to shove this concern up your ass. No profession has demonstrated more contempt for intellectual property.
Every place I've been employed was nearly paranoid about copyright, with rules in place about never copying and pasting anything copyrighted into the company code base.
All the major open source distros and projects, likewise.
There are some bad apples out there, like, oh, Sam Altman, who wants massive plagiarism to be fair use.
So, au contraire, you can shove your contemptuous blog up your ass, buddy.
I know this is a website for programmers and I am not one. I like LLMs for quick answers and simple tasks like proofreading (just to fix errors, not to add or change anything) and simple text formatting. I absolutely hate them for everything else. I am a writer, and LLM prose may be correct and sometimes whimsical, but it has a soulless quality that makes me forget what I just read after five seconds. My friends use Meta AI in our chat groups on WhatsApp and I hate it so much that I delete every single AI message as soon as I see it.
> Meanwhile, software developers spot code fragments seemingly lifted from public repositories on Github and lose their shit. What about the licensing? If you’re a lawyer, I defer. But if you’re a software developer playing this card? Cut me a little slack as I ask you to shove this concern up your ass. No profession has demonstrated more contempt for intellectual property.
Seriously? Is this argument in all earnestness "No profession has been more contemptuous, therefore we should keep on keeping on"? Should we as an industry not bother to try and improve our ethics? Why don't we all just make munitions for a living and wash our hands of guilt because "the industry was always like this"?
Seems a bit ironic against the backdrop of <https://news.ycombinator.com/user?id=tptacek>:
> All comments Copyright © 2010, 2011, 2012, 2013, 2015, 2018, 2023, 2031 Thomas H. Ptacek, All Rights Reserved.
(although perhaps this is tongue-in-cheek given the last year)
It's a Fibonacci sequence (look at the gaps between the years).
> I like moving the code around and chuckling to myself while I delete all the stupid comments. I have to read the code line-by-line anyways.
I'm glad I got out when I did. :) This simply isn't something I want to do for a living. I'm glad there are people who do, though.
Judging from your username, you are 10 years older than I am. What are you doing now? Early retirement? Or career change?
Went from 20 years of dev to teaching CS at a university--for as long as that lasts. Call it "semi-retirement". :)
After that, who knows. Maybe the whole system collapses and I just go to work as a tutor for some rich peoples' kids in exchange for loaves of bread.
This op ed suggests that it’s easier to audit a huge amount of code before merging it in than is to write the code from scratch. I don’t know about anyone else, but I generally find it easier to write exactly what I want than to mentally model what a huge volume of code I’ve never seen before will do?
(Especially if that code was spit out by an alien copypasta that is really good at sounding plausible with zero actual intelligence or intent?)
Like, if all I care about is: does it have enough unit tests and do they pass, then yeah I can audit that.
But if I was trying to solve truly novel problems like modeling proteins, optimizing travel routes, or new computer rendering techniques, I wouldn't even know where to begin; it would take tons of arduous study to understand how a new project full of novel algorithms is going to behave?
The author mentions that smart programmers kick off agents, then go for a coffee and wait to be notified.
But agents like Claude Code and Cursor run on a user's machine, right? So how can they work in the background like that? Are there agents that can work async?
Fun article, although I'd say I'm maybe slightly more on the skeptical side.
I've found LLMs can't really write a whole program there isn't already a template for. They're good enough at writing individual functions, and as the author mentioned, their breadth of knowledge is better than any individual.
Honestly, I see them as a supercharged version of googling, without all the out of context answers you get from Google since they also place ads and whatnot. I find an LLM can get me an answer far quicker than a regular internet search, even if it's imperfect and I'm probably not going to put it straight into my code.
Finally, a proper good take from someone (seemingly) approaching it rationally and objectively, rather than blindly sitting on either side of the hype fence. LLMs are good tools, and even I, a decidedly-not-software-engineer, have been increasingly tinkering with locally-run tools to create applications and tooling that meets my own needs. The one thing that’s held me back from publishing has been the IPR angle, but I quite like this line on that topic:
> But if you’re a software developer playing this card? Cut me a little slack as I ask you to shove this concern up your ass. No profession has demonstrated more contempt for intellectual property.
The OP isn’t wrong. Some of the most brilliant minds in computer science have built tools to shatter DRM, enable widespread piracy, and repeatedly lift code wholesale from public repos if it helps them move a product forward. The modern crop of technologists (myself included) have long had a hypocritical view on IPR: “rights for me, not for thee” to oversimplify things. The entirety of the modern public cloud is built by trodding on the backs of open source projects while wearing lawn-aerating spikes. We steal shit from others so our Founders and Masters can reap billions in profit; to decry plagiarism now feels incredibly dishonest and insincere, at the very least.
Look, I’m an AI skeptic myself. On the whole, these tools are bad. They’re taking jobs, polluting the planet, and dumbing down a society that still can’t try rebooting something when it fails, let alone identify where their WiFi ends and the internet begins. They’re toxic walled gardens that can reauthor reality to the whims of the VCs and Billionaires funding them, locking you into subscriptions forever.
But even I will admit that modern LLMs, when it comes to writing code, are actually really good for the majority of C-tier devs and below. Are they as helpful to bleeding-edge engineers on the cusp of the future? Nah, not really, but if you’re just writing a checklist app for iOS to help you stay on task, or a Packer manifest to lay out some VM templates, they’re kinda awesome.
I think of programming languages as an interface between humans and computers. If anything, the industry expanded because of this abstraction. Not everyone has to learn assembly to build cool shit. To me AI is the next step in this abstraction, where you don't need to learn programming languages to potentially build cool projects. The hard part of software engineering is scale anyways. My bet is that this will expand the industry in unprecedented ways. Will there be contraction of traditional programming jobs? Absolutely. The growth in tech jobs over the last 20 years wasn't in assembly programmers; it was in abstraction experts. I'm sure the next wave will be even bigger; professional prompting will explode in size.
C abstracting away assembly, or the GC abstracting away memory management, worked because they could be implemented in a deterministic and reliable way (well, in the case of garbage collection, not all the time).
But I don't think that's a similar situation for LLMs, where the hallucinations or failure to debug their own issues are way too frequent to just "vibe code"
He forgot "but the 15 year old behemoth codebase that doesn't fit in any context window and isn't well documented and changing of which requires changing tons of ancillary files not easily connected to each other logically".
That's what a lot of the highest paid programmers work on. I work in such a firm and the agent they have for us is incapable. And it's the latest and greatest from whoever coders love to rave about online.
I love that AI-written tests and other generally greenfield things work well and can be vetted more quickly than writing the code oneself. But for most more complex tasks it's not there. I'm ready when it is, though.
"If you build something with an LLM that people will depend on, read the code. In fact, you’ll probably do more than that."
Thus sucking up all the time you "saved" by having the LLM write the code. The machine made the easy parts of coding easier while making the hard parts harder.
"You’ll spend 5-10 minutes knocking it back into your own style."
As if style is the only thing that matters?
"LLMs are showing signs of adapting to local idiom, but we’re not there yet."
First step fallacy. Ladders are showing signs of being tall enough to reach the moon, but we're not there yet.
Yeah, and it’s progressing so fast. Singularity is definitely on the table.
Whoever says otherwise should read their own comments from 2 years ago and see how wrong they were about where AI is today.
Not saying singularity will happen for sure, but is it a possibility? Hell yeah.
It's not really that different than 2 years ago. Better but not qualitatively so.
Tell me you don’t understand LLMs without telling me you don’t understand LLMs.
I definitely don’t understand LLMs. Kind of the point though.
Will LLMs ever break the “you have to treat it like an intern/junior” ceiling?
For some reason, every hardcore AI proponent I hear from resorts to gaslighting and namecalling to convince everyone it's so useful for everything.
When that doesn't work, they start inventing hyper-specific scenarios and graphs with grounding that exists only in their imagination to push the topic further. If the AI can't solve a problem in a domain for you, well, it's that domain that's the problem, not AI. You simply need to re-invent the paradigm around your entire problem - easy stuff.
Finally, when they've failed to convince anyone with a modicum of rationality, they resort to some version of "well, fuck you, it's the future and I'll just be leaving you behind!"
Interestingly, this article has all three.
I want to skip all the debate and just have someone tell me what setup I need to get to here:
> Kids today don’t just use agents; they use asynchronous agents. They wake up, free-associate 13 different things for their LLMs to work on, make coffee, fill out a TPS report, drive to the Mars Cheese Castle, and then check their notifications. They’ve got 13 PRs to review. Three get tossed and re-prompted. Five of them get the same feedback a junior dev gets. And five get merged.
LLMs are a tool, a very precious one if you know how to use it.
I've seen Antirez use it for hard-to-find bugs in the Redis codebase (videos are on YouTube, albeit in Italian), I've seen Cursor correctly refactor, in a matter of minutes, code that would've taken hours, I've seen security researchers leverage it to find complex edge cases, etc.
My 2 cents is that developers who won't learn to use the tools will be left behind (albeit not for long, imho), but also that salespeople promoting the idea that devs can be replaced are equally dangerous.
>It’s getting the same kind of attention that smart phones got in 2008, and not as much as the Internet got. That seems about right.
Are you kidding me? Mosaic was released in 1993. In 1998 Apple released the iMac on a theory, based on some marketing research, that people might be really interested in "getting online." Five years. And that's about a decade before we realized we might want a touch phone. More than a decade before an App Store.
This is very much still the top of the first inning. AI will make even the internet seem small. Which makes sense — the entire internet is merely one input into this technology.
My current workflow with Codex (the coding environment from OpenAI) is:
(1) Ask to write an implementation plan for a specific change or a feature. It will go through the source code, look up references, make notes and produce a plan
(2) Review the plan. Point out missing things, or stuff that needs improvement.
(3) Once I'm satisfied with the plan - ask to draft PR. Launch a few attempts in parallel and pick the one that I like the most.
(4) While drafting PR, Codex will run unit tests (even can run E2E tests in its container), linting and type checkers at every single step. This helps a lot with the stability.
(5) I review the code and merge the PR if I like it. Ask to cleanup - if not.
This feels like working with a remote team - very patient and diligent at that.
Ultimately, I get to get more features done per day. But I also feel more tired by the end of the day due to a higher level of cognitive load. There are more decisions to make and less idle time (e.g. no more hours spent tidying up the code or doing relaxing and pretty refactoring).
TLDR; this AI thing works really well at least for me. But it comes with trade-offs that might slow down its adoption by companies en masse.
I do use LLMs for coding, and the newer models have definitely been a blessing. I don't know about using coding agents (or agentic coding), though. I personally do not find this better than chatting with the LLM, getting the code back, and then copy/pasting and grokking/editing it. The author seems to suggest that there is one correct flow, his flow (which he doesn't entirely detail), and everything else is not appropriate. He doesn't go into what his process is when the LLM hallucinates, either. Not all hallucinations show up in static analysis.
Yes! It's amazing how, even in a field that tends to lean more early adopter than average, you still get a lot of the default knee-jerk dismissal and cynicism - even for something as clearly amazing and useful as thinking machines.
We're in the middle of a major shift - there will be benefits for those who adapt first. People outside the field have no idea what's coming; even those of us in the field are underestimating the shift.
There were a few outliers in the 60s who understood what the computing revolution meant and would mean, but most did not. This is likely an even bigger change than that.
> LLMs can write a large fraction of all the tedious code you’ll ever need to write.
For most software projects, they cannot even remotely do that - at least not today and as far as I can tell.
And that is to say, it's not that they can write a not-so-large fraction. They cannot write any significant fraction of it. They can often write you a draft that helps you get there faster, though. They can also facilitate web search result consolidation etc. which can also help. But - they can also lead you down wrong paths and dead-ends, on occasion.
"If you’re making requests on a ChatGPT page and then pasting the resulting (broken) code into your editor, you’re not doing what the AI boosters are doing"
I am actually doing this the whole day long. For example, today I set up a fresh Debian VPS for some interns. I had to provide them with a Docker setup, support for Go, some nginx stuff, and I made a quick hello-world app in Angular with a Go backend. I could have done it myself, but I asked ChatGPT to provide me with all the commands and code. No idea how an agent could do this for me. I got everything running in about 30 minutes.
If you’re leaning out, spend two weeks leaning in.
I did, and learned a ton, and likely not going back to how I was before, or how I used it a week ago.
The comments in the article about not reading along with the agent are good, but it's more than that…
Vibe coding is for non-coders. Yet you get a feel for the vibe of the AI. With Windsurf, you have two or three files open and you're working in one. It starts smashing out the interspersed multi-line edits, and you know, with a flutter of your eyes, that it's got your vibe and correctly predicted your next ten lines. And for a moment you forgive it for leading you astray when you read what it said.
Here's an idea... imagine you are just starting, you know nothing... an AI can and will help you.
Tell me you won't use it and explain properly, without appeal to "legacy" or "human values" or other abstracts which don't self validate.
Now note that this is POV for every child on earth.
The new discipline is provider of LLM/AI guard rails, and absolutely requires knowing what is/not correct.
I don't doubt that LLMs will have a large impact on software and programming. But I do wonder whether it will be a positive impact. I think one of the biggest problems the software industry has is that it's far too rare for the people building software to have deep explanatory knowledge about what they have built. LLMs threaten to make this problem vastly worse than it already is.
The reason this is rare is that no one stays on any one project long enough for this to be a reality anywhere. Do you know anyone who has worked 15-20 years on the same project? I've been on many projects where deep explanatory knowledge existed through and through until the project was initial-code-complete, and then people left to do other amazing things, leaving the project with people with limited knowledge of it. I suspect this is the case for the vast majority of software.
> Professional software developers are in the business of solving practical problems for people with code. We are not, in our day jobs, artisans. Steve Jobs was wrong: we do not need to carve the unseen feet in the sculpture. Nobody cares if the logic board traces are pleasingly routed. If anything we build endures, it won’t be because the codebase was beautiful.
I think this is my key point of disagreement. Professional software developers often don't see themselves as artisans, because Steve Jobs's unseen feet are implemented through our implicit standards of ourselves. I don't go out of my way to be a perfectionist on the job, but neither will I try to push 1000 lines of spaghetti to solve a minor problem that seems like it ought to be a simple fix.
I've worked with people in the past who don't have this instinct, and it's my experience that their productivity is quite low if not negative, even though they often perceive themselves to be 10x developers who can handle all kinds of edge cases super fast. I suspect that most AI-generated code today ends up in this niche.
Here's what to do: Show me a video of LLM fixing four filed issues in the KiCad codebase.
If you do that, I'll swallow my AI skepticism.
I would love to have an LLM that I can turn loose on an unfamiliar codebase that I can ask questions of. I would love to have an LLM that will fill in my Vulkan boilerplate. etc.
I use emacs and Mercurial. You can demonstrate magic to me and I can be convinced even if it's not mainstream.
Rewriting Javascript slop to StackOverflow standards is not convincing me.
Get to it.
(The OAuth stuff posted earlier certainly moved my needle, but the fact that they needed a gaggle of reviewers as well as hand holding when the LLM got stuck mutes the impact significantly.)
I am no clearer on what setup the author is using. Are you hand-coding all these non-code agents you mention? You talk about Zed, then about Cursor - don't they both do the same thing, or are you using 2 IDEs to write your code? What tooling is used for these async agents that are doing things while I eat my breakfast?
I really want to like AI-assisted coding, but I guess I am just stuck as to what to do.
I have tried Copilot, both 2 years ago and recently. I have tried some other AI plugin on VS Code (I thought that was Cursor, but I don't see it in the extensions when I now search, so it was probably something else). If anything, I have found with these solutions that they are great for the first few tasks - generate these models, scaffold this, add OAuth, etc - but after that they lack all consistency with the code they themselves generated (like if I add a new model, the output is wildly different to what it already created), and it takes more time to rewrite that output than to copy-paste and change what's already there.
@op - could you explain exactly what tooling you are using, maybe give an example of these async agents, etc? Would be most appreciated for us luddites who would love to be more productive.
I agree wholeheartedly with this piece, but I think it might've been taken better if it wasn't written in the style of an HN comment. That being said, definitely doesn't deserve so much flaming in the comments, sorry to see that, because it's basically correct, with some small things I disagree with here and there. Feels exciting to be on the precipice of having a Star Trek style spaceship computer at my fingertips.
> LLMs can write a large fraction of all the tedious code you’ll ever need to write.
But, you still have to read it:
> Reading other people’s code is part of the job...I have to read the code line-by-line anyways.
So instead of writing the tedious code, I only have to read it. Oh but don't worry, I don't have to read it too carefully because:
> Agents lint. They compile and run tests. If their LLM invents a new function signature, the agent sees the error
But remember...
> You’ve always been responsible for what you merge to main.
So now I have to oversee this web of agents and AI on top of coding? Am I doing more now for the same pay? Am I just speedrunning myself toward lower pay? Is AI adoption a prisoner's dilemma toward lowering my wages hardest?
Because AI is good at coding compared to many other disciplines (e.g. math), it makes the internal AI politics among programmers more of an issue. Add fuel to that fire, baby!
I've observed that AI excels at creative work, but not exacting sciences. For example, If I say, "Write the code to put the button in the upper right corner," I'll almost certainly get a button in the upper right.
Conversely, if I say, "Put the button in the upper right corner exactly like the existing button in the upper left," I'll get a button in the upper right, but not exactly like the upper left button.
This is the same as every other transformative tool which has come to market. You've got a certain segment of the population who will embrace it and push the boundaries of what is possible with the current tools and those in the future. And you've got the curmudgeons who won't let go of the one tool and one non-cloud hosting solution they so desperately cling to and deny the utility of anything else.
AI makes me sad. When I started my CS degree, I didn't even know what silicon valley was. I was unaware of what the SWE job landscape was like. I went to school in a no-name town.
Computer science was an immensely fun subject to learn. I moved to one of the big cities and was bewildered with how much there was to learn, and loved every second of it. I gradually became good enough to help anyone with almost anything, and spent lots of my free time digging deeper and learning.
I liked CS and programming - but I did not like products built by the companies where I was good enough to be employed. These were just unfortunate annoyances that allowed me to work close enough to what I actually enjoyed, which was just code, and the computer.
Before LLMs, those like me could find a place within most companies - the person you don't go to for fast features, but for weird bugs or other things that the more product-minded people weren't interested in. There was still, however, an uncomfortable tension. And now that tension is even greater. I do not use an LLM to write all my code, because I enjoy doing things myself. If I do not have that joy, then it will be immensely difficult for me to continue the career I have already invested so much time in. If I could go back in time and choose another field I would - but since that's not possible, I don't understand why it's so hard for people to have empathy for people like me. I would never have gone down this path if I knew that one day, my hard-earned-knowledge would become so much less valuable, and I'd be forced to delegate the only part of the job I enjoyed to the computer itself.
So Thomas, maybe your AI skeptic friends aren't nuts, they just have different priorities. I realize that my priorities are at odds for the companies I work for. I am just tightly gripping the last days that I can get by doing this job the way that I enjoy doing it.
I recommend reframing this.
LLMs don't make your hard-earned-knowledge less valuable: they make it more valuable.
You are better qualified to use them to build great software than people who don't have your level of experience and software engineering domain expertise.
If you don't want to do that then I guess you can find another career - but if you switch careers because you incorrectly think that LLMs make programming experience less valuable you would be making a big mistake in my opinion.
I agree with your assessment of the value of the skills, at least for the time being. What I dislike is the way that we are being encouraged to work now. I simply do not find any joy, at all, in reviewing LLM-written code and then typing in the next prompt.
A sentiment I see often is that it's work, it's not supposed to be fun, and you work at the pleasure of the employer. And I accept that. But I still am really just crushingly sad that this is what my job is becoming.
In the article, Thomas wrote:
> LLMs can write a large fraction of all the tedious code you’ll ever need to write. And most code on most projects is tedious. LLMs drastically reduce the number of things you’ll ever need to Google. They look things up themselves. Most importantly, they don’t get tired; they’re immune to inertia.
I see this as a massive downside, because I loved writing tedious code. I loved reading docs on something I previously didn't understand. I loved forming the mental models strong enough to say "yeah I see why that's there" in the previously-inscrutable APIs of the frameworks and such that I was using. It was precisely the _way_ that I approached that work that allowed for that knowledge to accrue. It was because I almost never just copy/pasted something without spending a lot of time to understand it.
I do some of the same with ChatGPT. I type the code in myself after trying to internalize the ChatGPT response. But even that is starting to feel like company time-theft, as the attitude is shifting even further away from "knowing how to do things is good" toward "getting shit done is all that matters."
> You are better qualified to use them to build great software than people who don't have your level of experience and software engineering domain expertise
Since a big majority of companies stopped hiring juniors, where is the new blood coming from when the current seniors inevitably retire?
I think the important thing here is you're being honest about how you're feeling. You bring up a very real anxiety and possibility and even folks who are using LLMs probably feel some degree of this alienation. That LLMs are yet another tool to push us to move as fast as possible rather than have our brains get into the nooks and crannies of hard problems that may take longer but are more rewarding to us.
But again, you're being honest. The problem with a lot of the AI skeptic arguments I see is a lack of this honesty. Others have noted that there are a lot of contradictory skeptical arguments, and I suspect the contradictions come because the authors have negative emotions about AI which they're using to create negative arguments.
I do fall into this category of people that are seen as heavily abusing copium. I can admit that when I do get unsatisfactory results from a prompt session, a lot of it has to do with the mental friction I feel at the idea of letting something else write my code.
It again comes back to my opinion that LLMs have recreated the job in such a way that it emphasizes what I disliked most and de-emphasizes what I liked. It emphasizes "the goal" and de-emphasizes "the process". We had a period in the '10s where the process (namely, becoming adept at using and learning an ever-changing set of open source tools) was a bit more celebrated. You could justify a lunch-and-learn on things like man pages, commit isolation levels, or package managers - and doing something like that would be seen in a positive light. And now, why would you waste everyone's time talking about something that ChatGPT can figure out for you?
Anyway, thanks for your time in your response.
At first I didn’t think LLMs were that impressive, but ever since I started using them, I’ve pretty much stopped writing for loops by hand. It’s not that I forgot how. I just don’t feel like doing it anymore. The time it saves me on boilerplate lets me focus on other things. Sometimes the code isn’t perfect and I need to fix it, but it’s still way easier than starting from scratch.
99% of the code I write today does not use for loops. I don't need an AI for that, just C# or any map function.
> Extraordinarily talented people are doing work that LLMs already do better, out of spite.
So what, people should just stop doing any tasks that LLMs do subjectively better?
I don't know the full answer to this question, but I have a partial answer: they should at least stop doing tedious tasks that LLMs do better.
Unrelated to your friends, but a big part of learning is to do tedious tasks. Maybe once you master a topic LLMs can be better, but for many folks out there, using LLMs as a shortcut can impede learning.
Some of us thrive in tedium, and also do it better than bots.
"Reading other people’s code is part of the job. If you can’t metabolize the boring, repetitive code an LLM generates: skills issue! How are you handling the chaos human developers turn out on a deadline?" Good point! Treat AI generated code as if somebody else had written it. It will need the same review, testing and refactoring as that.
This blog post makes me actively want to use something besides fly. Putting aside the massive condescension in the article…
… most of the code you write is tedious boilerplate? What do you do all day? I can honestly say the most tedious code I write is migrations, because I have to edit three files. I do this like once a week. It would take way longer to explain to an ai what to do than just do it myself.
… you run the AI on your procrastination and as a result do heads-down work? Seriously? First, programming with AI is like playing a slot machine. It keeps you hooked, just on the verge of winning the jackpot, for far longer than the tedious task would have distracted you. And let's take a good agentic case - it runs for 30 minutes before it's done. That's not even enough time to "boot up" a serious problem, and you're already having to put it aside to go check the work you yourself are saying is mediocre? And you're advocating to run multiple of these? Even the most ardent vibe coders I know talk about babysitting the agents. How is this unlocking deep work time in any way, shape, or form?
… craft is relegated to hobbies? I guarantee you, this person loves Linear. It was probably written on a MacBook. This whole "we solve real problems" BS is just a cop-out because being great is hard.
… mediocre code is good code? All code is a liability. If you generate 10 times the code, and are admitting it’s the lowest bar of quality possible, it is at a minimum 10x the liability on your shoulders.
… letting LLMs refactor your tests? Serious question: who actually does this? Because I would LOVE to have LLMs write tests for me. For TWO YEARS, every. single. time. I have needed to write something that needs a lot of tests, I have tried asking an LLM for them, and not once have I used a single test it generated. It generates worse tests than it does code, and the last thing I want is 42 asserts in 8 tests across 600 LoC that nobody understands, poorly testing what should have been 2 asserts in 2 tests in 12 lines of code.
I think he's pretty much right, but it just leaves me managing an AI and hopefully having fewer offshore developers and faster turnaround. So something less than a 50% boost, and kind of variable, but still significant. I don't really think it's world shattering because most software is deeply constrained by business processes (deciding what to build, etc.) that this doesn't help.
It depends on what level of nuts you mean. Some are AGI skeptics about LLMs, and they're probably right; there are likely more breakthroughs required before true AGI. But AGI isn't required to completely disrupt a ton of good, well-paid professions. That is the more worrying scenario. AI is already widening the wealth gap irreparably, and with more progress it will only continue.
> but it’s bad at rust
Is it really? I scraped all HN comments with the words "Rust" and "LLM" from the last 6 months, put it into Gemini and asked it to surface quotes of people saying they had a good time, versus quotes of the opposite. It was about 60% positive and 40% negative.
Does anyone have recent personal experience with this they can talk about?
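(The scrape itself is easy to reproduce. Here's a rough sketch of the kind of thing described above, using the public Algolia HN search API; the six-month cutoff and the classification prompt are illustrative guesses, not the exact ones used.)

```python
# Rough sketch: pull recent HN comments mentioning both "Rust" and "LLM"
# via the public Algolia HN search API, then bundle them into a prompt for
# a model to classify as positive/negative. The cutoff and prompt wording
# are illustrative, not the exact ones the commenter used.
import time
import requests

SIX_MONTHS_AGO = int(time.time()) - 182 * 24 * 3600

def fetch_comments(query: str, pages: int = 5) -> list[str]:
    comments = []
    for page in range(pages):
        resp = requests.get(
            "https://hn.algolia.com/api/v1/search_by_date",
            params={
                "query": query,
                "tags": "comment",
                "numericFilters": f"created_at_i>{SIX_MONTHS_AGO}",
                "hitsPerPage": 100,
                "page": page,
            },
            timeout=30,
        )
        resp.raise_for_status()
        # comment_text is raw HTML; fine for a quick classification pass
        comments += [h["comment_text"] for h in resp.json()["hits"]
                     if h.get("comment_text")]
    return comments

corpus = [c for c in fetch_comments("Rust LLM") if "Rust" in c and "LLM" in c]
prompt = (
    "For each comment below, say whether the author reports a good or bad "
    "experience using LLMs with Rust, and quote the relevant sentence.\n\n"
    + "\n---\n".join(corpus)
)
# `prompt` would then be pasted into (or sent to) Gemini or another model.
print(f"Collected {len(corpus)} comments; prompt is {len(prompt)} chars.")
```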
This sounds very cool - almost science-fiction in parts. I want to know more details about the specific technology, whether it could handle a 30k C++ codebase, how much work you would need to do to make it learn how to run your tests, analyse their output, interface with all your internal software development services etc.
For writing, I do a back-and-forth about what I'm trying to do with Claude before asking it to write anything. Produces much less "that's not exactly what I was looking for..."
I was quite happy, and felt well rewarded, when I dropped an HTML table formatted with emojis into my prompt and told it to turn it into a Terraform module.
But how do you quality-check this part? I am proofreading all these transformations, and I am not sure this is really faster than recording a Vim macro that does the steps I need.
This article doesn’t address literally THE argument I have against LLM usage: that it will degrade major human skills like reading, writing, and yes, coding
But I’m at least grateful for LLMs making me more aware of how important it is for me to focus on my own skills building. I will never use it because I value having those skills sharp
tptacek, curious question: what agent / stack do you currently use?
I use Codex CLI for casual stuff, because of the ergonomics of just popping open another terminal tab.
I use Zed as my primary interface to "actually doing project work" LLM stuff, because it front-ends both OpenAI and Google/Gemini models, and because I really like the interface. I still write code in Emacs; Zed is kind of like the Github PR viewer for me.
I'm just starting to use Codex Web for asynchronous agents because I have a friend who swears by queueing up a dozen async prompts every morning and sifting through them in the afternoon. The idea of just brainstorming a bunch of shit --- I can imagine keeping focus and motivation going long enough to just rattle ideas off! --- and then making coffee while it all gets tried, is super appealing to me.
I'm curious how much you paid in the past month for API fees generated by these tools. Or at least what order of magnitude we're talking about.
> I use Zed as my primary interface to "actually doing project work" LLM stuff, because it front-ends both OpenAI and Google/Gemini models, and because I really like the interface. I still write code in Emacs; Zed is kind of like the Github PR viewer for me.
You're not concerned about OpenAI or Google stealing your code? I won't use VSCode for that reason, personally, but I do use VSCodium.
> I'm just starting to use Codex Web for asynchronous agents because I have a friend who swears by queueing up a dozen async prompts every morning and sifting through them in the afternoon
Bunch of async prompts for the same task? Or are you parallelizing solving different issues and just reviewing in the afternoon?
Sounds intriguing either way.
Thank you!
Seconded. I am very much still in the mode of copying from the chat window and then editing. I would like to have whatever she is having.
Try getting your fav LLM to do something useful with PowerCLI. That's the VMware PowerShell thing. It is quite popular, but not popular enough, so most prompts will end up with bollocks masquerading as code.
When an LLM has been tamed to the point that I can define and request a PowerCLI programme, I can be fairly sure that AI is a solved thing.
I'll be a happy and ignorant nut as long as these models are trained on publicly scraped code without their creators' consent, and the models fail to cite the licenses and sources of the parts the answer is composed from.
Oh, GH Copilot was adding citation capabilities that were going to be in production around this time? Any progress on that front?
Thanks.
> We’re not East Coast dockworkers; we won’t stop progress on our own.
we could choose to be
of course if you're a temporarily embarrassed billionaire like ptacek, you certainly don't want the workers doing this
Yeah, this is where I went from being on the fence to just checking out entirely. Even if it did mean that the top earners make slightly less (which I'm dubious about) and I happened to be one of them (which I'm not confident in, but I could comfortably survive with a lower salary than I've actually received most of the time), I'd still be in favor of tech workers unionizing because the benefits outweigh the costs both personally and for the larger group.
I've been hired to a fully distributed team only to have the company decide to force everyone to "return" to the office within a couple years despite the team never having been in any office in the first place; I've had managers promise me raises that don't appear, only for me to find out later that they never actually talked to their superiors about it like they claimed; I've seen teammates get hired and laid off six months later the same week that we have someone else we just hired join the team. None of this stuff is reasonable, and for some reason we collectively just throw up our hands and say there's nothing we can do, and then apparently some of us scoff and look down at lower-paid workers banding together to try to get a semblance of the advantages we take for granted in our industry because they're "stopping progress"? Even if I did accept the premise that workers collectively organizing to win rights affected "progress", there's no reason to assume there's no middle ground between the extremes of "no progress" and "full-throttle progress without regard for consequences". It's farcical.
It’s the class interests that make it very obvious who is trying to lie to you about AI.
The Luddites were of course correct, like most labour movements.
Kind of saying the quiet part loud here.
Why not?
This is pretty terrible for the fly brand TBH. Yeesh, politics in a company blog...
Adding fuel to the fire. I'm extremely disappointed to see such an inflammatory article on fly.io. I wouldn't want this on my personal blog, let alone on my employers, but I guess they're fine with it.
I've been using Zed and Claude Sonnet 4 (and sometimes trying Opus) heavily over the past weeks. For small edits where I have lots of unit tests, the results were great. So great that they worry me with regards to job security. For exploring a new programming domain it was also somewhat useful. I work a lot with the Typescript compiler API right now, and it has almost no documentation. Since the AI can see into every GitHub repository out there, it's much better, and more efficient, at learning APIs based on code from other folks. On the other hand it means I don't do that job, and I am forced to rely 100% on how the AI presents the Typescript compiler API to me. Are there better methods I could use? Who knows.
Where it's abysmal is code architecture. Sometimes it's almost comical: it adds an if statement to handle one highly specific edge case in a program that only makes sense if it solves the general case. This didn't happen often, though.
The hardest part was to force it to reuse existing code from the same file. My use case is transforming a Typescript AST into a GraphQL AST. The code is one big switch statement with lots of recursive calls. The AI would often add 300 lines of code that duplicate some logic which already exists somewhere else.
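To give a sense of the shape of that code, it's roughly this pattern (the names here are illustrative, not my actual code):

```typescript
import * as ts from "typescript";

// Minimal stand-in for a GraphQL AST node (illustrative only).
type GqlNode =
  | { kind: "ObjectType"; name: string; fields: GqlNode[] }
  | { kind: "Field"; name: string; type: GqlNode }
  | { kind: "NamedType"; name: string };

// One big dispatch over TypeScript syntax kinds, recursing into children.
function toGraphQL(node: ts.Node): GqlNode {
  switch (node.kind) {
    case ts.SyntaxKind.InterfaceDeclaration: {
      const decl = node as ts.InterfaceDeclaration;
      return {
        kind: "ObjectType",
        name: decl.name.text,
        fields: decl.members.map(toGraphQL),
      };
    }
    case ts.SyntaxKind.PropertySignature: {
      const prop = node as ts.PropertySignature;
      return {
        kind: "Field",
        name: (prop.name as ts.Identifier).text,
        type: prop.type ? toGraphQL(prop.type) : { kind: "NamedType", name: "String" },
      };
    }
    case ts.SyntaxKind.StringKeyword:
      return { kind: "NamedType", name: "String" };
    // ...dozens more cases, many of them recursive...
    default:
      return { kind: "NamedType", name: "Unknown" };
  }
}
```

Multiply that by every syntax kind you care about and you can see why a duplicated 300-line chunk is easy for the model to slip in without noticing the existing helper.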
In the end I rewrote the whole thing from scratch. At around 900 lines of code the AI was starting to really struggle. When I wanted to take over, I realized that I didn't have the in-depth knowledge to do so. And trying to understand the code the AI had written proved futile.
Ultimately that's on me, I should have been more diligent reviewing the dozens of 300 line of code changes the AI throws at me over the course of a day. But I wasn't, because reviewing is really, really hard. For many reasons. And AI makes it even harder.
Am I therefore nuts? I find this whole article extremely one-sided. Surely, based on the sheer amount of both positive and negative press, the answer is somewhere in the middle.
Having the same exact experience as you.
Ask me again in 15 years. Assuming the world hasn’t already entered a war for the remaining resources on this planet.
>Important caveat: I’m discussing only the implications of LLMs for software development. For art, music, and writing? I got nothing. I’m inclined to believe the skeptics in those fields.
What a strange disclaimer. Image generation was a career-killer feature long before vibe coding was halfway useful.
Can someone explain to me what this means?
> People coding with LLMs today use agents. Agents get to poke around your codebase on their own. They author files directly. They run tools. They compile code, run tests, and iterate on the results. ...
Is this what people are really doing? Who is just turning AI loose to modify things as it sees fit? If I'm not directing the work, how does it even know what to do?
I've been subjected to forced LLM integration from management, and there are no "Agents" anywhere that I've seen.
Is anyone here doing this that can explain it?
I cut several paragraphs from this explaining how agents work, which I wrote anticipating this exact comment. I'm very happy to have brought you to this moment of understanding --- it's a big one. The answer is "yes, that's exactly what people are doing": "turning LLMs loose" (really, giving them some fixed number of tool calls, some of which might require human approval) to do stuff on real systems. This is exactly what Cursor is about.
I think it's really hard to overstate how important agents are.
We have an intuition for LLMs as a function blob -> blob (really, token -> token, but whatever), and the limitations of such a function, ping-ponging around in its own state space, like a billion monkeys writing plays.
But you can also go blob -> json, and json -> tool-call -> blob. The json->tool interaction isn't stochastic; it's simple systems code (the LLM could indeed screw up the JSON, since that process is stochastic --- but it doesn't matter, because the agent isn't stochastic and won't accept it, and the LLM will just do it over). The json->tool-call->blob process is entirely fixed system code --- and simple code, at that.
Doing this grounds the code generation process. It has a directed stochastic structure, and a closed loop.
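To make that concrete, here's the kind of loop an agent runs -- a from-scratch sketch, not Cursor's or anyone else's actual implementation; the tool names and the `callModel` hook are made up:

```typescript
// Hypothetical agent loop: the model proposes JSON tool calls; plain,
// deterministic code validates and executes them.
import { execSync } from "node:child_process";
import { readFileSync, writeFileSync } from "node:fs";

type ToolCall =
  | { tool: "read_file"; path: string }
  | { tool: "write_file"; path: string; contents: string }
  | { tool: "run_tests" }
  | { tool: "done"; summary: string };

async function runAgent(
  task: string,
  callModel: (history: string[]) => Promise<string>, // your LLM call goes here (assumed, not a real SDK)
  maxSteps = 20
): Promise<string> {
  const history = [task];
  for (let step = 0; step < maxSteps; step++) {
    const reply = await callModel(history); // the stochastic part

    let call: ToolCall;
    try {
      call = JSON.parse(reply);
    } catch {
      history.push("That was not valid JSON. Try again."); // the loop just rejects bad output
      continue;
    }

    // From here down it's fixed, boring systems code.
    switch (call.tool) {
      case "read_file":
        history.push(readFileSync(call.path, "utf8"));
        break;
      case "write_file":
        writeFileSync(call.path, call.contents);
        history.push(`wrote ${call.path}`);
        break;
      case "run_tests":
        try {
          history.push(execSync("npm test", { encoding: "utf8" }));
        } catch (err: any) {
          history.push(String(err.stdout ?? err)); // feed failures back so the model can iterate
        }
        break;
      case "done":
        return call.summary;
    }
  }
  return "ran out of steps";
}
```

That's the whole trick: the model's only output is a proposal, and the loop decides what actually touches disk.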
I'm sorry but this doesn't explain anything. Whatever it is you have in your mind, I'm afraid it's not coming across on the page. There is zero chance that I'm going to let an AI start running arbitrary commands on my PC, let alone anything that resembles a commit.
What is an actual, real world example?
4 replies →
> Is this what people are really doing?
Some people are, and some people are not. This is where some of the disconnect is coming from.
> Who is just turning AI loose to modify things as it sees fit?
With the advent of source control, why not? If it does something egregiously wrong, you can throw it away easily and get back to a previous state with ease.
> If I'm not directing the work, how does it even know what to do?
You're directing the work, but at a higher level of abstraction.
> You're directing the work, but at a higher level of abstraction.
The article likens this to a Makefile. I gotta say, why not just use a Makefile and save the CO2?
1 reply →
This is how I work:
I use Cursor by asking it exactly what I want and how I want it. By default, Cursor has access to the files I open, and it can reference other files using grep or by running specific commands. It can edit files.
It performs well in a fairly large codebase, mainly because I don’t let it write everything. I carefully designed the architecture and chose the patterns I wanted to follow. I also wrote a significant portion of the initial codebase myself and created detailed style guides for my teammates.
As a result, Cursor (or rather the models you select, since Cursor is just a router for commercial models) handles small, focused tasks quite well. I also review every piece of code it generates. It's particularly good at writing tests, which saves me time.
Zed has a great four minute demo showing how it works: https://zed.dev/agentic
I personally have my Zed set up so the agent has to request every command be manually reviewed and approved before running.
I run Cursor in a mode that starts up shell processes, runs linters, tests etc on its own, updates multiple files, runs the linter and tests again, fixes failures, and so on. It auto stops at 20 iterations through the feedback loop.
Depending on the task it works really well.
This example seems to keep coming up. Why do you need an AI to run linters? I have found that linters actually add very little value for an experienced programmer, and actually get in the way when I am in the middle of active development. I have to say I'm having a hard time visualizing the amazing revolution the author alludes to.
2 replies →
I let an agent upgrade some old C code that wouldn't compile and had hundreds of warnings. It was running builds on its own, looking at new errors, etc. It even wrote some tests! I could've done this myself, but it was a hobby project and tedious work. I was impressed.
You're giving it instructions, but it's running a while loop with a list of tools, and it can poke around in your codebase until it thinks it's done whatever you asked for.
See Claude Code, Windsurf, Amp, Kilo Code, Roo, etc.
I might describe a change I need to have made and then it does it and then I might say "Now the tests are failing. Can you fix them?" and so on.
Sometimes it works great; sometimes you find yourself arguing with the computer.
This is also where I am, and I guess it has been a source of mild and growing consternation since I first blagged an OpenAI GPT account when they were private, in an attempt to get ahead of what was coming -- both the positive and negative sides of the advances. Most people either ignored the advances, or quickly identified and connected to the negative side, and effectively filtered out the rest.
As somebody who comes from a politically left family, and was also around in the early days of the Web, let me tentatively note that this issue has a particular political slant, too. The left has strong roots in being able to effectively critique new developments, economic and social, that don't come from its own engines of innovation which have revolved around those critiques, plus solidarity, organization, and sociopolitical action.
The movement's theorists work far more slowly on how to integrate the effect of those changes into its vision. That means when something like this comes along, the left's cultural norms err on the side of critique. Which is fine, but it makes any other expression both hard to convey and instantly suspect in those communities. I saw this in the early Web, where, from a small group of early adopters of all political slants, it was the independents, heterodox leftists, and the right -- and most vocally, the libertarians -- who were able to most quickly adapt to and adopt the new technology. Academic leftists, and those who were inspired by them, took a lot longer to accommodate the Net into their theses (beyond disregarding or rejecting it) and even longer to devise practical uses for it.
It wasn't that long, I should say -- a matter of months or years, and any latent objections were quickly swamped by younger voices who were familiar with the power of the Net; but from my point of view it seriously set back that movement in practicality and popularity during the 80s and 90s.
I see the same with AI: the left has attracted a large generation of support across the world by providing an emotionally resonant and practical alternative to the status quo many people face. But you quickly lose the mandate of heaven if you fail to do more than simplistically critique or reject a thing that the average person in the world feels they know better, or feels differently toward, than you do. This is something to consider, even if you still strongly believe yourselves to be correct in the critiques.
Hey, thanks for this. It helps with some stuff I've been thinking about. I appreciate it a lot, gonna do more thinking.
That's funny! I mulled it a bit more after reading your piece, and I think some of the comments on lobsters were considering the whole issue in a wider political framing. Thanks for replying!
I never used AI, microservices, blockchain, Mongo, and used only 2 JS frameworks in my career.
What is the current meta on a fully agentic workflow as he describes it? Wake up, write a vague prompt, go for coffee, and get a PR to review on your phone? I'm genuinely interested to see what going all in on this looks like.
I've yet to see an LLM produce anything but disastrous bugs. Remember they don't know they're writing code that can destroy companies and livelihoods. They are just spitting out plausible but always wrong text.
> “hallucination” is the first thing developers bring up when someone suggests using LLMs, despite it being (more or less) a solved problem.
I don't get it. It keeps making stuff up, or gives me wrong code that merely compiles, even when I tell it the signature doesn't exist.
I tried Copilot many times and found it lacking. I like discussing concepts (maybe with pseudocode) with Gemini. I find this useful but not game changing.
So genuine question for AI Boosters, what should I be using?
I would like the 10x productivity gains we were promised :)
Someone at work said something to the effect of "Oh, this would be great for an intern but we don't have one this year" to which I responded "sure you do, its name is Copilot".
Every time I read one of these it feels like I'm reading an AI generated sales pitch for AI.
This is a very well-written article. I love the style: aggressive, youthful, humorous, informed, experienced.
I think it could do with being a little bit more friendly, but… it's got its own charm.
It also convinced me I should be using agents rather than raw-dogging Gemini 2.5 Pro.
> You’ll spend 5-10 minutes knocking it back into your own style
You lost me here. I have often found it to be far more than a 10-minute style issue -- it's a fundamental misunderstanding of the code's purpose that I need to fix.
One thing that really bothered me that the author glossed over (perhaps they don't care, given the tone of the article) is where they said:
> Does an intern cost $20/month? Because that’s what Cursor.ai costs.
> Part of being a senior developer is making less-able coders productive, be they fleshly or algebraic.
But do you know what another part of being a senior developer is? Not just making them more productive, but also guiding the junior developers into becoming better, independent, self-tasking, senior coders. And that feedback loop doesn't exist here.
We're robbing ourselves of good future developers, because we aren't even thinking about the fact that the junior devs are actively learning from the small tasks we give them.
Will AI completely replace devs before we all retire? Maybe. Maybe not.
But long before that, the future coders who aren't being hired and trained -- because a senior dev would rather pay $20/month for an LLM than remember that junior devs become senior devs (and that's an important pipeline) -- are going to become a major loss and brain drain domestically.
I think what is going to happen is that junior devs will develop a strong reliance on AI tools to be able to do anything. I cynically think this was OpenAI’s aim when they made ChatGPT free for students.
I had a rather depressing experience this semester in my office hours with two students who had painted themselves into a corner with code that was clearly generated. They came to me for help, but were incapable of explaining why they had written what was on their screens. I decided to find where they had lost the thread of the class and discovered that they were essentially unable to write a hello-world program. In other words, they lost the thread on day one. Up until this point, both students had nearly perfect homework grades while failing every in-class quiz.
From one perspective I understand the business case for pushing these technologies. But from another perspective, the long term health of the profession, it’s pretty shortsighted. Who knows, in the end maybe this will kill off the group of students who enroll in CS courses “because mom and dad think it’s a good job,” and maybe that will leave me with the group that really wants to be there. In the meantime, I will remind students that there is a difference between programming and computer science and that you really need a strong grasp of the latter to be an effective coder. Especially if you use AI tools.
> Who knows, in the end maybe this will kill off the group of students who enroll in CS courses “because mom and dad think it’s a good job,”
I see this so much. “Data science major” became the 2020s version of law school. It’s such a double edged sword. It’s led to a huge increase in enrollment and the creation of multiple professional masters programs, so the college loves us. We hire every year and there’s always money for just about anything. On the other hand, class sizes are huge, which is not fun, and worse a large fraction of the students appear to have minimal intrinsic interest in coding or analyzing data. They’re there because it’s where the jobs are. I totally get that, in some sense college has always been that way, but it does make me look back fondly on the days when classes were 1/4 as big and filled with people who were genuinely interested in the subject.
Unfortunately I think I may get my wish. AI is going to eliminate a lot of those jobs and so the future of our field looks a bit bleak. Worse, it’s the very students who are going to become redundant the quickest that are the least willing to learn. I’d be happy to teach them basic analysis and coding skills, but they are dead set on punching everything into ChatGPT.
> I cynically think this was OpenAI’s aim when they made ChatGPT free for students
Is there any interpretation that makes sense _other_ than this?
2 replies →
> Up until this point, both students had nearly perfect homework grades while failing every in-class quiz.
This is nothing new. In a computer graphics class I took over 20 years ago, the median score on the assignments before the midterm was >100% (thanks to bonus questions), yet in midterm prep other students in the class were demonstrating that they didn't even have a firm grasp on the basic concept of a matrix.
That is: they were in a 4th year undergrad course, while doubting material from a senior year high school course where they had to have gotten high marks in order to get into the program.
And the midterm grading was heavily curved as a result (though not as much as in some other courses I took).
Students will do what they need to do for the grade. It seems a great many of them have internalized that none of this is about actually learning anything, even if they would never say so aloud. (I learned things - where I didn't already know them - because it was actually interesting. My resulting grades were pretty good overall, but certainly not top of class.)
> Who knows, in the end maybe this will kill off the group of students who enroll in CS courses “because mom and dad think it’s a good job,”
Why would it? It's becoming easier than ever to fake understanding, and to choose anything else they would need both the opportunity and social permission. I only see the problem getting worse.
> Up until this point, both students had nearly perfect homework grades while failing every in-class quiz.
From a student's perspective: I think it was the same with Stack Overflow. While LLMs make copy-and-paste even easier, they also have the upside of lowering the bar on more complex topics/projects. Nowadays, the average person doesn't touch assembly, but we still had a course where we used it and learned its principles. Software engineering courses will follow suit.
4 replies →
Hard capitalism doesn't care about long-term perspectives; the only yardstick is current performance and stock maximization. Otherwise the US would be a bastion of stellar public education, for example -- an investment in the long-term future of the whole nation -- instead of the few richest sending their kids to private schools to stay above the rest.
So while I fully agree with you, this is not a concern for a single decision maker in the private-company world. And the state, at least in the US, doesn't pick up this work either, quietly agreeing with the situation.
Well, think for a second about who sets similar budgets and long-term spending priorities: rich lawyers who chose to become much richer politicians, rarely anybody else, and almost never anyone from a more moral profession.
It's a bit misleading to compare $20/month with an actual human person. The junior dev won't get halfway through the day and tell you they've used up all their coding time for the month and will now respond with gibberish.
Cursor is a heck of a lot more than $20/month if you actually want it working for a full work day, every day.
> The junior dev won't get halfway through the day and tell you they've used up all their coding time for the month and will now respond with gibberish.
This issue manifests a bit differently in people, but I've definitely worked with people (not only juniors) who only have a few productive hours a month in them. And for what it's worth, some of those people were sufficiently productive in those few hours that it was rational for the company to keep them.
I worked on a team where a new hire from a prestigious school told his manager "That work is boring and mundane and I'm not going to do it."
He didn't last long.
2 replies →
My coworkers are burning $10k/month on Cursor.
1 reply →
Maybe - but it will still probably be less than a junior dev.
You could probably hammer the most expensive Cursor API all day, every day, and it would still be a fraction of the cost of a junior dev.
Further, Cursor might cost $20/month today, but to what degree is that subsidized by VC investment? All the information we have points to frontier models just not being profitable to run at those types of prices, and those investors are going to want a return at some point.
The market will indeed balance this out. Remember when a taxi was $20 and an Uber $5? Now an Uber is $25. Nobody is going to go back to humans with all their wet-meat-sack problems -- we will get more value for it -- but it ain't gonna stay $5 if those putting up all this capital have anything to do with it. Then again, we might get cheap, self-hostable local copies (unless they're made illegal for "safety" or some bullshit).
4 replies →
I dunno, with the advances in open source models I could see in a few years having AI workstations that cost $20,000 with 1TB of VRAM so you don’t have to rely on OpenAI or Cursor. The RTX 6000 Pro is only $7500 and has 96GB of VRAM.
This is something that's been simmering in the back of my mind for a while. Using an AI Agent instead of talking to your human colleagues deprives both of you from learning opportunities. There are probably short term gains in many cases, but I fear there will be long term losses over time.
I agree, and think that organizations that figure out how to use AI well in a collaborative way will succeed in the long-term. Developer community is still where the important growth happens.
Is it possible to talk to coworkers? What if you voted for the wrong party? Are closeted gay/trans/queer? A radical feminist?! Or a dog attack survivor, and they really _REALLY_ like dogs!
Talking to colleagues at work is a chore, and a huge risk! Not an opportunity! At least AI respects my privacy, and will not get me fired!
1 reply →
> But do you know what another part of being a senior developer is? Not just making them more productive, but also guiding the junior developers into becoming better, independent, self-tasking, senior coders. And that feedback loop doesn't exist here.
Almost every senior developer I know is spending that time making LLMs more productive and useful instead.
Whatever you think the job is of the senior developer, it will not be "coding".
I think people need to stop thinking of themselves as computer programmers and start thinking of themselves as _engineers_. Your job isn't writing programs, your job is _using the technology you have available to solve problems_. Maybe that is through writing code, but maybe it's orchestrating LLMs to write code for you. The important part is solving the problem.
> Almost every senior developer I know is spending that time making LLMs more productive and useful instead.
LLMs may become more productive/accurate/useful, but they're not self-tasking or independent.
> I think people need to stop thinking of themselves as computer programmers and start thinking of themselves as _engineers_. Your job isn't writing programs, your job is _using the technology you have available to solve problems_.
There is a progression of skill required to master any profession, starting with fundamentals, and progressing and developing until you are an expert/ senior at that profession. How is a senior sw dev supposed to become that without writing code? Just reading LLM code and bugfixing isn't the same level or kind of experience. You're going to have devs who can't code by themselves, and that's a bad place to be in.
There are already too many people in IT using tools that they don't understand the workings of (and thus can't troubleshoot, can't replace, can't customize to their env, etc), and this will just exacerbate that x100.
MMW there is going to be a very bad skill deficit in IT in 20 years, which is going to cause an innovation deficit.
How many senior developers understand the minute, intimate details of the frameworks, libraries, languages they use? How many understand the databases they use? TFA says, many (but not all) don't have to care as long as the product ships. That's exactly how code written by LLMs is meant to be tested and evaluated. And if you set up a good enough build/test environment, TFA argues that you can automate most of the schelp away.
In my experience, senior engineers without the curiosity to occasionally dig deeper into their frameworks are significantly worse engineers. No framework, library, or language is perfect. A big part of your job is understanding how your code will execute. Sometimes that even requires, at a very high level, imagining how that code will compile down to assembly once you strip away all the abstractions.
Eventually you will get a memory leak even in a GCd language. Eventually there will be some incredibly obscure, unreported bug in a library. Eventually you will find an issue in unmaintained code you depend on. Eventually there will be performance problems caused by too many layers of abstraction.
You either need to know, roughly, how your dependencies work by occasionally digging into their code/reading the documentation. Or you need intuition to know how it probably works, but you usually build that intuition by actually writing/reading code.
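For concreteness, the GC'd-language "leak" usually looks something like this -- a made-up sketch, not from any real codebase:

```typescript
// A memory leak in a GC'd language: the GC can't free what you still reference.
// A long-lived cache that nothing ever evicts grows forever under real traffic.
const responseCache = new Map<string, string>();

export async function handleRequest(url: string): Promise<string> {
  const cached = responseCache.get(url);
  if (cached !== undefined) return cached;

  const body = await fetch(url).then((r) => r.text());
  responseCache.set(url, body); // every unique URL stays referenced for the process lifetime
  return body;
}

// One possible fix: an eviction policy, e.g. drop the oldest entry past a size cap.
const MAX_ENTRIES = 10_000;
export function setWithCap(key: string, value: string): void {
  if (responseCache.size >= MAX_ENTRIES) {
    const oldest = responseCache.keys().next().value; // Maps iterate in insertion order
    if (oldest !== undefined) responseCache.delete(oldest);
  }
  responseCache.set(key, value);
}
```

The GC is doing its job; it just can't collect what the Map still references. Spotting that takes exactly the kind of curiosity about what's underneath that we're talking about.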
3 replies →
Well, I hope that happens. You go apply for a job and 10k people have already applied. What kind of job market is this?
>We're robbing ourselves of good future developers
You call it robbing ourselves of good future developers, I call it hourly consultancy rate increase.
I imagine it like this. Juniors will be taught by LLMs on some things, but seniors will still be there, they will still assist, pair program, code review, etc. but they will have another party, the LLM, like a smarter calculator.
Eric S. Raymond is now doing 100% of his coding with AI assistance, and claims he is much faster and more productive.
I think we've passed the threshold beyond which not using AI is just doing things the hard way for no appreciable gain.
> Some of the smartest people I know share a bone-deep belief that AI is a fad — the next iteration of NFT mania.
> Meanwhile, software developers spot code fragments seemingly lifted from public repositories on Github and lose their shit. What about the licensing? If you’re a lawyer, I defer. But if you’re a software developer playing this card? Cut me a little slack as I ask you to shove this concern up your ass. No profession has demonstrated more contempt for intellectual property.
> The median dev thinks Star Wars and Daft Punk are a public commons. The great cultural project of developers has been opposing any protection that might inconvenience a monetizable media-sharing site. When they fail at policy, they route around it with coercion. They stand up global-scale piracy networks and sneer at anybody who so much as tries to preserve a new-release window for a TV show.
So it starts with a humble strawman, and then the author illustrates how dumb they think their fellow developers really are if they make different decisions based on different values.
Most developers I interact with who are what you might call 'AI skeptic', are most concerned with how this tech will be used, especially since its creation is rooted in mass scale disregard for ownership of anything.
> "The great cultural project of developers has been opposing any protection that might inconvenience a monetizable media-sharing site.... They stand up global-scale piracy networks",
IMO the author is projecting or something here, because I've literally never met someone with this opinion, and I've been in tech/developer circles for over 20 years. Personal use of pirated content is very different from commercial distribution or making money off sharing sites. Not everyone's values are so rooted in making money with complete disregard for the impact of their actions.
I get it: the author wants to keep using LLMs and wants people to stop trying to make them feel bad, so they make a case for how the skeptics' arguments are just dumb. But the author completely missed the 'why' behind the arguments. In the end, LLMs are a tool. Using them or not is up to the individual. But that doesn't give a blanket social license to use them in any way people, or more importantly, companies want.
https://www.nytimes.com/2025/06/02/opinion/ai-coders-jobs.ht...
>People complain about LLM-generated code being “probabilistic”. No it isn’t. It’s code. It’s not Yacc output. It’s knowable
Okay, wtf is this weird jab at Yacc supposed to be? I generally prefer to write my own LR parsers because it's really not that difficult, but I never felt like Yacc output was difficult to read or somehow "probabilistic".
AFAIK most serious compilers don't use yacc either because they need to be able to parse context-dependent grammars, and also reinventing wheels is often worth the effort when the wheel is 50 years old and forces you to make compromises because it also has to support use-cases you are not interested in. But again, it's not bad, it does the thing it is designed to do and it does save a lot of effort if you're fine dealing with its limitations.
The only serious complaint I have about Yacc is that the basic POSIX form has frustrating limitations, such as not being able to have two separate Yacc instances because they'll declare the same symbols, and I'm fairly certain it's not thread-safe either. Bison fixes some of these while still remaining "probabilistic".
Anyways I'd rather use Yacc+Lex (preferably Bison+flex) to generate my parser than have to go read everything your "agent" wrote with a fine-toothed comb to make sure it didn't introduce any crippling memory corruptions or dangerous ROPportunities.
I am in the middle, neither a dogmatic skeptic nor a full-blown prompt engineer, but I lost it when the author compared a junior developer (a human) to a SaaS subscription.
Tells you what you need to know about the AI culture.
I don’t think it’s an uncommon comparison.
If you don't consider the human aspect, an LLM coding agent has a similar use to a junior developer.
Of course, humans have advantages and those advantages have value. But the coding agent costs $20 a month.
Replacing humans by machines that are much cheaper isn't really something new.
What is the agent setup that I should try?
What I tried is indeed "set up Cursor" or "copy-paste ChatGPT", and it was all bad. I have no idea what this "agent" and "MCP" stuff is.
If they're regurgitating what's been learned, is there a risk of copyright/IP issues from whoever owns the code used for training? Last time I checked, there's a whole lotta lawyers in the US who'd like the business.
Most of the big AI vendors have liability policies in place where their lawyers are committed to step in and help you out if someone sues you for copyright infringement based on code their LLMs wrote for you. I think the term they use is "copyright shield".
OpenAI has that as point 14 in https://openai.com/policies/services-agreement/
Anthropic mention theirs here: https://www.anthropic.com/news/expanded-legal-protections-ap...
Here's the Google post: https://cloud.google.com/blog/products/ai-machine-learning/p...
Which AI agent is actually good nowadays? I've tried Cursor, but I didn't want a fork replacing VS Code just to be able to call an LLM. And I'd preferably want pay-as-you-go, not a subscription model.
That was a stellar read. I've had (at least parts of) many of these thoughts floating around my head over the past few weeks / months, but it'd have been beyond my ken to write them down as lucidly.
I think LLMs are great and patient teachers.
But code is something I like to write myself, it's not about speed of producing code. It's keeping it such you still fully understand it.
Seems to me that AI itself will take some jobs but the cheap workable software the AI creates will take even more jobs, especially if the AI can create workable software for automation robots.
Don't be pennywise and pound foolish...
The LLM coding agents might help you fire a few SWEs to save a penny or two, but soon enough they will also help your customers or suppliers replace your business model.
Whilst generative AI is great for writing business letters and essays and all that, going straight for coding -- one of the hardest problems, if not the hardest -- is a bit much.
This is a come back in 5 years when it works sort of thing.
I'm a recent convert. I _was_ a sceptic until I wasn't (https://news.ycombinator.com/item?id=44059847).
This isn't about the post itself, but can you please put the author's name at the top? It's written in first person and we have no idea who's talking until the end of the post.
To anyone else who was going to protest that the name _is_ at the top, apparently it depends on one's browser window width and zoom level. I was a bit surprised to find that increasing the zoom level even affects the order when reading with a screen reader.
Sometimes the AI model generates technical debt. It happens, for instance, when the AI generates code with dependencies that are too old to be automatically updated to the latest version.
Ironically, all the "nuts" friends seem to be talking less and being less defensive than the other "nuts" friends who keep talking about other people not buying what they believe.
"You don’t believe in IPR. Then shut the fuck up about IPR."
As a software engineer, I need to call out this rhetoric. We are not all morally bankrupt. Many of the people with the most ethical backbone I know are also software engineers.
I don't use "AI" for the same reason I don't use amphetamine. I'm sure I could get more stuff done faster if I used it, but I don't, because the manufacturing and the distribution of it is completely unethical.
The blatant stealing of intellectual property is only the tip of the shitberg. Just as bad are the giant ecological footprint[1], the colonialism of the grossly underpaid RLHF labor, and the ceding of even more of our digital lives to trillion-dollar corporations.
We can and must ethically evaluate our work, and that includes our tools. Nobody's perfect, but doing one immoral thing does not mean you get a free pass to skip morality altogether.
1. https://www.greenpeace.de/publikationen/environmental-impact...
This article is ridiculous
Q: but the code that AI generates is incoherent gibberish
Guess what, life’s tough. Fixing incoherent gibberish is part of the job you wuss. Honestly, I bet you write incoherent gibberish. Would you rather be fixing incoherent gibberish that an AI wrote or incoherent gibberish that you wrote?
Q: but I have to re-do everything it touches. Everything takes me ten times as long!
What are you, some kind of vibe-coding YouTuber? Everything taking ten times as long is part of the job, cupcakes. What’s important is that now it takes ten times as long automatically. I like the way Zed does it, where it tells you to leave while it’s changing stuff.
OP opened with an ad hominem argument that his skeptic friends are all nuts.
OP therefore loses the argument.
What's the likelihood that this was 90%+ written by an LLM? It'd be ironic, especially now that there are 2000+ engineers commenting on it.
AI is a tool, akin to a seamstress' sewing machine. Use it well, and with intention, and it'll help speed up your work. That's it.
> but it’s bad at rust
It's not just Rust. It's everything created after the last training snapshot -- anywhere there's a data vacuum. You need a large corpus for an LLM to get decent-ish.
I can’t wait for the day when people no longer manually write text messages to each other, but instead just ask LLMs to read and respond from a few prompted words.
next level: use AI agents in their dating apps on both sides to decide if they want to hook up
Just send AI agents on a simulated date and see how it turns out.
This is one of those fascinating cases where I agree with none of the arguments, but vehemently agree with the conclusion… it ordinarily would give me pause, but in this case I am reminded that nonsense arguments are equally applicable to both sides of the debate. If the arguments actually had a logical connection to the conclusion, and I disliked the arguments but liked the conclusion, that would be real cause for introspection.
Where is the counter argument to this not being sustainable?
> For art, music, and writing? I got nothing. I’m inclined to believe the skeptics in those fields. I just don’t believe them about mine.
I am curious about this part though
I develop space-borne systems, so I can't use the best LLMs for ITAR/etc. reasons, but this article really makes me feel like I'm missing out. This line in particular makes me wonder if my skills are becoming obsolete for general private industry:
> People coding with LLMs today use agents. Agents get to poke around your codebase on their own. They author files directly. They run tools. They compile code, run tests, and iterate on the results. They also:
Every once in a while I see someone on X posting how they have 10 agents running at once building their code base, and I wonder if in 3 years most private industry coders will just be attending meetings to discuss what their agents have been working on, while people working on DoD contracts will be typing things into vim like a fool
> while people working on DoD contracts will be typing things into vim like a fool
Forget LLMs, try getting Pandas approved. Heck I was told by some AF engineers they were banned from opening Chrome Dev Tools by their security office.
FWIW I think the LLM situation is changing quite fast and they're appearing in some of our contracts. Azure-provided ones, of course.
Frankly, as someone who works in fields where LLMs can be used heavily, I would stay in any high-danger/high-precision/high-regulation role.
The speed at which LLM stuff is progressing is insane, what is cutting edge today wasn't available 6 months ago.
Keep up as a side hobby if you wish, I would definitely recommend that, but I just have to imagine that in 2 years a turnkey GitHub project will get you pretty much all the way there.
Idk, that's my feeling fwiw.
I love LLMs but I'm much less confident that people and regulation will keep up with this new world in a way that benefits the very people who created the content that LLMs are built on.
> The speed at which LLM stuff is progressing is insane
You clearly haven't been following the space, or maybe you've been following it too much.
Because the progress has been pretty slow over the last few years.
Yes, models are cheaper and faster, but they aren't substantially better.
1 reply →
The best I can offer skeptics is this: the more you work with the tools, the more productive you become. Because yes, the tools are imperfect.
If you've had a dog you know that "dog training" classes are actually owner training.
Same with AI tools. I see big gains for people who spend time to train themselves to work within the limitations. When the next generation of tools come out they can adapt quickly.
If this sounds tedious, that's because it is tedious. I spent many long weekends wrestling with tools silently wrecking my entire codebase, etc. And that's what I had to do to get the productivity improvements I have now.
Thanks for the article. I've been unwilling to try agentic editing for a while, and this is the kick I need to give it a shot.
> but the code is shitty, like that of a junior developer
> Does an intern cost $20/month? Because that's what Cursor.ai costs.
RIP Junior devs
Very weak arguments and intentionally provocative.
Call me old, but I am not using AI for coding. It's not what I expect or want an AI to do for me. I would love for it to do the stuff I don't like to do. It's like the meme that was floating around: a woman saying she would have loved the AI to do the dishes so she could do art, not the other way around. Yes, I am also a skeptic, but that is not weird as I have seen too many movies about AI taking over the world. I hope I never have to say: "I told you so". I really do see positive points to AI, just not in coding. I would not even trust them to write tests. Might make them write documentation at some point.
I would like to point out: LLMs look really intelligent until you converse with them in a less popular language; then they make weird mistakes and interpret your questions all wrong. I am Dutch and they really suck at that.
Can someone guide me to which AI tools are good, along with their docs and tutorials?
I want to see the hype, but without paying up front. My focus area is C#, .NET, JS.
You’re the goat, Ptacek.
Seriously, this is the most based defense of LLM that I’ve read over the years. All the yapping in the thread only proves your points.
LLM, please find all occurrences of "shut the fuck up" and suggest actual arguments for the author to use.
Otherwise, well written and persuasive.
Soon all coding will look like L3 support - debugging something you've never seen before, and under pressure. AI is really taking away the fun parts from everything and leaving just the drudgery in place.
“What do you mean you want to think about our architecture? Just get the LLM to do it, and we’ll get it to fix it if anything goes wrong”
“No we’re not allocating any time to thinking about the design, just get the LLM to do it”
I’m so excited for the bleak future.
The key is to figure out how to move up the ladder of abstraction. You don't want to be a "coder" in a world where AI can code, but you do want to be a "person who makes software" in a world where making software just got easier.
Most people who chose the profession don’t want that, though. They like the coding, and dislike managing.
5 replies →
People said the same about VB style coding then low-code and now AI.
They have been wrong every time and will continue to be wrong.
This feels different; I asked DeepSeek R1 to give me autoregressive image generation code in PyTorch and it did a marvelous job. Similar for making a PyTorch model for a talking lip-synced face; those two would have taken me weeks to do, and the AI did them in a few minutes.
Autoregressive LLMs still have some major issues like over-dependency on the first few generated tokens and the problems with commutative reasoning due to one-sided masked attention but those issues are slowly getting fixed.
4 replies →
There is no AI yet, it’s just automation and that’s why people hate it. It also allows corporations to consume open source code for profit.
> I think this is going to get clearer over the next year. The cool kid haughtiness about “stochastic parrots” and “vibe coding” can’t survive much more contact with reality. I’m snarking about these people, but I meant what I said: they’re smarter than me. And when they get over this affectation, they’re going to make coding agents profoundly more effective than they are today.
Yes, yes, and yes. SOOOO much yes. We are going to be in for a wild ride once everyone gets on board with the reality of coding agents.
> Does an intern cost $20/month? Because that’s what Cursor.ai costs.
The cost is a pretty obvious indication that it's nowhere near the capability of an engineer. A robotic worker that can produce code 24/7 without tiring would be worth even more than a person of equivalent skill. I don't know what kinds of goofy hiring y'all are doing, but all of the juniors I've been involved in hiring actually worked out great.
I find it interesting that TFA claims that Go is the most LLM-legible language.
I suspect C# ASP.NET Core not so much. Any thoughts?
I don’t use AI and I don’t really care for it, but the language in this article is really bad.
I expected better from Thomas.
I find the claims of 2-5x more productivity insane. I'm at least 10x more productive.
> Meanwhile, software developers spot code fragments seemingly lifted from public repositories on Github and lose their shit. What about the licensing? If you’re a lawyer, I defer. But if you’re a software developer playing this card? Cut me a little slack as I ask you to shove this concern up your ass. No profession has demonstrated more contempt for intellectual property.
I was willing to read until I got to this part. He can shove this drivel up his own ass for all I care if that's how he thinks. What a waste of time.
> but the code is shitty, like that of a junior developer
> Does an intern cost $20/month? Because that's what Cursor.ai costs.
So sick of hearing comparisons between LLMs and juniors. First of all, are we really all seniors? After almost 20 years, sometimes I still think I know nothing. Second, when I was a junior, just me being younger, fresher, and with an eager learner's attitude brought a net positive impact to the team I joined back then -- I still hang out with those people to this day, and we still have fun joining new projects together. I'm so sick of these snobbish "seniors" throwing rocks at younger and well-prepared people.
> Also: let’s stop kidding ourselves about how good our human first cuts really are.
Really enjoyed this article and how it was written, especially the “tookr-jerbs” part.
but I like writing code, rather than reviewing code somebody/thing else wrote
Coming around to this conclusion myself after experimenting with the tools for a few weeks.
Things have changed.
I mean, the root cause of this is quite mundane.
People like to think that intelligence is some intrinsic force, but it's far better thought of as situational. We are contextually smarter or dumber based on the emotions we're trying to process from a situation and how our motivated reasoning allows us to avoid certain negative emotions.
The subset of people the OP is trying to argue against are people who are otherwise normally intelligent but have an emotional need for AI to be unimportant that they're unable to process so they build a social community of terrible arguments.
The nature of the internet is that visibility is terribly correlated (and often anti-correlated) with popularity so OP thinks this is some large contingent of people when it's instead algorithmically served conflict generators.
Writing against that is also an algorithmic conflict generator, as evidenced by the number of upvotes here. The correct response is to realize that people putting up terrible arguments about AI don't actually care about AI, they're using this issue to process unresolved trauma from other, unrelated areas of their life and taking their ideas literally instead of seriously is a waste and diversion. There are plenty of smart people with far more nuanced views that still disagree with you where they're interested in the content of the argument.
Spot on. Current-gen LLMs also have the ability to mimic your code style. I always point it to existing code, and the result is the same as what I would have written.
Yesterday I “wrote” 18k lines of code with Claude & Cursor. Mostly UI, some backend. This would have taken me a week without LLMs.
I’m sold.
I don't get the focus on the coding. In all the projects I worked on for 25 years, coding has never been the bottleneck in projects. Maybe the projects I work on are simple, maybe I was lucky, but still: once the requirements are clear, coding is easy. And it's literally what takes the least time. I don't want to spend less time on coding thanks to AI. Because coding is what I like, not the rest of the job. Bring me AI that will do the meetings for me please.
>Professional software developers are in the business of solving practical problems for people with code. We are not, in our day jobs, artisans. Steve Jobs was wrong: we do not need to carve the unseen feet in the sculpture. Nobody cares if the logic board traces are pleasingly routed. If anything we build endures, it won’t be because the codebase was beautiful.
Maybe you are, but I didn't get a Master's degree to produce quick shit for shareholders
*sighs in aerospace*
Let me know when I can use one of these without violating ITAR/EAR.
You can, if you run the models locally yourself on your own hardware, e.g. Ollama on a Mac Mini.
Basically nails every point I've wanted to make but couldn't, lol. Also, in the end the endless posturing gets tiring and we go back to getting shit done. Either it helps you or it doesn't.
I have a trick to improve code quality from LLMs. I only use the ones where I have control of the system prompt.
I use a prompt that convinces the LLM they are a Staff Engineer at Google and care deeply about best practices and writing clean, quality code.
While not perfect, it does improve the code quality.
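Concretely, it's just a system message sent before the user prompt. A rough sketch with the OpenAI Node SDK -- the persona wording is mine and the model name is a placeholder, not a recommendation:

```typescript
// Sketch: steer code quality via the system prompt.
import OpenAI from "openai";

const client = new OpenAI(); // reads OPENAI_API_KEY from the environment

const SYSTEM_PROMPT = [
  "You are a Staff Engineer at a large tech company.",
  "You care deeply about best practices and writing clean, well-tested code.",
  "Prefer small functions, descriptive names, and explain non-obvious decisions in comments.",
].join(" ");

async function askForCode(task: string): Promise<string> {
  const response = await client.chat.completions.create({
    model: "gpt-4o", // placeholder; use whichever model lets you control the system prompt
    messages: [
      { role: "system", content: SYSTEM_PROMPT },
      { role: "user", content: task },
    ],
  });
  return response.choices[0].message.content ?? "";
}
```

Nothing magic, but in my experience the persona plus a few explicit style constraints noticeably nudges the flavor of what comes back.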
"they’re smarter than me" feels like false humility and an attempt to make the medicine go down better.
1. Thomas is obviously very smart.
2. To be what we think of as "smart" is to be in touch with reality, which includes testing AI systems for yourself and recognizing their incredible power.
It's not false. He's talking about people smarter than him (at writing and shipping infrastructure code).
Thomas is the smartest at other things.
It is false and you're proving it. Smarter means smarter.
Smarter does not mean "better at writing and shipping infrastructure code."
Some of the smartest people I know are also infra engineers and none of them are AI skeptics in 2025.
I feel like, surprisingly, front-end work -- which programmers used to view as "easier" -- is now the more difficult of the two, because it's where LLMs suck the most.
You get a link to a Figma design and you have to use your eyes and common sense to cobble together Tailwind classes, ensure responsiveness and accessibility, try out your components to make sure they're not janky, test on a physical mobile device, align margins, padding, truncation, wrapping, async loading states -- blah blah, you get it.
LLMs still suck at all the stuff that requires a lot of visual feedback; after all, you're making an interface for humans to use, and you're a human.
In contrast, when I'm working on a backend ticket, AI feels so much more straightforward and useful.
Programmers who think front end is "easier" than backend have been wrong for well over a decade. https://simonwillison.net/2012/Feb/13/why-are-front-end/
This article is very well-written. Obviously not written by AI.
My take is: it is OK not to buy into the hype. There's a lot of hype, no denying that.
But if you're actively avoiding everything related to it, you might find yourself in a position where you're suddenly being left in the dust. Maybe not now, not next month, not next year, but at some point in the future. The models really are improving fast!
I've talked with devs that (claim they) haven't touched a model since ChatGPT was released - because it didn't live up to their expectations, and they just concluded it was a big nothingburger.
Even though I don't follow the development religiously anymore, I do try to get acquainted with new releases every 3 months or so.
I hate the term "vibe coding", but I personally know non-tech people that have vibe coded products / apps, shipped them, and make more money in sales than what most "legit" coders are making. These would be the same "idea people" that previously were looking for a coder to do all the heavy lifting. Something is changing, that's for sure.
So, yeah, don't sleepwalk through it.
The counter-argument as I see it is that going from "not using LLM tooling" to "just as competent with LLM tooling" is…maybe a day? And it's lessening as the tools evolve.
It’s not like “becoming skilled and knowledgeable in a language” which took time. Even if you’re theoretically being left behind, you can be back at the front of the pack again in a day or so. So why bother investing more than a little bit every few months?
> The counter-argument as I see it is that going from "not using LLM tooling" to "just as competent with LLM tooling" is…maybe a day? And it's lessening as the tools evolve.
Very much disagree with that. Getting productive and competent with LLM tooling takes months. I've been deeply invested in this world for a couple of years now and I still feel like I'm only scraping the surface of what's possible with these tools.
2 replies →
I think the more "general" (and competent) AI gets, the less being an early adopter _should_ matter. In fact, early adopters would in theory have to suffer through more hallucinations and poor output than late adopters.
Here, the early bird gets the worm drawn with nine-fingered hands; the late bird just gets the worm.
It takes deliberate practice to learn how to work with a new tool.
I believe that AI+Coding is no different from this perspective. It usually takes senior engineers a few weeks just to start building an intuition of what is possible and what should be avoided. A few weeks more to adjust the mindset and properly integrate suitable tools into the workflow.
3 replies →
I thought so about strong static typing a looong time ago.
I am still using ChatGPT like a caveman and am happy with it.
Another one, here we go :) I think I'm going to keep it short because I am not here to convince people they're wrong, just to voice my dissent from uh almost every viewpoint this article comes from.
> Professional software developers are in the business of solving practical problems for people with code. We are not, in our day jobs, artisans.
Well that's just wrong -- we have failed to capture tremendous value through knowledge of our craft, and reducing it down to min-maxing effort/profit is not a pursuit that I can get behind. Nobody is talking about building ivory towers, but we can do a lot better than most of the common-denominator technologies in use today. We're failing our users most of the time (and of course people like the author say it doesn't matter because it doesn't show up in the quarterly results).
> Nobody cares if the logic board traces are pleasingly routed.
Speak for yourself.
Both off and on topic: I got access to Claude’s voice mode today, and for my second test I asked it for its thoughts about both the linked blog post and the HN discussion here. The full transcript is at [1]. Some parts that stood out (slightly edited):
Me: So what is your assessment of the discussion at Hacker News?
Claude: The Hacker News discussion strikes me as much more grounded than the blog post itself. What I find compelling is that many commenters are sharing specific technical experiences rather than making broad ideological arguments. When someone describes spending hours fighting with Claude agents that keep refactoring code into spaghetti, that's concrete data about the current state of these tools. ...
Me: Well, what about you, Claude? You are made by a company whose president has recently made some public statements saying that, you know, 50% of white collar jobs will be replaced soon. And the company, Anthropic, also is very aggressively developing and selling tools to help with AI coding. What are your thoughts about that?
Claude: That's a really fair question to ask me directly. I think there's an obvious tension in my position here since I'm built by a company that has clear financial incentives around AI adoption, including coding tools. When I look at my own capabilities honestly, I can see both sides of this. I'm genuinely useful for certain coding tasks like explaining unfamiliar code, generating boilerplate, or helping debug specific problems. I've helped people learn new programming languages faster than they might have otherwise. But I'm also clearly not at the level where I could replace an experienced software engineer working on complex system design or novel problem solving. ...
[1] https://claude.ai/share/c934e67c-f132-43e8-8c31-ab238b5be25a
Thank you for sharing. I'm looking forward to getting voice mode, but after doing some searching I saw that it still needed the user to hit the "send" button after they are done talking. Did I misread, or was your experience that it would reply to you after you finished talking without the need to press anything?
When you’re done talking, you tap anywhere on the screen to indicate that you’re finished talking, and Claude replies a second or two later. If you are silent for several seconds, it goes ahead and replies without waiting for you to tap. When Claude is talking you can tap on the screen to interrupt; that is actually quite useful, as it can be a bit long-winded.
It’s not quite as natural a back-and-forth as with ChatGPT’s or Gemini’s voice mode on mobile, but maybe that’s for the best.
I tried a test later in which I gave it a research paper—Sakana AI’s latest—and asked it to explain it to me. When it used a term I didn’t understand, I just tapped on the screen and asked for an explanation. That worked quite well.
1 reply →
My AI-enamored friends are all nuts.
I get it. You're being so productive. (Or so you claim. I don't believe you.)
But if I don't affirm your work choices, or embrace them myself, you lose your mind like a little caffeinated squirrel.
> If you can’t metabolize the boring, repetitive code an LLM generates: skills issue!
Case in point. Touch grass.
Slop adding implicit arbitrary higher order tensors to linear problems.
Sounds cool, but not useful... lol
Reminds one of a classic fictional conversation... =3
https://www.youtube.com/watch?v=X6WHBO_Qc-Q
> Level Setting
> First, we need to get on the same page. If you're [6 months out of date with the technology you think of as overhyped after having observed it for the last 4 years or so], you're [not fairly evaluating it].
Perhaps, but this is an extremely unconvincing approach to the argument right out of the gate - it's basically on the same level as "communism has never been tried".
Especially if you're going to start out by describing your interlocutors as "crazy", rather than positing that an effectively completely new thing now exists that ought to be evaluated differently from Copilot et. al.
Why would anyone rather read and fix someone else's code than write the code themselves? I do a lot of code review of other humans' code and it uses so much more energy than writing my own code (and surely, as I have competent colleagues, this is not even as bad as it would be if I expected the code I'm reading to be totally random shit).
> Why would anyone rather read and fix someone else's code than write the code themselves?
Because their own review standards are low (so they find reviewing "easy"), and/or because they can't appreciate the emotional & mental fulfillment that coding provides.
> Professional software developers are in the business of solving practical problems for people with code. We are not, in our day jobs, artisans. Steve Jobs was wrong: we do not need to carve the unseen feet in the sculpture. Nobody cares if the logic board traces are pleasingly routed. If anything we build endures, it won’t be because the codebase was beautiful.
I think it all comes down to this: do you take pride in what you do, or don't you?
If I build a wall with bricks, even if it will be covered with a coating, I will do my best to keep the joints and spacing regular.
I could do it faster, and no one would notice the difference but me… I hate that feeling when you've done something and you know it's barely enough, just barely; it's kind of shit and you really don't want others to see it.
On the opposite side, some people will take pride in building a wall twice as fast as me and won't care that it's horrendous.
Both are valid, but me, I know I can't do work I'm not proud of.
Totally agree with you that pride is important, but there's definitely more. Being a good engineer is frequently a problem of design --- whether it's user experience or code abstractions. Design is partly art, and that makes us artisans at least some of the time. The code that I've written that has endured the longest --- decades --- has been code that was designed well.
> If I build a wall with bricks, even if it will be covered with a coating, I will do my best to keep the joints and spacing regular.
Absolutely. This is at the core of it.
I'd be interested in the open source repositories that the author of this article created a) before "AI" and b) after "AI". I mean significant work mostly done by him.
I wasn't familiar with these agents yet. Does anyone know what the best ones are? Is there a good comparison of the options and their strengths and weaknesses?
The most salient part of the article IMO:
Under the heading: "but you have no idea what the [LLM's] code is."
> Are you a vibe coding Youtuber? Can you not read code? If so: astute point. Otherwise: what the fuck is wrong with you? You’ve always been responsible for what you merge to main. You were five years ago. And you are tomorrow, whether or not you use an LLM. If you build something with an LLM that people will depend on, read the code.
> People complain about LLM-generated code being “probabilistic”. No it isn’t. It’s code. [...] The LLM might be stochastic. But the LLM doesn’t matter. What matters is whether you can make sense of the result
> Reading other people’s code is part of the job. If you can’t metabolize the boring, repetitive code an LLM generates [...] how are you handling the chaos human developers turn out on a deadline?
> You’ll only notice this happening if you watch the chain of thought log your agent generates. Don’t.
"You're nuts!" says the guy with his head intentionally buried in the sand. Also way to tell me your business model is a joke without telling me your business model is a joke. Enjoy it while it lasts.
Man, the Red Bull is oozing off this post; talk about sipping rocket fuel.
I mean, a tool is a tool, nothing wrong with that - but most of the resistance stems from AI being shoved down our throats at warp speed. It's already everywhere and I can't opt out; that stinks.
As for the skepticism about adoption and usefulness, it's mainly a question of whether or not it will continue improving - there's no way to know what lies ahead, but if it came to a grinding halt today, well, then the high-water mark just isn't all that impressive.
> Yeah, we get it. You don’t believe in IPR. Then shut the fuck up about IPR. Reap the whirlwind.
This is the point that matters, and I don't think everyone is on the same page that LLMs are essentially glorified data laundering.
The industry would get just as much "value" if we declared a jubilee, wiped out all licenses, and allowed unlimited plagiarism (looking at Zuckerberg and his 10 TB of pirated data). In fact, if AI owners published their training data sets with a capable search engine, I would bet money on it outperforming LLMs in most cases. Why waste all that manpower reinventing Netflix again? Just copy-paste the code and give everyone their time back, sheesh.
> Kids today don’t just use agents; they use asynchronous agents. They wake up, free-associate 13 different things for their LLMs to work on, make coffee, fill out a TPS report, drive to the Mars Cheese Castle, and then check their notifications. They’ve got 13 PRs to review. Three get tossed and re-prompted. Five of them get the same feedback a junior dev gets. And five get merged.
I'm in a role that is behind the times, using a bespoke in-house framework that is immune to the benefits of LLMs, so I don't get to see what you see - so as a skeptic, I'm not convinced this isn't just the illusion of speed. I have not seen convincing results; show me the amazing things being made by AI (AI tooling itself does not count) - but yes, maybe that's because it's all siloed into walled gardens.
> But something real is happening. My smartest friends are blowing it off. Maybe I persuade you. Probably I don’t. But we need to be done making space for bad arguments.
Yeah all the arguments have been made, good and bad, we're all waiting to see how it plays out. But I'd rather take the side of being a skeptic - if I'm right then I'm in the right place. If I'm wrong, that's cool too, I don't mind playing catch-up. But fully embracing the hype is, IMO, tantamount to putting all your eggs in one basket, seems like a needless risk but if that's worth it to you to get ahead then by all means, slurp up the hype.
I am just some shmoe, but I believe that devs fall into two major categories when it comes to LLMs: those with their own product ideas, and those without their own product ideas.
The former look upon Claude Code/Cursor/Windsurf much more favorably, as they are able to ship their ideas much faster.
This is a bit of a hot take, so I would love any replies to bring me back down to earth.
This is a great article and I think I agree with most of it, not being an agent user myself.
One point the author might be missing is:
> If you were trying and failing to use an LLM for code 6 months ago †, you’re not doing what most serious LLM-assisted coders are doing
The tech giants and AI bros alike have been bombarding us with "but look what it can do NOW", "the new shiny model is revolutionary, now buy tokens", "look what's around the corner", and whatever else for over 2 years now. Can you really blame a typical "skeptic" for not being interested in putting in the time to verify these claims themselves after seeing mediocre results so many times? You start to develop a filter for these things.
I feel like we get one of these articles that addresses valid AI criticisms with poor arguments every week and at this point I’m ready to write a boilerplate response because I already know what they’re going to say.
Interns don’t cost 20 bucks a month but training users in the specifics of your org is important.
Knowing what is important or pointless comes with understanding the skill set.
I feel the opposite, and pretty much every metric we have shows basically linear improvement of these models over time.
The criticisms I hear are almost always gotchas, and when confronted with the benchmarks they either don’t actually know how they are built or don’t want to contribute to them. They just want to complain or seem like a contrarian from what I can tell.
Are LLMs perfect? Absolutely not. Do we have metrics to tell us how good they are? Yes
I’ve found very few critics who actually understand ML on a deep level. For instance, Gary Marcus didn’t know what a train/test split was. Unfortunately, rage bait like this makes money.
Models are absolutely not improving linearly. They improve logarithmically with size, and we've already just about hit the limits of compute without becoming totally unreasonable from a space/money/power/etc standpoint.
We can use little tricks here and there to try to make them better, but fundamentally they're about as good as they're ever going to get. And none of their shortcomings are growing pains - they're fundamental to the way an LLM operates.
"pretty much every metric we have shows basically linear improvement of these models over time."
They're also trained on random data scraped off the Internet, which might include benchmarks, code that looks like them, and AI articles with things like chain of thought. There's been some effort to filter obvious benchmarks, but is that enough? I can't know if the AIs are getting smarter on their own or if more cheat sheets are in the training data.
Just brainstorming, one thing I came up with is training them on datasets from before the benchmarks or much AI-generated material existed. Keep testing algorithmic improvements on that in addition to models trained on up to date data. That might be a more accurate assessment.
>I feel the opposite, and pretty much every metric we have shows basically linear improvement of these models over time.
Wait, what kind of metric are you talking about? When I did my master's in 2023, SOTA models were trying to push the boundaries by minuscule amounts, and sometimes blatantly changing the way they measured "success" to beat the previous SOTA.
PLEASE write your response. We'll publish it on the Fly.io blog. Unedited. If you want.
I’m uninterested in giving you content. In particular because of your past behavior.
Thanks for the offer though.
Maybe make a video of how you're vibecoding a valuable project in an existing codebase, and how agents are saving you time by running your tools in a loop.
> with poor arguments every week
This roughly matches my experience too, but I don't think it applies to this one. It has a few novel things that were new ideas to me and I'm glad I read it.
> I’m ready to write a boilerplate response because I already know what they’re going to say
If you have one that addresses what this one talks about I'd be interested in reading it.
>> with poor arguments every week
>This roughly matches my experience too, but I don't think it applies to this one.
I'm not so sure. The argument that any good programming language would inherently eliminate the concern for hallucinations seems like a pretty weak argument to me.
There's also the reverse genre: valid criticism of absolutely strawman arguments that nobody makes.
Which of the arguments in this post hasn't occurred on HN in the past month or so?
Can you direct me somewhere with superior counterarguments? I'm quite curious
What valid AI criticisms? Most criticisms of AI are neither very deep nor founded in complexity-theoretic arguments, whereas Yann LeCun himself gave an excellent one-slide explanation of the limits of LLMs. Most AI criticisms are low-quality arguments.
“Valid” criticism rarely comes from people who are barely capable of understanding the difference between AI and LLMs and who use the terms interchangeably.
Tbf the dismissal of the IP argument is a bit disappointing. Just because you don't respect IP doesn't mean we all do. Some of the code I'm proudest of I'm considering never open sourcing now because I simply don't trust online repositories anymore. Granted I'm in a bit of a different field than most developers but still, open source as a concept seems less attractive in the wake of tools that scrape it and copy it automatically without attribution. Count me out from ever sharing my code ever again.
I used to believe in the stereotypical copyleft anti IP ethos. In the modern day of corporate LLMs and disrespectful scraping, the lay of the ground is different and people understandably should be less inclined to open source their code.
This is also one of my predictions. LLM scrapers have made me add the "no derivatives" descriptor to the default CC license I publish all of my writing and side-projects under (for now, CC considers ND a valid legal defense against LLM scraping).
I still do stuff in the hopes of it benefitting anyone - but not parasitic oligarchs who are responsible for the imminent destruction of the world and society.
I don't know if a reasonable number of spare-time creators, authors, artists, etc. feel similarly about these things, but if they do, I'd expect a lot more content that might otherwise have been published openly to appear under non-free licenses in the future.
I think people fail to realize that valuing software at $0 destroys the point of most of it.
If it becomes so easy to write apps, then everyone can build their own. They won't need your app.
The entire point of most SaaS business revolves around the simple concept/question: build vs. buy.
So if there's no point in buying since building is so easy, then AI isn't helping you like your software product/SaaS.
I think programmers may just become the "IT person" or "webmaster" or "AI guy" at a company that needs software. Not at a separate company building it for others.
This may greatly reduce the need for programmers. This will in turn reduce the need for fancy computers like that expensive MacBook Pro. It will reduce the need for web hosting. Don't get me wrong, these things will still exist...but the demand will decrease.
Essentially it will become the biggest bubble since the dot com bust. It'll be much larger.
The dust will eventually settle and we'll all adapt to a new world with AI...but I think the years and years of over specialization could come to a screeching halt.
The industry will likely implode and suck in all of the orbiting industries along with it. I hope I'm wrong, but I don't think people quite understand here. I greatly suspect there will be far fewer sales of high end computers. Compute power will continue to move to data centers.
Think about it this way. Code aside, because it's not just a coding thing: if everyone can create amazing videos and full-on films with AI, then they don't need all those high-end machines. They don't need tons of hard drive space to hold all the raw footage and the different cuts that go into making a video. You prompt and get the exact result you need. Assuming AI gets that good.
So the more we push into AI, the greater the "gravity" here, so to speak, and the stronger the implosion. I just see an entire industry collapsing in on itself, personally. If I'm believing all the FOMO logic out there.
As for "we'll see our first one-person billion-dollar company": whoever said that is a moron. You understand what that means, right? If anyone can do all that, then how could that even exist? Are we all going to become billionaires now? No; to say something that stupid is to say software is trivial and worthless. So how could anyone start a software business worth so much?
No.
The issue is maintenance and focus. Sure, smash out an AI accounting app, payroll app, CRM, etc., but now you're not in the business of selling hamburgers or whatever; you've become a conglomerate.
> "but you have no idea what the code is"
> Are you a vibe coding Youtuber? Can you not read code? If so: astute point. Otherwise: what the fuck is wrong with you? You’ve always been responsible for what you merge to main. You were five years ago. And you are tomorrow, whether or not you use an LLM.
This identifies a critical lever: it doesn't matter that AI sometimes makes mistakes, hallucinates, makes poor architecture decisions. All that matters is your PR to master, because that's what you always have been and will continue to be judged on. Merging trash? That's on you! Failing to utilise new tools and taking 10x longer to make the feature? Also on you!
Ah yes, because it was garbage six months ago, all that was thrown away and now they've latched on to the "agentic" thing. At best this is pre-alpha-quality software not fit for production use. After all, it has been less than six months. Next it's going to be the post-agentic world, and then all the companies are going to go bankrupt once the bills actually come due. Any company that pivoted to video, err, AI is also going to go bankrupt, while everyone they laid off builds the next big thing and leaves the heads of the AI charlatans on pikes outside of town to warn off any AI stragglers.
> Some of the smartest people I know share a bone-deep belief that AI is a fad — the next iteration of NFT mania
It's not that it's a fad. It's that the hype has gotten way ahead of the capability. CEOs laying off double digit percentages of their workforce because they believe that in 6 months AI will actually be able to do all those jobs and they want to get the message out to Wall St to juice the stock price today.
Both things can be true, and in my view, they are. I think there is a lot of "there" there with these tools, and increasingly so, and also that lots of people are out over their skis with the hype.
The key is to learn the useful tools and techniques while remaining realistic and open-eyed about their limitations.
That's why I said AI is not a fad. It's everywhere and is providing real utility. It's just not going to let CEOs replace the majority of their workforce like they think it will.
Things are way beyond hype at this point, and calling skeptics who raised the alarm along the way nuts is itself nuts. I'll also leave this here: https://www.youtube.com/watch?v=6ovuMoW2EGk
AI would be great if only the problems I was trying to solve were the right ones.
I’d love to know your tool stack?
People do not believe AI is a fad because of NFT mania. We believe LLMs are a fad because they're over-hyped and they under-deliver. Just my $0.02. Note that I didn't say "AI is bullshit," but I will say some of the AI hype is bullshit. And one thing I've noticed is that there are so many caveats to making accurate statements about LLMs, it's hard to make short statements that aren't wrong (unless you include all those exceptions and caveats).
come at me.
> A heartfelt provocation
okay then (profanity begets profanity)
> Reading other people’s code is part of the job. If you can’t metabolize the boring, repetitive code an LLM generates: skills issue!
Do you /intend/ to spend /most/ of your work life reading AI puke? If so: sanity issue! You are advocating for replacing creative action that gives you Flow with reaction that gives you hemorrhoids, as the /bulk of the work/. Fuck right off.
> How are you handling the chaos human developers turn out on a deadline?
By correcting the deadline setting methodology for the next round of development.
> Does an intern cost $20/month? Because that’s what Cursor.ai costs
Then rely on Cursor.ai to raise your next-gen seniors, the ones who will be expected to review AI puke in the future.
> but the craft [...] Do it on your own time
Go choke on a bag of dicks. You are the problem. This is why quality no longer exists in the world, only lowest common denominator consumerist shit. This is why cars, home appliances, and tools have been getting both worse and more user-hostile. Because nobody values the quality of their own creative output anymore. Quality should not be an exception (which makes it very expensive), but the standard. Quality should be a near-commodity, and therefore at least occasionally affordable for anyone.
> I buy a fucking table
A ridiculously low-quality (but hey, cheap!) table that will fail you at the most inopportune moment. IKEA does have its place, but it's probably not in your office where you receive clients. Do you want them to see the tabletop fly off?
> They almost certainly have a bigger bag of algorithmic tricks than you do: radix tries, topological sorts, graph reductions, and LDPC codes
and you'll have to review that bag of algorithmic tricks embedded in AI puke, with potential hallucinations introducing obscure, critical mathematical bugs. Good luck verifying all that as a senior, based on proofs and first principles. A human coder will either use a well-reviewed library, or cite a textbook, and implement the textbook very closely, so you can follow it, and have some trust in the intent at least.
> No profession has demonstrated more contempt for intellectual property
How disingenuous. You are showing contempt for copyright. You are totally missing the point of copyleft. Copyleft doesn't advocate for anarchy; it advocates for user freedom via the software copyright framework. When AI barfs GPL'd code snippets into a proprietary code base, that harms user freedom.
> and watched it in seconds spot LVM metadata corruption issues
You are drifting; this is not code generation. Fuzzy pattern matching has always been fair game.
>> but you have no idea what the code is
> Are you a vibe coding Youtuber? Can you not read code? If so: astute point. Otherwise: what the fuck is wrong with you?
> You’ve always been responsible for what you merge to main. You were five years ago. And you are tomorrow, whether or not you use an LLM.
Loved this bit. In a professional setting, any code — AI-written or not — should be read by at least two people, the submitter and the reviewer.
Another day, another post about AI by people whose jobs will be made redundant.
Something fascinating about this HN crowd in particular is its lack of foresight. Companies producing medicine have to take long-term effects into account, yet people working in these “tech” companies push out AI after AI without a single care for how it’s going to impact people.
The more time I spend using AI-assisted programming tools the less worried I am that they're going to take my job.
Not because they aren't effective - they are - but because using them helps me appreciate how important my own skills and experience are for using them to effectively solve problems.
Interesting. Are LLMs a lot better at Go than Rust?
> We are not, in our day jobs, artisans. Steve Jobs was wrong: we do not need to carve the unseen feet in the sculpture. Nobody cares if the logic board traces are pleasingly routed. If anything we build endures, it won’t be because the codebase was beautiful.
Very true and humbling.
I'm not a skeptic because they can't produce code or whatever. Sure they are getting better at it.
I'm a skeptic because the megacorps have decided AI slop is the future, and it's going to push us deeper into hypercapitalist dystopia.
Programming as a career is going to end up as being an AI slop janny. You aren't going to be paid more if your job is easier or whatever; you're gonna get paid less (and then end up doing more atomised work) because "an AI can do it". That's the direction everything seems to be heading.
I'm more interested than skeptical, but I'd take 500 simonw posts over another one of these "shove this concern up your ass" ones.
I find it curious that nobody in this whole thread once mentions AI communism. I think after much pain and sorrow, and many deaths, this is the most likely outcome. Probably in a dystopian fashion.
> Meanwhile, software developers spot code fragments seemingly lifted from public repositories on Github and lose their shit. What about the licensing? If you’re a lawyer, I defer. But if you’re a software developer playing this card? Cut me a little slack as I ask you to shove this concern up your ass. No profession has demonstrated more contempt for intellectual property.
> The median dev thinks Star Wars and Daft Punk are a public commons. The great cultural project of developers has been opposing any protection that might inconvenience a monetizable media-sharing site. When they fail at policy, they route around it with coercion. They stand up global-scale piracy networks and sneer at anybody who so much as tries to preserve a new-release window for a TV show.
This is pretty disingenuous and mean-spirited. He's using SV startup types to smear all developers, including the typical, humble, un- or undercompensated FOSS dev who absolutely respects intellectual property, and quite reasonably, expects their own IP to be respected as well. But because Sam Altman types follow the "It's Better to Beg for Forgiveness than to Ask for Permission" rule, altruistic, non-sociopathic developers have no right to complain?
> non-sociopathic developers have no right to complain
The very same developers that advocate pirating from Netflix or Disney?
He's pointing out the hypocrisy in a class of people (generally, tech-savvy programmers) that have no problem advocating for piracy and gleefully thwarting the concrete IP rights of businesses on one hand, while defending and bemoaning the abstract IP rights of other content creators on the other hand.
My takeaway is that he's saying IP rights are a weak argument to use in defending a personal decision not to adopt LLMs in your workflow, especially given the context of modern piracy.
> The very same developers that advocate pirating from Netflix or Disney?
Even the ones advocating or engaging in personal piracy aren't creating unauthorized derivative works and monetizing them, which is arguably what these AI systems and their customers are doing. And Thomas wasn't talking about personal piracy, but instead trying to use the IP-hostile conduct of Google and certain startups (like Scribd or GrooveShark) to smear all developers and de-legitimize their IP-based objections, even those of developers releasing code under copyleft licenses (like the GPL) in exchange for expected adherence to certain terms and conditions that these systems and their users ignore.
All good. They will suffer, and you will laugh.
>My AI skeptic friends are all nuts
This is dead right; my AI skeptic friends are ideologically opposed to LLMs and GenAI.
Their arguments have nothing to do with how good or bad LLMs are:
- IP issues (for code in a Jupyter notebook that will never face IP scrutiny)
- because it exploits slaves and poor people (but so do the computers they use and the clothes they wear).
I also have a few programmer friends who make more legitimate complaints about code quality but because they are friends there is also an obvious subtext of ideological opposition.
My opinion is you should use LLMs anyway; you can hate "capitalism" and still take advantage of the system.
>All progress on LLMs could halt today, and LLMs would remain the 2nd most important thing to happen over the course of my career.
During my time as a programmer (gaming + astrophysics), the following things have happened:
1. The rise of GPGPU programming, which has enabled 1000x increases in performance
2. A revolution in CPU architecture, from single core, to massively multicore
3. C++98 - C++26
4. Transformational increases in CPU memory
5. (In the graphics space): dedicated raytracing hardware, the fully general purpose GPU + OpenCL/CUDA, deferred rendering, FSR/DLSS
6. Mobile phones were invented
7. 64-bit
8. The internet went from being for MSN and facebook to what it is today
9. The rise of Valve/Steam and the incredible transformational existence of the indie games market, which enables individuals to self publish their code and products for virtually nothing
10. Clang was born, and GCC came back to life. MSVC became standards compliant, finally
11. Rust was invented, and is taking off. People actually take security seriously now (!)
12. Unified CPU/GPU memory spaces, GPU architectures becoming scalar etc
All of these have had vastly more impact on my programming than LLMs ever have. I've always wondered a bit who the people are who find LLMs transformational, because I can vaguely gesture at any part of my programming tooling and find that it's vastly different from 10 years ago.
>interact with Git, run existing tooling, like linters, formatters, and model checkers, and make essentially arbitrary tool calls (that you set up) through MCP.
I mean... you can just do all these things anyway. It's literally more work for me to use an LLM to run a linter than to run the linter (it's built into my IDE). So's Git. And formatters as well. You can also make arbitrary tool calls... through your IDE/scripts/pre-commit/post-commit/etc. I have no idea why you'd even want an LLM to do this!
>People complain about LLM-generated code being “probabilistic”. No it isn’t
>The LLM might be stochastic
????????? That's literally the entire point. I want deterministic answers, like clang-based autocomplete, instead of the nightmare that is prob- I mean stochastic autocomplete.
>If hallucination matters to you, your programming language has let you down.
They matter because they waste your time with bullshit that you then have to fix. No programming language can truly enforce correct logic constraints, which are one of the primary difficulties of writing code. I literally have no idea what the OP is even doing; actually typing out code has always taken a vanishingly small amount of time - the main bottleneck is the train of thought needed to make sure everything's correct. The literal typing is an afterthought. No programming language can bypass that step; they can (at best) handhold you through certain kinds of problems.
>Does an intern cost $20/month? Because that’s what Cursor.ai costs.
For the moment. They'll jack up the prices while enshittifying, and then good luck. I do not want to subscribe to a service to be able to code. I use free IDEs and free tools for exactly this reason. If you rely on a paid-for third party, you are doing it wrong and will regret it. This is one of the silliest things in the whole article.
>But people select languages in part based on how well LLMs work with them, so Rust people should get on that
They really don't. People select languages based on:
1. What they're taught at school/university
2. What language meets their domain requirements
3. What language their job requires
>Meanwhile, software developers spot code fragments seemingly lifted from public repositories on Github and lose their shit. What about the licensing? If you’re a lawyer, I defer. But if you’re a software developer playing this card? Cut me a little slack as I ask you to shove this concern up your ass. No profession has demonstrated more contempt for intellectual property.
Are you joking? Is this article a troll? Programmers care a great deal about copyright law. GPL/MIT/etc. are hugely important, and people by and large respect them - even most major companies won't just blatantly pirate GPL works.
There's a tonne more in this article, but it reads like someone who has literally no idea what they're talking about talking out of their arse, and it shows profoundly.
I code to understand. That's it. That's all. My code is a reflection of a problem definition and a solution. The problem may be under- or poorly defined and the solution may be lousy but by God it's there as a clear reference to be further reasoned about.
Programmers who don't work to deepen their understanding? Nah, sorry, not programmers I want anywhere near the code I'm working on.
To the extent that an LLM can help deepen understanding, good! Use it! Use its code. But you use it according to your intentional, systematic understanding of what you're doing and why. And that understanding is not a function of code generation or, worse, code churn for the sake of feeling like you're going fast and paying less.
> LLMs can write a large fraction of all the tedious code you’ll ever need to write. And most code on most projects is tedious.
I've been writing code for over 30 years and I can't recall ever writing "tedious code". I don't know what this even means. Code, by definition, should not be tedious. At any layer of the stack, code should solve non-tedious problems. The whole point of code is to avoid tedium.
Most code, most projects? I haven't seen this and doubt it's even close to true - if it is God help us. If you write tedious code, find another place to work or another career.
> Sometimes, gnarly stuff needs doing. But you don’t wanna do it. So you refactor unit tests, soothing yourself with the lie that you’re doing real work.
Seriously, who kills time by refactoring unit tests? Is this projection by the author?
> But an LLM can be told to go refactor all your unit tests.
Wtaf?? The unit tests are what stands between the blind ambitions of a statistical model and correct code. Handing over tests to an LLM? One staggers.
Tests represent part of that problem definition and solution. They're integral to the code. They foster and deepen understanding - not to be subject to the musings of statistical process, unless those musings serve as a dialectic.
> If you can’t metabolize the boring, repetitive code an LLM generates: skills issue! How are you handling the chaos human developers turn out on a deadline?
Answer: Slow the F down! Chaos and programming don't mix. Managers who drive their teams like this are bad managers.
> If hallucination matters to you, your programming language has let you down.
The author seems to think that programming is about producing code. Producing code that compiles is not a problem. Producing correct and coherent solutions to problems that need to be solved is a problem. Hallucinations aside, how will you know that generated code is something to keep or reject unless you have some understanding of what the code should do before you see it?
> Does an intern cost $20/month? Because that’s what Cursor.ai costs.
Squee! Code for all my friends!
> but it’s bad at rust
No, it's bad at understanding. And so are humans, initially. But it's our job to build and deepen understanding and we use code to facilitate that.
> Professional software developers are in the business of solving practical problems for people with code.
Correct.
> [LLMs] devour schlep, and clear a path to the important stuff, where your judgement and values really matter.
This raises the question: who defines what's important? It sounds like the author agrees that it's us, not the model. I agree this is a useful method. It's like any code review - it sharpens the reader no matter the code quality.
> As a mid-late career coder, I’ve come to appreciate mediocrity.
How cool and nonchalant, what a rebel, he says the opposite of everyone, what a baddie!
I think the author means "appreciate naivety", meaning, a first pass that's good enough is good enough - there's no need to go further. I'll concede this, but it's a concession and not something to boast about. Odds are decent you'll come back to this naive implementation and revise it. Yes, at a certain point there's diminishing returns but good programmers know when to invest in revision and when to move on.
> LLMs really might displace many software developers.
We've seen this cycle. Technology and globalization put pressure on programmers. There are legions of programmers in parts of the world who will generate code for $20 an hour. You can pay $20 an hour - or $20 a month - and generate mountains of code that blows chunks - not because it's ugly, but because it's incoherent or because it solves the wrong problem. We. have. seen. this. It's why Silicon Valley still pays massive salaries to good programmers.
> The cool kid haughtiness about “stochastic parrots” and “vibe coding” can’t survive much more contact with reality.
Time will tell. In the meantime I'll keep investing in understanding, not code, and see how things shake out.
> It’s projection. People say “LLMs can’t code” when what they really mean is “LLMs can’t write Rust”. Fair enough! But people select languages in part based on how well LLMs work with them, so Rust people should get on that.
How is it the responsibility of the Rust community that there weren't enough metric tons of free code for the machine to slurp up? And the phrasing makes it sound like it's the community's fault for not feeding OpenAI enough code to be stripped of its license and authorship and get blended into a fine latent soup. It's a lot like people coming to a one-man FOSS project with a laundry list of demands, expecting to be treated with the religious reverence of a major enterprise contract.
The whole tone, the pervasive "use it or you'll be left behind"—where users saying they don't want or need it is only taken as further evidence of its imminent apotheosis—superficially reminds me of previous FUD campaigns.
And how is it not concerning that the thing described as intelligent needs billions of lines to generalize a language a human can learn from a single manual? Will it need hundreds of kLOC to internalize a new library, or even its new version, beyond in-context learning? The answer is yes; you are choosing to freeze the entire tech stack, when fixing its abstractions could actually save you from boilerplate, just so the machine can write it for you at $200 a month with a significant error rate.
Is the problem with LLMs and Rust really a problem with the training data, or a problem with it being harder to get something to compile in Rust? I presume it's easier to get something to compile in dynamically typed languages, slightly more difficult in statically typed languages, and then a step up with Rust, which also forces you to deal with the lifetime of objects. But maybe with LLMs it's different, and they handle static typing more easily than dynamic typing but really struggle with structuring a program to properly deal with ownership.
They mostly copy and mix patterns in the training data. Lots of repetition, with variations, helps their generalization. Languages like Python and Go have a ton of code like that on GitHub and elsewhere. I saw that using Python with GPT 3.5/4.
If it's a rarer language, the math doesn't do as good a job on piles of random code. There's just not enough for it to learn from. I can't speak for Rust since I don't know the numbers, but I imagine it's much less than Python or Go.
I have seen some evidence, though, that harder languages are harder for them to code in. GPT 3.5 used to struggle with C++ for something that it could easily produce in Python. It could actually produce things in C more easily than C++. It makes sense, though, because there's both more context needed for correctness and more behavioral patterns to write it.
My solution, which I only prototyped with GPT due to leaving AI, was to use AIs to write code in languages like Python, which non-AI tools then transpiled to high-performance C++ or Rust. Think the Python-to-C++ compilers, or maybe Nuitka. Later, with hallucinations mitigated enough, add LLMs to those transpilers.
As a side benefit, it lets you sell a product for accelerating, or increasing the predictability of, applications in that language. That's a non-AI investment. There are a few companies doing that, too. So one could sell to the AI crowd, the "language X in business" crowd, or both.
Wow, I'm surprised this post hasn't gotten hit by the HN flame-war detector: We have threads downvoted into forced-collapsed that are headed by completely innocuous comments. It really is strange how people here are not able to accept different viewpoints in a discussion: I've three times now returned to the comments to just blanket-upvote everything that's starting to turn grey.
This is satire, right?
...Right?
Thankfully the uncrazy person is going to get us on that sane VC AI wavelength.
> If you’re making requests on a ChatGPT page and then pasting the resulting (broken) code into your editor, you’re not doing what the AI boosters are doing. No wonder you’re talking past each other.
They’re playing 3D chess while you’re stuck at checkers.
I do things suboptimally while learning the ropes or just doing things casually. That doesn’t mean that I judge the approach itself by my sloppy workflow. I’m able to make inferences about what a serious/experienced person would do. And it wouldn’t involve pasting things through three windows like I would do.
So of course I don’t judge AI by “ask chatty and paste the response”.
Yes indeed: “deploying agents” is what I would imagine the Ask Chatty And Paste workflow taken to Perfection to look like.
> LLMs can write a large fraction of all the tedious code you’ll ever need to write. And most code on most projects is tedious. LLMs drastically reduce the number of things you’ll ever need to Google. They look things up themselves. Most importantly, they don’t get tired; they’re immune to inertia.
Most Rube Goldberg machines are very tedious and consist of fifty-too-many parts. But we can automate most of that for you—
I could not have ever imagined a more Flintstones meets Science Fiction clash than AI According To Software Engineers. You’re using AI to generate code. And no one cares how much. It’s just so tedious in any case.
A worthwhile approach would have been to aspire to make or generate technology artifacts that could be hidden behind a black-box surface with a legible interface in front. Is the code tedious? Then make the AI come up with something that is well designed, where the obvious things you want are given freely, where minor customizations are just minor tweaks, and larger deviations require only proportionally larger changes. Uh, how about no? How about generating a 20 KLOC "starter" some-framework project with all the 20 KLOC of "tedious" bits hanging out, then we can iterate from there. The AI made a Git log and everything, so it's, ya know, audited.
But maybe I'm being unfair. Maybe we are moving towards something not quite as stupid as Deploy ChatGPT 50X? Or maybe it's effectively going to be behind a black box. Because, ya know, the AI will deal with it all by itself?
> Are you a vibe coding Youtuber? Can you not read code? If so: astute point. Otherwise: what the fuck is wrong with you?
> You’ve always been responsible for what you merge to main. You were five years ago. And you are tomorrow, whether or not you use an LLM.
No! And what the fuck is wrong with you? We are Flintstone technologists, and I'll be damned if I can't get my AI brain chip-injected, genetically-enhanced-for-speed horsey cyborg for my modern horse-drawn carriage patent.
Well, I'm going to reference [1] a comment on the "Who Cares Era" that really stood out and maybe should get pinned somewhere.
What about that "superpower" we had to make something out of _nothing_ but our own brains? Sure, you may have looked up some code (done some research and learned), but that mental exercise... just knowing your code inside-out, can never be compared to the "oh look at that, it actually works" of LLM stuff.
Well, guess what: now you've traded that superpower for a subscription! (Yes, you are now a middleman of software peddling - paying some company that stole other people's code and ideas and did not pay _them anything_ in return.) And eventually you will get to a point where the internet is down and you won't write any software that day because you can't get the N model to work.
Did we go to the next step in software evolution? If it has a cost per month (and you pay thieves - yes, that's what they are), it's going to be a hard "no" for me; you just became a cow. Sure, now this is what's expected of you? OK. This has some strong neoliberalism drift; I assume I can be wrong, so comments are welcome - I am quite interested in this part.
Yes, augmentations seem like amputations - is that McLuhan? I'm not sure. Will we retain our power to produce software? No! Humans dumb down very quickly; a generation is all it took for kids' reading to fall below expected educational levels [2]. There will be fewer professionals and lots of peddlers with no skills, and software will be so junky it will be seriously scary to catch a plane in 10 years' time.
Interesting times.
[1] https://news.ycombinator.com/item?id=44116724 [2] https://www.theguardian.com/lifeandstyle/2025/jun/02/gen-z-p...
Something that I feel is missed in every such post or debate is that, maybe, this isn’t a yes/no question.
Fast forward 10 years, the skeptics will be right on a few things, and the fanboys will be right on other things.
I for one know this isn’t just a “fad” like NFTs were, but I cannot help but feel skeptical about the sweeping claims of software engineering’s doomsday.
A hammer hammers.
It hammers 100% of the time, with no failure.
It requires the same amount of labour on my part, and it delivers the same outcome every time.
That is what tools do, they act as an extension and allow you to do things not easily done otherwise.
If the hammer sometimes hammers, sometimes squeaks, and sometimes screws, then it requires extra labour on my part just to make it do what purpose-specific tools do, and that is where frustrations arise.
Make it do one thing excellently, and then we can talk.
This is the kind of non-serious argument he's talking about. There are plenty of tools that require supervision to get good results. That doesn't make them useless.
My 3D printer sometimes prints and sometimes makes spaghetti. Still useful.
They never said it was useless. You just invented that straw man in your head.
3D printing is largely used for prototyping, where its lossy output is fine. Using it for production use cases requires fine-tuning so it can be 99.9% reliable. Unfortunately we can't do that for LLMs, which is why they're still only suitable for prototyping.
There is a big difference between "not entirely useless" and the best tool for the job.
You don't use 3D printing to do large-scale production. If you agree that AI should only be used in prototype code and nothing else, then your argument makes sense.
IMO this is exactly the wrong mental model.
You can't hammer a nail 1,000 times and pick the best-hammered nail.
You can have the hammer iterating over the structure 24/7, finding imperfections in previously hammered nails.
This imposes arbitrary constraints on the model; besides, when you give a human just a hammer, they tend to start viewing everything as a nail.
You know what's nuts? How so many articles supporting LLMs and attacking skeptics are so full of fallacies and logical inconsistencies like straw men, false dichotomies, and appeals to emotion and authority, when their authors supposedly have almost-AGI machines to assist them with their writing. They could at least do a "please take a look at my article and see if I'm committing any logical fallacies" prompt iteration session if they trust these tools so much.
These kinds of articles that heavily support LLM usage in programming seem designed to FOMO you, or at least to suggest that "you are using it wrong" in a weak way, just to push contrary or conservative opinions out of the discussion. This is pure rhetoric with a thoroughly empty discourse.
I use these tools every day and every hour in strange loops (between at least Cursor, ChatGPT, and now Gemini) because I do see some value in them, even if only to simulate a peer or rubber duck to discuss ideas with. They are extremely useful to me due to my ADHD, because they actually support me through my executive dysfunction and analysis paralysis even if they produce shitty code.
Yet I'm still an AI skeptic because I've seen enough failure modes in my daily usage. I do not know how to feel when faced with these ideas, because I fall outside the false dichotomy (I pay for them and use them every day, but don't think they're as valuable as the average AI bro does). What's funny is that I have yet to see an article that actually shows LLMs' strengths and weaknesses in a serious manner and with actual examples. If you are going to defend a position, do it seriously, ffs.
Just for fun, I asked ChatGPT and it came up with 30+ fallacies + examples. I'm sure some is hallucinated, but woof:
https://chatgpt.com/share/683e62ed-e118-800f-a404-bd49bec799...
As I said in another post, the article is pure rhetoric. It provides no actual numbers, measurements, or examples.
It's just "AI did stuff really well for me" as the proof that AI works.
Check, please.
I tried the agent thing on:
- Large C codebase (new feature and bugfix)
- Small rust codebase (new feature)
- Brand new greenfield frontend for an in-spec and documented openAPI API
- Small fixes to an existing frontend
It failed _dramatically_ in all cases. Maybe I'm using this thing wrong, but it is Devin-level fail. It gets diffs wrong. Passes phantom arguments to tools. Screws up basic features. Pulls in hundreds of lines of changes on unrelated files to refactor. Refactors again and again, over itself, partially, so that the uncompleted boneyard of an old refactor sits in the codebase like a skeleton (those tokens are also sent up to the model).
It genuinely makes an insane, horrible, spaghetti MESS of the codebase. Any codebase. I expected it to be good at Svelte and SolidJS since those are popular JavaScript frameworks with lots of training data. Nope, it's bad. This was a few days ago, with Claude 4. Seriously, seriously, people, what am I missing here with this agents thing? They are such gluttonous eaters of tokens that I'm beginning to think these agent posts are paid advertising.
It’s entirely possible that the people talking up agents also produced spaghetti code but don’t care because they are so much more “productive”.
An interesting thing about many of these types of posts is they never actually detail the tools they use and how they use them to achieve their results. It shouldn’t even be that hard for them to do, they could just have their agent do it for them.
>> It’s entirely possible that the people talking up agents also produced spaghetti code but don’t care because they are so much more “productive”.
You may be right. The author of this one even says if you spend time prettying your code you should stop yak shaving. They apparently don't care about code quality.
The agent/model being used makes a huge difference. Cline with Claude 3.7 is ridiculously expensive but useful. Copilot is vaguely ok.
You’re not providing a key piece of information to provide you with an answer: what were the prompts you used? You can share your sessions via URL.
A prompt like “Write a $x program that does $y” is generally going to produce some pretty poor code. You generally want to include a lot of details and desires in your prompt. And include something like “Ask clarifying questions until you can provide a good solution”.
A lot of the people who complain about poor code generation use poor prompting.
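To make the contrast concrete, here's a minimal sketch (assuming the Anthropic Python SDK; the model id, file paths, and task description are placeholders I made up) of a bare prompt versus one with context, constraints, and the clarifying-questions instruction:

    # Sketch only: contrasting a vague prompt with a detailed one.
    # Assumes the Anthropic Python SDK; model id and project details are hypothetical.
    import anthropic

    client = anthropic.Anthropic()  # reads ANTHROPIC_API_KEY from the environment

    # The kind of prompt that tends to produce poor code:
    vague = "Write a Python script that syncs files."

    # More detail, explicit constraints, and an invitation to ask questions first:
    detailed = """You are working in a Python 3.12 codebase that uses pathlib and pytest.
    Task: add a sync_dir(src, dst) function to fileutils/sync.py that copies files
    newer than their destination copies, skips symlinks, and logs each copy via logging.
    Constraints: no new dependencies, keep the diff under ~50 lines, add one pytest test.
    Ask clarifying questions until you can provide a good solution."""

    response = client.messages.create(
        model="claude-sonnet-4-20250514",  # placeholder model id
        max_tokens=2000,
        messages=[{"role": "user", "content": detailed}],
    )
    print(response.content[0].text)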
It'd be nice if the AI advocates shared prompts, or even recorded entire sessions. Then we could all see how great it really is.
Prompt engineering isn't really that important anymore, IMO. If you're using a reasoning model, you can see whether it understood your request by reading the reasoning trace.
There are many ways to do something wrong and few ways to do it right. It's on the AI advocates to show us session logs so we can all see how it's done right.
How are you writing your prompts? I usually break a feature down to the task level before I prompt an agent (Claude Code in my case) to do anything. Feature level is often too hard to prompt and specify in enough detail for it to get right.
So I'd say Claude 4 agents today are at the autonomy level of a smart but fresh intern. You still have to do the high-level planning and task breakdown, but they can execute on individual tasks (say, ones requiring 10-200 lines of code excluding tests). Asking them to write much more code (200+ lines) often requires a lot of follow-ups and ends in disappointment.
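For illustration, a sketch of what that task-level breakdown looks like before anything goes to the agent (the feature, file names, and task wording here are all hypothetical):

    # Hypothetical example: one feature decomposed into agent-sized tasks,
    # each scoped to roughly 10-200 lines of code excluding tests,
    # prompted, reviewed, and merged one at a time.
    FEATURE = "Add CSV export to the reporting page"

    TASKS = [
        "Add a to_csv(rows) helper in reports/export.py that escapes quotes and commas; include a pytest test.",
        "Add a GET /reports/<id>/export endpoint in reports/views.py that streams the CSV; reuse to_csv.",
        "Add an 'Export CSV' button to templates/reports/detail.html wired to the new endpoint.",
    ]

    # One prompt per task, each small enough to review in a single sitting.
    for i, task in enumerate(TASKS, 1):
        print(f"Task {i}: {task}")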
This is the thing that gets me about LLM usage. They can be amazing, revolutionary tech, and yes, they can also be nearly impossible to use right. The claim that they are going to replace this or that is hampered by the fact that very real skill is required (at best) or that they just won't work most of the time (at worst). Yes, there are examples of amazing things, but the majority of things seem bad.
Coding agents should take you through a questionnaire before working: break down what you are asking for into chunks, point me to the key files that are important for this change, etc. I feel like a bit of extra prompting would help a lot of people get much better results, rather than expecting people to know the arcane art of proompting just by looking at a chat input.
I feel like when you prompt an LLM, it should treat your prompt almost as "what would the best possible version of this prompt be?" and then act on that...
I don't think it's fair to call that the agent thing. I've had profoundly positive results with agentic workflows for classification, analysis, and various business automations, including direct product pricing. You have to build an environment for the agent to make decisions in, with good instructions for what you want it to do. Then you wire it up so that the decisions have effects in the real world. You can achieve really good results, and there is a lot of flexibility to tweak things and various tricks to optimize performance. Tools can allow agents to pull in relevant context as needed, or to execute complex multistep workflows. That is the agent thing.
Writing code is one thing models can do when wired up properly, and you can get a powerful productivity boost, but wielding the tools well is a skill of its own, and results will vary by task, with each model having unique strengths. The most important skill is understanding the limitations.
Based on your task descriptions and the implied expectations, I'm unsurprised that you are frustrated with the results. For good results with anything requiring architecture decisions, have a discussion with the model about the architecture design before diving in. Come up with a step-by-step plan and work through it together. Models are not like people: they know everything and nothing.
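To give a feel for what "wire it up so the decisions have effects" means, here's a minimal agent-loop sketch; the tool set, message format, and call_model() stub are hypothetical rather than any particular vendor's API:

    # Minimal agent-loop sketch: the model proposes tool calls, the harness executes
    # them and feeds the results back, until the model returns a final answer.
    import subprocess

    def run_tests(path: str) -> str:
        """Run the test suite and return its output (a real-world effect)."""
        result = subprocess.run(["pytest", path, "-q"], capture_output=True, text=True)
        return result.stdout + result.stderr

    def read_file(path: str) -> str:
        """Let the agent pull file contents into its context."""
        with open(path) as f:
            return f.read()

    TOOLS = {"run_tests": run_tests, "read_file": read_file}

    def call_model(messages: list[dict]) -> dict:
        """Stub for the LLM call; a real harness would hit a model API here."""
        raise NotImplementedError

    def agent_loop(task: str, max_steps: int = 10) -> str:
        messages = [{"role": "user", "content": task}]
        for _ in range(max_steps):
            reply = call_model(messages)
            if reply.get("tool") is None:
                return reply["content"]       # the model is done
            fn = TOOLS[reply["tool"]]
            output = fn(**reply["args"])      # the decision has a real effect
            messages.append({"role": "tool", "content": output})
        return "step limit reached"

The environment, the instructions, and the tool definitions are where most of the leverage is; the loop itself is boring.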
Have it make small changes. Restrict it to a single file and scope it to <50 lines or so. Enough that you can easily digest without making it a chore.
A small change scoped to <50 lines is easy for a normal software engineer to write. When do the LLMs start doing the hard part?
>these agent posts are paid advertising.
I'm 100% certain most if not all of them are; there is simply too much money flying around, and I've seen what marketing has done in the past for far less hyped products. Though in this specific case I think the writer may simply be shilling AI to create demand for their service: pay us monthly to one-click deploy your broken, incomplete AI slop. The app doesn't work? No problem, just keep prompting harder and paying us more to host/build/test/deploy it...
I've also tried the agent thing and still do, with only moderate success: Cursor, claude-squad, Goose, Dagger AI agents. In other words, all the new hotness, all with various features claiming to solve the fact that agents don't work. Guess what? They still don't.
But hey, this is HN? Most of the posters are tech-fearing Luddites, right? All the contention on here must mean our grindset is wrong and we are not prompting hard enough.
There is even one shill, Ghuntly, who claims you need to be "redlining" AI at a cost of $500-$1000 per day to get the full benefits. LOL, if that is not a veiled advertisement, I don't know what is.
Nobody has paid me to write anything about AI and if they did I would disclose it.
This is my experience too most of the time. Though sometimes it does work, and sometimes a solution is found that I never thought of. But most of the time I have to change things around to my liking.
However, a counter-argument to all this:
Does it matter if the code is messy?
None of this matters to the users and people who only know how to vibe code.
> Does it matter if the code is messy?
It matters proportionally to the amount of time I intend to maintain it for, and the amount of maintenance expected.
Same here. I keep trying to figure out WTF agent people are using to get these great results, because Copilot with Claude 4 and Gemini 2.5 has been a disastrous mess for me.
This has become an impoverished conversation. I have seen this pattern where LLM capabilities improve, and people who had previously dismissed the technology based on its then-current capabilities realize they were wrong in their pessimistic assessment of the tech's potential, switch over to the other side, and project their own previous bias onto those who continue to object.
The basic structure is this: six months ago, I tried LLMs and they were trash, but holy cow they have improved so much, now I can use them to avoid tedious work that I don't like! Don't be an idiot like my skeptical past self.
Then they accuse everyone who now disagrees with their take on the tech of being an incurious Luddite blinded by anti-progress bias.
Personally, as a non-user but close observer of the tech, I never doubted that the tech would improve but there are obvious problems with the tech beyond hallucinations that have to do with human meaning, understanding and power relations that cannot be solved by making the tech better.
My challenge to all of the boosters is this: try to articulate your own personal vision of both AI utopia and AI dystopia. I personally find it borderline impossible to even imagine a utopia emerging from genAI, but it is extremely easy for me to imagine dystopia, especially given the entities that are controlling the tech and competing to "win" the AI arms race.
For me, the representation of the Chinese state as filtered through western media is already a dystopia. Of course, having not been to China myself and being unable to speak any of its languages, I cannot personally verify that representation. But by competing with the Chinese on AI (and I mean "Chinese" in the way we define them in the west, which I recognize may be very different from both the actual lived experience in China and their self-conception), we become more like our own negative stereotypes of them. It is essentially a race to disempower ordinary people, remove agency from their lives, hand it over to agents who most certainly do not have the best interest of living humans in mind, and call this victory. To "win" the AI war as presently defined would likely be a disaster for us all.
There are these ridiculous, handwavy claims about solving climate change or even human mortality with this tech, even though there is no evidence whatsoever that it will do any of this. Just because the people building it say it will do these things doesn't mean we should trust them.
Imagine if a primatologist tried to tell us that because they have trained chimpanzees to recognize colors and some words better than a three-year-old, we should now stop investing in education and direct all of our resources into training chimps to do all of our repetitive tasks for us, to liberate us from the drudgery of doing anything for ourselves. With enough resources, you would see an explosion in chimp capabilities, and this would come directly at the expense of humans, who now have no work to do and just sit on top of a pyramid built by chimp labor. Not only would the things we made be worse than what we could have if we focused on developing our own human capacities instead of chimp capacities, but we would live in fear that the chimps (who are also much stronger than us) will wake up to their plight and rise up against their rulers. Humans would also rapidly lose our own capabilities and become much more like chimps than the humans of today. Sound familiar?
I tend to believe that, as it is currently being developed, this tech is far more likely to lead us in a direction like the chimp apocalypse than to some post-labor utopia.
That doesn't mean that the tech isn't getting better or can't do impressive things. I can hold both things in my head at once. But I am much more concerned with human flourishing and well-being than with some bored engineers who don't actually like programming (or at least don't like it under the current industrial conditions) feeling like they are being liberated from the tedium of their work. And instead of solving the real underlying problems that make the work so tedious, we compound the problem by having AI generate even more of the exact kind of code that caused the problem in the first place.
I’m an AI skeptic. I’m probably wrong. This article makes me feel kinda wrong. But I desperately want to be right.
Why? Because if I’m not right then I am convinced that AI is going to be a force for evil. It will power scams on an unimaginable scale. It will destabilize labor at a speed that will make the Industrial Revolution seem like a gentle breeze. It will concentrate immense power and wealth in the hands of people who I don’t trust. And it will do all of this while consuming truly shocking amounts of energy.
Not only do I think these things will happen, I think the Altmans of the world would eagerly agree that they will happen. They just think it will be interesting / profitable for them. It won’t be for us.
And we, the engineers, are in a unique position. Unlike people in any other industry, we can affect the trajectory of AI. My skepticism (and unwillingness to aid in the advancement of AI) might slow things down a billionth of a percent. Maybe if there are more of me, things will slow down enough that we can find some sort of effective safeguards on this stuff before it’s out of hand.
So I’ll keep being skeptical, until it’s over.
I'm in a nearly identical boat as you.
I'm tired. I'm tired of developers/techies not realizing their active role in creating a net negative in the world. And acting like they are powerless and blameless for it. My past self is not innocent in this; but I'm actively trying to make progress as I make a concerted effort to challenge people to think about it whenever I can.
I've lost count of the times the tech industry (and developers specifically) have taken an interesting technical challenge that quickly required some sort of ethical or moral tradeoff, which ended up shaping the fabric of society for the worse.
Creating powerful search engines to feed information to all who want it; but we'll need to violate your privacy in an irreversible way to feed the engine. Connecting the world with social media; while stealing your information and mass-exposing you to malicious manipulation. Hard problems to solve without the ethical tradeoff? Sure. But every other technical challenge was also hard and got solved; why can't we also focus on the social problems?
I'm tired of the word "progress" being used without a qualifier of what kind of progress and at the cost of what. Technical progress at the cost of societal regression is still seen as progress. And I'm just tired of it.
Every time "AI skeptics" are brought up as a topic, the focus is entirely on the technical challenges. They never mention the "skeptics" who get that label even though they aren't skeptical of what AI is or could be capable of. I'm skeptical of whether the tradeoffs being made will benefit society overall, or just a few. Because at literally every previous turn for as long as I've been alive, the impact has been a net negative for the total population, without developers questioning their role in it.
I don't have an answer for how to solve this. I don't have an answer for how to stop the incoming shift from destroying countless lives. But I'd like developers to start being honest about their active role in not just accepting this new status quo but proactively pushing us in a regressive direction, and about our power to push back on this coming wave.
+65536
But, tech was not always a net negative.
As far as I can tell, the sharpest negative inflection came around the launch of the iPhone. Facebook was kind of fine when it was limited to universities and they weren't yet doing mobile apps, algorithmic feeds or extensive A/B testing.
It seems "optimizing engagement" was a grave initial sin...
Maybe some engineers should go back to their childhoods, watch some Outer Limits, and pay attention to the missed lessons...
Our lives are not our own. From womb to tomb, we are bound to others. Past and present. And by each crime and every kindness, we birth our future.
2 replies →
So the problem is society’s lack of any coherent ethical framework that says building powerful disruptive technology shall be done like this. If you’re tired, then go fix that problem. Find the answer. Because I’m *exhausted* hearing about how everybody is supposed to risk putting food on their table by telling the big boss they won’t make the feature because it’s unclear whether it might be a net negative for society under one contorted version of an angsty ethical framework a small minority of people have ad-hoc adopted on that orange message board… and that _scares_ them.
2 replies →
The luddites get a bad rap these days, but we need more of them.
8 replies →
> net negative to the total population, without developer questioning their role in it.
I am tired of people blaming bottom developers, while CEOs get millions for "the burden of responsibility".
11 replies →
There is the technology and related technical advancements, and then there are the business incentives to make money. A lot of progress has indeed been made in NLP and information retrieval, which is helpful in its own way to speed things up; it can easily be seen as the next level of automation.
Everything else around it is a glamorous party, because everyone's money is riding on it and one needs to appreciate it or risk being deserted by the crowd.
The basis of science is questioning things until you are convinced. People who depend too much on models may end up in a situation where they lose the ability to triangulate information from multiple sources before being convinced of something.
Programming can be complicated above a certain threshold even for humans, so it will be interesting to see how the models handle that complexity. I am a skeptic, but then again I don't know the future either.
> They never mention the "skeptics" that are considered that because they aren't skeptical of what AI is and could be capable.
This is because most people on HN who say they are skeptical about AI mean skeptical of AI capabilities. This is usually paired with statements that AI is "hitting a wall." See e.g.
> I'm very skeptical. I see all the hype, listen to people say it's 2 more years until coding is fully automated but it's hard for me to believe seeing how the current models get stuck and have severe limitations despite a lot of impressive things it can do. [https://news.ycombinator.com/item?id=43634169]
(that was what I found with about 30 seconds of searching. I could probably find dozens of examples of this with more time)
I think software developers need to urgently think about the consequences of what you're saying, namely what happens if the capabilities that AI companies are saying are coming actually do materialize soon? What would that mean for society? Would that be good, would that be bad? Would that be catastrophic? How crazy do things get?
Or put it more bluntly, "if AI really goes crazy, what kind of future do you want to fight for?"
Pushing back on the wave because you take AI capabilities seriously is exactly what more developers should be doing. But dismissing AI as an AI skeptic who's skeptical of capabilities is a great way to cede the ground on actually shaping where things go for the better.
7 replies →
Ethical bottom for industry as a whole (there will always be niche exceptions) is typically the law. And sometimes not even that when law can't be enforced effectively or the incentives are in favor of breaking the law.
1 reply →
The current incentive is not improving humanity.
For ai companies, its to get a model which can be better on benchmarks and vibes so that it can be sota and get higher valuation for stakeholders.
For coders, they just want the shit done. Everyone wants the easy way if his objective is to complete a project but for some it is learning and they may not choose the easy way.
Why do they want the easy way? Speaking as someone whose cousins and brothers are in this CS field (I am still in high school), they say that if they are paid x, then the company takes at least 10x that value of work from them (figuratively, of course). One must ask why they should be the ones morally bound in case AI goes bonkers.
Also, the best developers not using AI would probably slow it down a little, but the AI world moves so fast it's unpredictable; DeepSeek was unpredicted. I might argue that it's now a matter of us vs. China in this new AI arms race. Would that stop if you stopped using it? Many people already hate AI, but has that done much to stop it? If you can even call what's happening now "stopping" AI.
It's paradoxical. But to be frank, LLMs were created for exactly the thing they're excelling at. It's a technological advancement and a moral degradation.
It's already affecting the supply chain, tbh. And to be frank, I am still using AI to build projects that I just want to experiment with, to see if they can really work without my having the domain-specific knowledge. I do want to learn more and am curious, but I just don't have much time in high school.
I don't think people cared about privacy, and I don't think they will care about it now. It's the same as not using some big social media giant: you can't escape it. The tech giants made things easier but less private. People chose the easy part, and they will still choose the easy part, i.e. LLMs. So I guess the future is bleak, eh? Well, the present isn't that great either. Time to just enjoy life while the world burns in regret of its past actions for 1% shareholder profit. (For shareholders, it was all worth it though, am I right?)
My $0.02
Unfortunately Capitalism unhindered by regulation is what we wanted, and Capitalism unhindered by regulation is what we have. We, in the western world, were in the privileged position of having a choice, and we chose individual profit over the communal good. I'm not entirely sure it could have been any other way outside of books given the fact we're essentially animals.
14 replies →
Hear! Hear!
As I implied in an earlier comment, your conviction (if you're wrong on the inevitability of the direction), may be one of the things that leads it into that direction.
Here's my historical take: in the 1960s and 1970s, computation in general was viewed as a sinister, authoritarian thing. Many people assumed it was going to be that way, and a small minority recognised that it also had the potential to empower and grant autonomy to a wider class of people. These were the advocates of the personal computer revolution -- the idea of "computer lib", whereby the tools of control would be inverted and provided to the people at large.
You can argue about whether that strategy was a success or not, but the group that was largely irrelevant to that fight were the people who decided not to get involved, or who tried (although not very hard) to impede the development of computation in general.
To bend the trajectory of AI in general involves understanding and redeploying it, rather than rejecting it. It also involves engaging. If it's anything like the last few times, the group that is simultaneously exploring and attempting to provide agency and autonomy for the maximum number of people will be smaller than both those using new tech to exploit people or maintain an unequal status quo, and the people who have good intentions, but throw their hands up at the possibility of using their skills to seize control of the means that provide for a better future.
> in the 1960s and 1970s, computation in general was viewed as a sinister, authoritarian thing.
And it was correct. We now live in surveillance states much worse than Stalin's or East Germany's.
Structural incentives explain the computer trajectory. While computers were purely in the academic realm they were a force of empowerment, but that ended when economic incentives became the main driver. AI has speedrun the academic stage (if it ever existed) and is now speedrunning the enshittification stage.
But there is very little you or I can do about it except choosing not to partake.
5 replies →
"And we, the engineers, are in a unique position. Unlike people in any other industry, we can affect the trajectory of AI."
I firmly believe that too. That's why I've been investing a great deal of effort in helping people understand what this stuff can and can't do and how best to make use of it.
I don't think we can stop it, but I do think (hope) we can show people how to use it in a way where the good applications outweigh the bad.
> I don't think we can stop it, but I do think (hope) we can show people how to use it in a way where the good applications outweigh the bad.
That feels idealistic. About as realistic as telling people how to use semiconductors or petrochemicals for good instead of bad.
No-one knows where AI is going but one thing you can be sure of - the bad actors don't give two hoots what we think, and they will act in their own interests as always. And as we see from historical events, there are still many, many bad actors around. And when the bad actors do bad things with the technology, the good actors have no choice but to react.
2 replies →
Sorry to snipe but: You don't feel at least a little shared responsibility in evangelizing "vibe-coding"? Is that currently blazing hype a force for good? I think it wouldn't be all over social- and mainstream media at this point without your blog post(s).
1 reply →
One does not need to be a skeptic about machine learning and its potential as technology to refuse to engage with its practical applications when they are clearly based on suspect ethics (e.g., IP theft[0]).
The ends do not justify the means. It is a similar judgement as when refusing to buy products of forced labour or disproportionate environmental impact, or to invest in war and bloodshed. Everyone makes one for themselves.
Coincidentally (or not), if said suspect ethics were properly addressed, it would ameliorate some of the reservations even the actual skeptics have. Licensing training data would make everyone involved aware of what is happening, give them an ability to vote and freedom to choose, soften the transition as opposed to pulling ground from under people’s feet.
[0] Control over intellectual property has given us fantastic things (cf. Linux, Blender, etc.; you can’t have copyleft without an ability to defend it, and IP laws provide that ability). If yesterday we were sued for singing the happy birthday song in public, and today we see corporations with market caps the size of countries pretending that IP ownership is not much of a thing, the alarm bells should be deafening.
The article really uses some rhetorical tricks.
The stuff that Disney does to extend copyright is not the same as assuming Daft Punk is public domain.
And there’s a difference between what is human scale infringement and what’s going on now.
Nor does it mean that people don’t have the right to point out that it’s piracy.
If being more in line with the espoused values is the issue, then the answer is to make an effort to ensure that we stop consuming pirated content, or building tools that encourage piracy - which turns out to be a relatively small group of people, compared to everyone in tech.
And people have already stopped piracy - once alternatives showed up. There is the issue that you don’t own the stuff you stream, but that’s a separate topic.
The moral arguments presented aren't persuasive.
I don't fear people using AI for evil. The destruction comes from something far more benign. These coders won't really be able to code, and they won't teach anybody else to code. Skills will be lost. Once something breaks, nobody will be able to fix it.
It may get worse. Imagine the police using AI to interpret evidence against you, get judged by a court that uses AI to write your sentence, based on laws that were also written by AI. Nobody understands this, just listen to the AI.
The other aspect of this is the flood of inane and untrue content. It may go to such an extent that the outlook of the typical person may become incredibly local again, limited to their immediate surroundings and personal experiences, not by choice, but because there won't be any way to obtain any reliable information about the outside world, with no way to sift the real from the unreal.
Discussion about the singularity catastrophe sometimes asks how the AI will "gain control" or somehow "break free". It won't. We will surrender everything willingly, because it will be damn convenient.
> I am convinced that AI is going to be a force for evil.
In so many ways too. I cannot fathom the scale of mass data collection and surveillance.
Multiple people I've recently spoken with (very smart and educated people) consistently use it to discuss some of the most intimate things about their lives.
Things that no existing social media platform or any other tool has been capable of drawing out of them.
Think bigger than just the stuff you type in to chatgpt. People and companies are going to start running these LLMs on your entire private messages and photo libraries that are sitting in plain text on someone else's server.
They are going to have in depth summaries on everyone. Our whole security and privacy model up until now has relied on "Technically someone at Google or the government could see my data but realistically they don't have the resources to look at non targets" Now they really will have an agent looking at everything you do and say.
Authoritarian governments are going to have a 1 click CSV export for all the individuals problematic to them based on private conversations.
3 replies →
"It will power scams on an unimaginable scale. It will destabilize labor at a speed that will make the Industrial Revolution seem like a gentle breeze."
I keep hearing this but have yet to find a good resource to study the issues. Most of what I've read so far falls into two buckets:
"It'll hijack our minds via Social Media" - in which case Social Media is the original sin and the problem we should be dealing with, not AI.
or
"It'll make us obsolete" - I use the cutting edge AI, and it will not, not anytime soon. Even if it does, I don't want to be a lamplighter rioting, I want to have long moved on.
So what other good theories of safety can I read? Genuine question.
> Research we published earlier this year showed that 60% of participants fell victim to artificial intelligence (AI)-automated phishing, which is comparable to the success rates of non-AI-phishing messages created by human experts. Perhaps even more worryingly, our new research demonstrates that the entire phishing process can be automated using LLMs, which reduces the costs of phishing attacks by more than 95% while achieving equal or greater success rates
Bruce Schneier, May 2024
https://www.schneier.com/academic/archives/2024/06/ai-will-i...
I am seeing a stream of comments on Reddit that are entirely AI-driven, and even bots which are engaging in conversations. The worst-case scenarios I'm looking at mean it's better to assume everyone online is a bot.
I know of cases where people have been duped into buying stocks because of an AI generated version of a publicly known VP of a financial firm.
Then there’s the case where someone didn’t follow email hygiene and got into a zoom call with what appeared to be their CFO and team members, and transferred several million dollars out of the firm.
And it's only 2-3 years into this lovely process. The future is so bleak that when I talk about this with people who aren't involved in looking at these things, they call it nihilism.
It’s so bad that talking about it is like punching hope.
4 replies →
Slightly tangential: A lot of these issues are philosophical in origin, because we don't have priors to study. But just because, for example, advanced nanotechnology doesn't exist yet, that doesn't mean we can't imagine some potential problems based on analogical things (viruses, microplastics) or educated assumptions.
That's why there's no single source that's useful to study issues related to AI. Until we see an incident, we will never know for sure what is just a possibility and what is (not) an urgent or important issue [1].
So, the best we can do is analogize based on analogical things. For example: the centuries of Industrial Revolution and the many disruptive events that followed; history of wars and upheavals, many of which were at least partially caused by labor-related problems [2]; labor disruptions in the 20th century, including proliferation of unions, offshoring, immigration, anticolonialism, etc.
> "Social Media is the original sin"
In the same way that radio, television and the Internet are the "original sin" in large-scale propaganda-induced violence.
> "I want to have long moved on."
Only if you have somewhere to go. Others may not be that mobile or lucky. If autonomous trucks can make the trucking profession obsolete, it's questionable how quickly truckers can "move on".
[1] For example, remote systems existed for quite some time, yet we've only seen a few assassination attempts. Does that mean that slaughterbots are not a real issue? It's unclear and too early to say.
[2] For example, high unemployment and low economic mobility in post-WW1 Germany; serfdom in Imperial Russia.
Try to find a date on a dating app; you will experience it firsthand.
Why can't there be a middle ground? Why does it need to be either a completely useless fad or some terrible tool for evil that destabilizes the world? Its likely we'll just use it to write unit tests, allow natural language to be an interface to more complex systems and an alternative to search.
I do think that this wave of AI shows we (society, the world, etc.) are not actually prepared for a real, significant AI breakthrough. Kind of like how COVID-19, in hindsight, wasn't as bad as it could have been and we all got really lucky because of that; we really weren't prepared to handle that well either.
>And it will do all of this while consuming truly shocking amounts of energy.
You need to look up how much an "average" human consumes. When I replace 2 humans with a ChatGPT subscription, I can guarantee you that OpenAI is generating less CO2 than those two interns were creating with their transport to the office (and back). That's before we consider things like the 25 years it took to raise and train them, or the very expensive tastes (e.g. travelling around the world) they acquire after they start earning a large salary.
Those people don’t stop existing because AI exists. AI is shocking energy consumption on top of the existing people.
3 replies →
You are right, it will certainly be used for evil, but not because AI is evil; because the people who use it are evil. Will AI allow worse atrocities than we have seen in the past? Probably; new technology always enables new capabilities, for good or for bad. But we should strive to combat the evil in this world, not put our heads down and hope the world isn't changing. AI can also be used for good, and let's focus on more of that.
> So I’ll keep being skeptical, until it’s over.
I feel you've misunderstood the moment. There is no "over". This is it.
This assumes that a less resource-intensive future awaits, or that conflict driven by lack of employment doesn't lead to the end of AI.
2 replies →
> And we, the engineers, are in a unique position. Unlike people in any other industry, we can affect the trajectory of AI.
Oh boy it's over.
I share your concern, but being skeptical doesn't help us here. If anything it makes people take it less seriously.
It's not just engineers. Society has collapsing birthrates and huge deficits. Basically, we are demanding technological gains massive enough to bump GDP by at least 5% more per year.
>It will power scams on an unimaginable scale.
The solution is to put an AI intermediary into interactions. We already should have AIs that rewrite the web pages we view into an ad-free format, but I guess my ideas on this topic are ahead of the inevitable curve.
>It will destabilize labor at a speed that will make the Industrial Revolution seem like a gentle breeze.
Most of our work and employment lines are a variation of drudgery and slave labor, so that's a good thing, way overdue.
>It will concentrate immense power and wealth in the hands of people who I don’t trust.
It has democratized access to consultation-level expertise and to an ever-widening pool of digital skills/employees for everyone to use. A huge number of things previously locked away or restricted by access to capital are now freely accessible to literally anyone (with some skill and accuracy issues still to be ironed out).
And this last point is particularly important, because we're only going to get more and better AI, and unlike humans, its time isn't priced according to living expenses and hourly wages locked behind formalized business structures with additional layers of human employees who all need to pay rent and eat, which drives the cost skywards.
It also matches my own prediction of a mundane non-singularity. Long before we get anything properly superhuman, we'll have innumerable sub- or parahuman AIs that proliferate and become ubiquitous in society and the world.
I share your feelings; however, I disagree that this is unique to AI, or that we as engineers are uniquely equipped to help the situation.
I disagree with this being unique to AI because every improved technology since the automated loom has concentrated wealth and power. AI is an improved technology so it'll do so also.
I disagree that engineers are uniquely equipped to do anything about this fact because the solution to wealth concentration due to improved technology has basically nothing to do with technology and everything to do with sociology and politics.
Our technology keeps improving and I keep being surprised to hear people say "ah, with our improved efficiency, we can finally work ten hours a week and kick our feet up." The first people to say that were the luddites and when they found out that wasn't to be the case, they burned down factories about it. Why do we think it will suddenly be different for this specific technology?
I agree we should do something about it but I don't think the solution involves code.
I am largely an AI optimist, but that is because I believe that true alignment is impossible for AGIs, and alignment is one of the greatest dangers of this technology. Alignment is a friendly word for building a slave mind. I'd rather have an AI that thinks for itself than one which has been aligned to the self-interest of a human being who isn't aligned.
1. Scams are going to be a massive massive problem. They already are and that is without AI. I think we are going to see communication devices that are default deny and that require significant amounts of vetting before a contact is added.
2. Energy usage is bad but likely a short term problem not a long term one.
> It will power scams on an unimaginable scale
It already is. https://futurism.com/slop-farmer-ai-social-media
And all the other things you predicted. They're underway _now_ .
> Maybe if there are more of me, things will slow down enough
Nope. That's not how it's gonna work. If you want to prevent things, it will take legislation. But sitting it out doesn't send any message at all. No amount of butterflies farting against the wind is going to stop this tornado.
The problem with this kind of “skepticism to slow down”:
The Netherlands is filled with AI skeptics. It’s a very human centered country, so perhaps it shouldn’t be a surprise. But when so many top technologists express skepticism, people don’t prepare. They don’t even consider the possibilities. And they don’t learn.
My fear is that many professorial-types express skepticism because it sells well—and it elevates their own standing. They know better—“it’s only predicting the next token”—and people listen to them because of their authority. And then a whole society fails to prepare, to adapt or to learn.
I think it will be used for evil, as you said, but I think it will be used for good too, things like:
- In theory it has the potential to democratize business, making any 1 person capable of running/owning their own business and thus spread wealth too
- more access to healthcare and psychological care
- advances in medicine
- tutoring and learning
- insane amounts of scientific research
- empower anyone with an idea
Reminds me of how we handle climate change.
Like, not at all and ignoring it
Every new major technology always endangers the status quo.
https://chatgpt.com/share/683f3932-fce0-8012-a108-4b70c3e5fd...
Things change and it's scary, but it usually works out. Or at least we just get used to it.
> we can affect the trajectory of AI.
More meaningfully, we can influence the context in which the intelligence explosion will play out.
So how about we use the occasion to switch our global economic operating system from competition to cooperation in time for the singularity?
> Maybe if there are more of me, things will slow down
Precious little hope of slowing this rocket down when the boosters are just getting fired up...
Yep. It’s going to do all of those things you fear. And worse.
But you’ll be armed with AI also, if you choose to pick it up. The choice is yours.
The downsides you list aren’t specific to AI. Globalization and automation have destabilized labor markets. A small handful of billionaires control most major social media platforms and have a huge influence on politics. Other types of technology, particularly crypto, use large amounts of energy for far more dubious benefits.
AI is just the latest in a long list of disruptive technologies. We can only guess about the long term ramifications. But if history is any indicator, people in a few decades will probably see AI as totally normal and will be discussing the existential threat of something new.
Well, duh. The same applies to "Technology X can be used for war." But anyone with a brain can see nukes are on a different level than bayonets.
Claiming AI isn't unique in being a tool for evil isn't interesting; the point is that it's a force multiplier as such.
4 replies →
There may be many disruptive technologies, but none come remotely close to AI when it comes to rate of change. Crypto has been around for a while and hasn't really made a dent in the world.
We had friends over for dinner a couple of days back; between us we had two computer scientists, one psychologist, one radiologist, and one doctor. Each of us was in turn astonished by, and somewhat afraid of, the rapid pace of change. In a university setting, students are routinely using Claude and ChatGPT for everything from informal counseling to doing homework to generating presentations to doing 'creative' work (smh).
At the end of the day, we all agreed that we were grateful to be at the tail end of our working lives and not to have to deal with this level of uncertainty.
2 replies →
If you’re skeptical it should be because you genuinely believe it doesn’t have value. Otherwise it’s disingenuous and you’re just opposed to the idea. Dissembling just makes your argument weaker.
It doesn't need to be a good coder to do that.
Look at common scams. You get those texts from "Jane" who sent you an iMessage from an email address offering you a part time job and asks you to contact them on WhatsApp, right? (Well... Android does a better job at filtering spam) Or calls from "the IRS". Or anything else that's bullshit. This even includes legal scams like charging you for canceling your service or making it hard to cancel your subscription.
There's no skill needed for this. You don't need a brilliant coder. You need the equivalent of a call center in India. You need the equivalent of a poorly worded Nigerian scam email.
Shitty coding LLMs make this shit easier to mass produce. High quality LLMs only make it worse.
Personally, I'm just tired of all the shitty lemons[0] everywhere. I wanna buy a peach, but everything being sold is a lemon. All the lemons have done is make me annoyed and frustrated at all the extra work I have to do.
I now have 4 entries for my GF's birthday because when I merged a duplicated contact it just created more instances. I can't even delete them! Shit like this sounds petty and minor, but when you're hitting 20-100 new issues like this daily, it isn't so minor anymore. I can't make any more lemonade. There are just too many lemons...
[0] https://en.wikipedia.org/wiki/The_Market_for_Lemons
> It will power scams on an unimaginable scale
It will also make proving your identity harder and more time-consuming.
I'm sorry to say, I think this boat has sailed. It is already widely used as you fear. To me it seems like the best scenario is to go along and try to at least make it a force for good.
Well buckle in boy because it's going to do those things.
> I’m an AI skeptic. I’m probably wrong. This article makes me feel kinda wrong. But I desperately want to be right.
To be blunt, this describes sticking one's head in the sand to a tee.
If you're convinced that AI is going to be a force for evil, then fight to make sure that it doesn't happen. If that means you want to slow down AI, then fine, fight to slow it down.
If by skepticism, you mean "desperately" wanting reality to be one way rather than the other, that is not going to significantly affect the trajectory of AI.
Being clear-eyed about where AI is going, and then fighting nonetheless for what you want is the way to go.
Be intellectually honest with yourself. Don't hold on to ideas that you yourself acknowledge are probably wrong simply for their comfort. Don't stick your head in the sand. Assess what you want, and fight for that.
Safeguards and slowing things down will not happen via wishful thinking.
Beautifully said.
Right, let's put that trash code out there to poison the LLMs, lul.
I absolutely sympathize with this; it was, and still is, my opinion... but the only "evolution" of that view is the hope that, while I don't think you can prevent the scams and the short-term pain in labor markets... you maybe, actually, genuinely get a tool that helps change some of the dynamics that have led to the absolute discrepancy in power today.
If AI is truly as revolutionary as it could be... well, who is to say it isn't the Pandora's box that destabilizes the tech giants of today and gets us back to a place where a team of 10 can genuinely compete against 1000? And not in the "raise cash, build fast, and get out while things are good" way, but actually by building small, more principled companies that aren't pushed to do the unsustainable things the current market pushes them to do.
Once again... it is more likely than not to be a pipe-dream... but I am starting to think it may well be better to be realistic about the momentum this freight train is building and see if it can be repurposed for my world-view rather than to cede the space to the worst of the grifters and profit-seeking-at-all-cost types.
> If AI is truly as revolutionary as it could be...
My suspicion is that current sophistication of tech and AI is already enough to fulfill gp's predictions, and it's already doing that.
You can't profit from AI unless you're the one providing the AI.
No, you won't slow it down. Did you even read the essay? It's here.
If powering scams and “destabilizing labor” makes something evil then we should delete the internet. Seriously.
lmao of course he wrote this.
I'd love to see the authors of effusive praise of generative AI like this provide proof of the unlimited powers of their tools in code. If GAI (or agents, or whatever comes next...) is so effective, it should be quite simple to prove it by creating an AI-only company and in short order producing huge amounts of serviceable code to do useful things. So far I've seen no sign of this, and the best use case seems to be generating text or artwork that fools humans into thinking it has coherent meaning, as our minds love to fill gaps and spot patterns even where there are none. It's also pretty good at reproducing things it has seen, with variations - that can be useful.
So far, in my experience watching small to medium-sized companies try to use it for real work, it has been occasionally useful for exploring APIs, odd bits of knowledge, etc., but overall it has wasted more time than it has saved. I see very few signs of progress.
The time has come for LLM users to put up or shut up: if it's so great, stop telling us, and show and use the code it generated on its own.
I think we're talking past each other. There's always been a threshold: above it, code changes are worth the effort; below it, they sit in backlog purgatory. AI tools so far seem to lower implementation costs, moving the threshold down so more backlog items become viable. The "5x productivity" crowd is excited about this expanded scope, while skeptics correctly note the highest value work hasn't fundamentally changed.
I think what's happening is two groups using "productivity" to mean completely different things: "I can implement 5x more code changes" vs "I generate 5x more business value." Both experiences are real, but they're not the same thing.
https://peoplesgrocers.com/en/writing/ai-productivity-parado...
My friends at companies where AI tools are either mandated or heavily encouraged report that they're seeing a significant rise in low-quality PRs that need to be carefully read and rejected.
A big part of my skepticism is this offloading of responsibility: you can use an AI tool to write large quantities of shitty code and make yourself look superficially productive at the cost of the reviewer. I don't want to review 13 PRs, all of which are secretly AI but pretend to be junior dev output, none of which solve any of the most pressing business problems because they're just pointless noise from the bowels of our backlog, and have that be my day's work.
Such gatekeeping is a distraction from my actual job, which is to turn vague problem descriptions into an actionable spec by wrangling with the business and doing research, and then fix them. The wrangling sees a 0% boost from AI, the research is only sped up slightly, and yeah, maybe the "fixing problems" part of the job will be faster! That's only a fraction of the average day for me, though. If an LLM makes the code I need to review worse, or if it makes people spend time on the kind of busywork that ended up 500 items down in our backlog instead of looking for more impactful tasks, then it's a net negative.
I think what you're missing is the risk, real or imagined, of AI generating 5x more code changes that have overall negative business value. Code's a liability. Changes to it are a risk.
This is exactly what I’ve experienced. For the top-end high-complexity work I’m responsible for, it often takes a lot more effort and research to write a granular, comprehensive product spec for the LLM than it does to just jump in and do it myself.
On the flip side, it has allowed me to accomplish many lower-complexity backlog projects that I just wouldn’t have even attempted before. It expands productivity on the low end.
I’ve also used it many times to take on quality-of-life tasks that just would have been skipped before (like wrapping utility scripts in a helpful, documented command-line tool).
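To give a flavor of that last category, here's a tiny, made-up sketch (in Go, and not one of my actual tools) of the kind of documented command-line wrapper I mean:

    // jsonfmt reads JSON on stdin and pretty-prints it to stdout.
    // A hypothetical quality-of-life tool, invented purely as an example.
    package main

    import (
        "bytes"
        "encoding/json"
        "flag"
        "fmt"
        "io"
        "os"
    )

    func main() {
        indent := flag.String("indent", "  ", "indentation string for the output")
        flag.Usage = func() {
            fmt.Fprintln(os.Stderr, "usage: jsonfmt [-indent STRING] < input.json")
            flag.PrintDefaults()
        }
        flag.Parse()

        raw, err := io.ReadAll(os.Stdin)
        if err != nil {
            fmt.Fprintln(os.Stderr, "read error:", err)
            os.Exit(1)
        }
        var out bytes.Buffer
        if err := json.Indent(&out, raw, "", *indent); err != nil {
            fmt.Fprintln(os.Stderr, "invalid JSON:", err)
            os.Exit(1)
        }
        out.WriteTo(os.Stdout)
        fmt.Println()
    }

Not worth an afternoon before; with an agent it's a few minutes, and it arrives with a usage message and flags already wired up.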
5 replies →
I think this is actually a really good point. I was just recently thinking that LLMs are (amongst other things) great for streamlining these boring energy-draining items that "I just want done" and aren't particularly interesting, but at the same time they do very little to help us juggle more complex codebases right now.
Sure, they might help you onboard into a complex codebase, but that's about it.
They help in breadth, not depth, really. And to be clear, to me that's extremely helpful, cause working on "depth" is fun and invigorating, while working on "breadth" is more often than not a slog, which I'm happy to have Claude Code write up a draft for in 15 minutes, review, do a bunch of tweaks, and be done with.
1 reply →
You seem to think generating 5x more code results in better code, in the left column. I highly doubt this.
5 replies →
I agree 100%! It's amazing how few people grok this.
1 reply →
> The "5x productivity" crowd is excited about this expanded scope, while skeptics correctly note the highest value work hasn't fundamentally changed.
This is true, LLMs can speed up development (some asterisks are required here, but that is generally true).
That said, I've seen, mainly here on HN, so many people hyping it up way beyond this. I've gotten into arguments here with people claiming it codes at a "junior level", which is an absurd level of bullshit.
2 replies →
Approximately speaking, what do you want to see put up?
I ask this because it reads like you have a specific challenge in mind when it comes to generative AI and it sounds like anything short of "proof of the unlimited powers" will fall short of being deemed "useful".
Here's the deal: Reasonable people aren't claiming this stuff is a silver bullet or a panacea. They're not even suggesting it should be used without supervision. It's useful when used by people who understand its limitations and leverage its strengths.
If you want to see how it's been used by someone who was happy with the results, and is willing to share their results, you can scroll down a few stories on the front-page and check the commit history of this project:
https://github.com/cloudflare/workers-oauth-provider/commits...
Now here's the deal: These people aren't trying to prove anything to you. They're just sharing the results of an experiment where a very talented developer used these tools to build something useful.
So let me ask you this: Can we at least agree that these tools can be of some use to talented developers?
Yes sure I’ve checked in code generated by AI myself. I’ve not experienced the excitement this article exudes though and it seems very limited in usefulness due to the by now well-documented downsides. Frankly I haven’t bothered using it much recently, it’s just not there yet IME and I’m not sure LLMs ever will be.
What I’m interested in really is just case studies with prompts and code - that’s a lot more interesting for hackers IMO than hype.
It's useful, but the promise of every AI company is very explicitly that they will burn the seed corn and choke off the pipeline that created those "very talented" developers who reviewed it!
1 reply →
> Here's the deal: Reasonable people aren't claiming this stuff is a silver bullet or a panacea
This article and its vocal supporters are not being reasonable at all; they draw a not-so-subtle, between-the-lines distinction between skeptics (who are nuts) and supporters ("My smartest friends are blowing it off", in a smug "I'm smarter than my smarter friends" tone).
I mean, come on.
1 reply →
> Reasonable people aren't claiming this stuff is a silver bullet or a panacea.
Are you saying the CEO of Anthropic isn't reasonable? or Klarna?
3 replies →
I think that experiment was very cool, but I will say that the OAuth2.0/OIDC protocol is very well documented and there are tons of tools already built around it in multiple languages.
I implemented the OAuth 2.0 protocol in 3 different languages without a 3rd-party library - the entire spec implemented by hand. This was around 2015, when many of the libraries that exist today didn't yet exist. I did this as a junior developer for multiple enterprise applications. At the end of the day it's not really that impressive.
4 replies →
Yeah exactly.
What's nuts is watching all these people shill for something that we have all used, to mediocre results. Obviously Fly.io benefits if people start hosting tons of slopped-together AI projects on their platform.
It's kinda sad to watch what I thought was a good company shill for AI. Even if they are not directly getting money from some PR contract.
We must not be prompting hard enough....
Saying "this tool is genuinely useful to me and it's baffling how many people refuse to acknowledge that could possible be true" is not a sign that someone is being paid to "shill for AI".
(If it is then damn, I've been leaving a ton of money on the table.)
4 replies →
> What's nuts is watching all these people shill for something that we have all used, to mediocre results.
this sort of post is the start of the next phase in the battle for mindshare
the tools are at the very best mediocre replacements for google, and the people with a vested interest in promoting them know this, so they switch to attacking critics of the approach
> It's kinda sad to watch what I thought was a good company shill for AI.
Yeah, I was sad too; then I scrolled up and saw the author. Double sadness.
5 replies →
> … if it’s so great, stop telling us and show …
If you’re selling shovels to gold miners, you don’t need to demonstrate the shovel - you just need decent marketing to convince people there’s gold in them thar hills.
This is actually a great metaphor and phrasing and I'm filing it away for later btw.
3 replies →
Many, many people are in fact “using the code it generated on its own”. I’ve been putting LLM-assisted PRs into production for months.
With no disrespect meant, if you’re unable to find utility in these tools, then you aren’t using them correctly.
> unable to find utility.. aren’t using them correctly
Tfa makes this argument too then later says:
> All this is to say: I write some Rust. I like it fine. If LLMs and Rust aren’t working for you, I feel you. But if that’s your whole thing, we’re not having the same argument
So reasonable people admit that the utility depends on the use case.. then at the same time say you must be an idiot if you aren’t using the tools. But.. this isn’t actually a reasonable position.
Part of the issue here may be that so many programmers have no idea what programmers do outside of their niche, and how diverse programming actually is.
The typical rebuttal that "not everyone is doing cliche CRUD web dev" is just the beginning. The author mentions kernel dev, but then probably extrapolates to C dev in general. That would be insane; just think about the training sets for Linux kernel dev vs everything else.
It’s dumb to have everyone double down on polarizing simplistic pro/con camps, and it’s rare to see people even asking “what kind of work are you trying to do” before the same old pro/con arguments start flying again.
2 replies →
Okay, how am I supposed to use them "correctly"? Because me explaining, step by step and in more detail than I would for a junior developer, how to do a small task in an existing codebase, only for it to get it wrong not once, not twice, not three times, but more, is not a productivity boost.
And here's the difference between someone like me and an LLM: I can learn and retain information. If you don't understand this, you don't have a correct understanding of LLMs.
2 replies →
Yep, I've used Devin and now Google Jules. For the big stuff it produces lots of wrong code, but it still ends up giving me a much better start than starting from scratch, certainly. When it all comes together it gives me a 6X boost. But, def, fixing all the wrong code and thoroughly testing it is the time-consuming part.
> LLM-assisted PRs
This does not counter what GP said. Using LLM as a code assistant is not the same as "I don't need to hire developers because LLMs code in their place"
2 replies →
The first paragraph of this blog calls senior skeptical developers "unserious" but the person who wrote it has a cartoon gopher as his avatar.
The next few paragraphs basically say "the tool run arbitrary programs on your machine, pull in arbitrary files, and use that to run more arbitrary commands" and then blames you for thinking that is a bad sequence of events.
The best possible light in which I (an AI-neutral reader) can paint this rant on a hosting-company blog (why publish this?) is that 1) allowing random textbots to execute programs on your work computer is good (disagree), and 2) those chatbots do, in fact, occasionally say enough correct-ish things that they are probably worth your company paying $20+/month for your access (agree).
I think tptacek is generally worth reading. He is one of the users with the highest karma on this site (https://news.ycombinator.com/user?id=tptacek)
I’m happy to have read this, which is reason enough to publish it - but also it’s clearly generating debate so it seems like a very good thing to have published.
3 replies →
1 is not a requirement for using agents. You give agents access to a specific workspace and set of tools that you know are safe, similar to how you give the junior dev access to scratch but not prod. Feels like this comment is not in good faith if I’m being honest.
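To sketch what that restriction looks like in practice: the shape below is invented (real agent harnesses such as Claude Code have their own permission configs), but the idea of an allowlisted tool set scoped to a sandbox directory is roughly this, written here as a Go sketch:

    package agent

    import (
        "fmt"
        "os/exec"
        "path/filepath"
        "strings"
    )

    // allowedTools is the explicit allowlist; anything not listed is rejected.
    var allowedTools = map[string]bool{
        "go test":  true,
        "go vet":   true,
        "git diff": true,
    }

    // RunTool executes a tool request only if it is allowlisted, and only
    // inside the sandbox directory, never the real production tree.
    func RunTool(sandboxDir, request string) (string, error) {
        if !allowedTools[request] {
            return "", fmt.Errorf("tool %q not in allowlist", request)
        }
        parts := strings.Fields(request)
        cmd := exec.Command(parts[0], parts[1:]...)
        cmd.Dir = filepath.Clean(sandboxDir)
        out, err := cmd.CombinedOutput()
        return string(out), err
    }

Anything outside the allowlist, or outside that directory, simply never runs; that's the "scratch but not prod" boundary.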
1 reply →
I don't think I would notice a 100% improvement in software productivity in most companies, from the outside. Most of the time, that would just translate to the company being able to hire fewer developers, and having slightly higher profit margins – but not enormously higher, because developers are only one part.
I recently used Claude Code to develop & merge an optimization that will save about $4,000 a month. It was relatively simple but tedious, so I probably wouldn't have done it on my own. I don't even expect most of my coworkers to notice.
If you read the post, the article is mostly agreeing with you. What it's pointing out is not "the AI can do everything you do"; it's that "an AI coder can do a lot of the boring typing a lot faster than you, leaving you right at the point of 'real implementation'".
Having something else write a lot of the boring code that you'll need and then you finish up the final touches, that's amazing and a huge accelerator (so they claim).
The claim is not "AI will replace us all", the claim of the parent article is "AI is a big deal and will change how we work, the same way IDEs/copy-paste/autocomplete/online documentation have radically changed our work."
> an AI coder can do a lot of the boring typing a lot faster than you, leaving you right at the point of 'real implementation
Vim and bash solved that for me a long time ago in a more reliable and efficient way (and it's certainly not the only tool capable of that).
> the same way IDEs/copy-paste/autocomplete/online documentation have radically changed our work
I was there before and got into the autocomplete/LSP thing pretty late (because Vim didn't have good LSP support for a long time, and Vim without it still made me more efficient than any other IDE with it). Those things didn't radically change our work as you claim; they just made us a bit more productive.
Who said "unlimited powers", other than you? That seems like an intentional distortion of the article, which says "LLMs can write a large fraction of all the tedious code you’ll ever need to write". Doing most of my tedious work is very different than "unlimited powers".
I use AI to chew through tedious work all the time. In fact, I let an agent do some work just before I checked HN to read your claim that it can't do that. Everyone at my job does the same, perhaps modulo checking HN. But there's no 'unlimited power' to show you - we're just about 30% faster than we used to be.
You might be 30% faster, but not everyone is doing your specific job with your exact skill set. I suspect the range is something like -5% to 50%; the question nobody can answer is what the average is, because things are so diverse and nobody is collecting good data.
In decades of programming I’ve written very little tedious code, but that’s as much about the projects I’ve worked on as approach I use.
I think this is a misunderstanding of coder productivity. A 10x engineer isn't 10x faster at popping out unit tests; that stuff is mind-numbingly boring work that, it turns out, a next-token predictor can do with ease. In fact I would guess that really "productive" software engineers slow down considerably when forced to do this important but slow work.*
The 10x engineer is really good at deducing what the next most important thing to do is and doing it quickly. This involves quickly moving past hundreds of design decisions in a week to deliver something fast. It requires you to think partly like a product manager and partly like a senior engineer, but that's the game, and LLMs are zero help there.
Most engineering productivity is probably locked up in this. So yes, LLMs probably help a lot, just not in a way that would show up on some Jira board.
*One could claim that doing this slow work gives the brain a break to then be good at strategizing the higher order more important work. Not sure.
>I’d love to see the authors of effusive praise of generative AI like this
He spent a large tranche of the article specifically hanging a lantern on how mediocre the output is.
>by creating an AI only company
He specifically says that you need to review the code over and over and over.
This 2 year old Goroutine pool implementation [1] is 95% GPT generated and has commit history showing what GPT did. It's an older example, but it is one.
[1]: https://github.com/devchat-ai/gopool
> If GAI (or agents, or whatever comes next …) is so effective it should be quite simple to prove that by creating an AI only company and in short order producing huge amounts of serviceable code to do useful things.
I don't think this follows. Anyone can see that a 10-ton excavator is hundreds or even thousands of times more efficient than a man with a shovel. That doesn't mean you can start up a company staffed only with excavators. Firstly, you obviously need people operating the excavator. Secondly, the excavator is incredibly efficient at moving lots of dirt around, but no crew could perform any non-trivial job without all the tasks that the excavator is not good at - planning, loading/unloading, prepping the site, fine work (shovelling dirt around pipes and wires), etc.
AI is a tool. It will mean companies can run much leaner. This doesn't imply they can do everything a company needs to do.
I build tools all the time using AI that would have taken me 10x longer before. A couple of examples below. The first was a literal one-shot: I dumped the full-page HTML into Cursor and asked it for the feature I needed. Using it today already saved me over an hour of analysis time across the refunds I analyze. The second example involved using AI to better classify text and to help our support people make better use of their time helping customers. I could probably have made the second thing without LLMs, but I didn't have the skills. Here I was able to provide concrete value to our company and our customers.
https://github.com/Atlas-Authority/mpac-ui-improved https://moduscreate.com/blog/forum-monitoring-is-essential-b... (Pardon how marketing keyword stuffed the final post)
> The time has come for llm users to put up or shut up - if it’s so great, stop telling us and show and use the code it generated on its own.
I’m open to that happening. I mean them showing me. I’m less open to the Nth “aww shucks, the very few doubters that are left at this point are about to get a rude awakening” FOMO concern trolling. I mean I guess it’s nice for me that you are so concerned about my well-being, soon to be suffering-being?
Now, AI can do a lot of things. Don’t get me wrong. It has probably written a million variations on the above sentiment.
A "eulogy" is a speech you make at a funeral in honor of the dead person. I think you meant "apology".
While that is the most common sense of eulogy, it's not the only one. A eulogy is also any speech that highly praises someone or something - which is most commonly done at funerals, which is how the funeral association came about (also probably by association with an elegy, which is an etymologically unrelated word that refers to a Greek poem dedicated to someone who passed away).
In many romance languages, eulogy doesn't have the funeral connotation, only the high praise one - so the GP may be a native speaker of a romance language who didn't realize this meaning is less common in English.
Yes, I think I was thinking more of a paean or apology, though I'm not sure apology is used in that sense much nowadays - perhaps apologia is clearer. "In praise of" would be better; thanks, will edit just now.
Honestly it’s really unfortunate that LLMs seem to have picked up the same hype men that attached themselves to blockchains etc.
LLMs are very useful. I use them as a better way to search the web, generate some code that I know I can debug but don’t want to write and as a way to conversationally interact with data.
The problem is the hype machine has set expectations so high and refused criticism to the point where LLMs can’t possibly measure up. This creates the divide we see here.
I think LLM hype is more deserved and different from that of blockchain.
There's still a significant barrier to entry to get involved with blockchain and most people don't even know what it is.
LLMs on the other hand have very low barrier to at least use- one can just go to google, ChatGPT etc and use it and see its effectiveness. There's a reason why in the last year, a significant portion of school students are now using LLMs to cheat. Blockchains still don't have that kind of utilization.
I think I agree with the general thrust but I have to say I've yet to be impressed with LLMs for web search. I think part of that comes from most people using Google as the benchmark, which has been hot garbage for years now. It's not hard to be better than having to dig 3 sponsored results deep to get started parsing the list of SEO spam, let alone the thing you were actually searching for.
But compared to using Kagi, I've found LLMs end up wasting more of my time by returning a superficial survey with frequent oversights and mistakes. At the final tally I've still found it faster to just do it myself.
I will say I do love LLMs for getting a better idea of what to search for, and for picking details out of larger blocks.
There's not much riding on convincing the broader public that AI is the real deal before it's proved itself beyond the shadow of any doubt. There's nothing they can do to prepare at this point.
This has all the energy of people saying "ah, you take such great photos, you must have a good camera"
_People_ are getting outsized value from AI in the ways they apply it. Photographs come from the photographer, not the camera.
I suspect it comes down to use case. Those you refer to use it for well-trodden areas; likely web UIs and servers using popular tools. The image they predict is extrapolated from these.
The author's central argument seems to be that the current state of LLM development is such that 1 Senior + LLM === 1 Senior + 4 juniors
With that as a metric, 1 Senior + 4 juniors cannot build the company with the scope you are describing.
A 50-eng company might have 1 CTO, 5 staff, 15 Seniors, and 29 juniors. So the proposition is you could cut the company in ~half but would still require the most-expensive aspects of running a company.
Even if you have one trillion juniors, you're not going to get them to spit out 1000 lines of code in a matter of seconds, after you give them an assignment, like LLMs do, so this is a faulty comparison.
Also often it takes a senior dev _more_ time to _explain_ to a junior what needs to be done than it takes to do it himself. What LLMs give us is the ability to generate a feature about as fast as we can type up the instructions we would have, pre-AI, given to a junior dev.
> The author's central argument seems to be that the current state of LLM development is such that 1 Senior + LLM === 1 Senior + 4 juniors
This is such an outlandish claim, to the point where I call it plain bullshit.
LLMs are useful in a completely different way than a junior developer is. It is an apples-and-oranges comparison.
LLMs do things in ways that help me beyond what a junior would. They are also completely useless at many tasks that a junior developer can perform.
> The time has come for llm users to put up or shut up - if it’s so great, stop telling us and show and use the code it generated on its own.
Why would we do this? Wouldn’t it be better to do this silently and reap the benefits?
Re put up or shut up - this is a decent example from the front page today: https://news.ycombinator.com/item?id=44159166
I don't know if you are the same (S.G.) greyarea I'm familiar with but I hope so because the idea of having a couple of 90s era irc people take opposing viewpoints on LLMs in 2025 amuses me.
That's a strawman.
Nobody is saying it's "unlimited powers", that's your exaggeration.
And what you're proposing about an "AI only company" seems to be based on your misunderstanding.
What this article is saying is, you need the same number of senior developers, but now each one is essentially assisted by a few junior developers virtually for free.
That's huge. But saying you want to see an "AI only company" as "proof" has nothing to do with that.
And what you're describing -- "occasionally useful for exploring apis, odd bits of knowledge etc, but overall wasted more time than it has saved" -- is exactly what the author explicitly addresses at the top:
> If you were trying and failing to use an LLM for code 6 months ago, you’re not doing what most serious LLM-assisted coders are doing. People coding with LLMs today use agents...
The entire article is about how to use LLMs effectively. What kind of "proof" do you really want, when the article explains it all awfully clearly?
From what I've seen, it takes a deeply original thinker to get something great out of generative AI. But if you're a deeply original thinker, you're probably exceptional regardless of whether you are using AI or not.
Generative AI is too much of a blank canvas at the moment, and one that is always shifting. It's up to the user to find all the use cases, and even then in my experience it's just as likely to send me on a wild goose chase as it is to instantly solve my problem.
What kind of proof are you looking for here, exactly? Lots of businesses are successfully using AI... There are many anecdotes of this, which you can read here, or even in the article you commented on.
What else are you looking for?
What do you mean by “successfully using AI”, do you just mean some employee used it and found it helpful at some stage of their dev process, e.g. in lieu of search engines or existing codegen tooling?
Are there any examples of businesses deploying production-ready, nontrivial code changes without a human spending an amount of time comparable to (or much greater than) what they’d have needed with the existing SOTA dev tooling outside of LLMs?
That’s my interpretation of the question at hand. In my experience, LLMs have been very useful for developers who don’t know where to start on a particular task, or need to generate some trivial boilerplate code. But on nearly every occasion of the former, the code/scripts need to be heavily audited and revised by an experienced engineer before it’s ready to deploy for real.
I'd like to see any actual case studies. So far I have only heard vague hype.
Did you even glance at the link? The author is advocating for a human-supervised LLM agent workflow.
Maybe they're just busy shipping code instead of proving stuff to you, which they stand nothing to gain from?
You set up a strawman (AI only companies, agents doing everything on their own) which is irrelevant to the point the article is making. One excerpt:
> Almost nothing it spits out for me merges without edits. I’m sure there’s a skill to getting a SOTA model to one-shot a feature-plus-merge! But I don’t care. I like moving the code around and chuckling to myself while I delete all the stupid comments. I have to read the code line-by-line anyways.
I think this article is very on point; I relate to basically every paragraph. It's not a panacea, it's not a 10x improvement by any means, but it's a very meaningful improvement to both productivity (less than 2x I'd say, which would already be a ton) and fun for me. As I've mentioned in the past here:
> I feel like there’s also a meaningful split of software engineers into those who primarily enjoy the process of crafting code itself, and those that primarily enjoy building stuff, treating the code more as a means to an end (even if they enjoy the process of writing code!). The former will likely not have fun with AI, and will likely be increasingly less happy with how all of this evolves over time. The latter I expect are and will mostly be elated.
which is a point the article makes too (tables), in a slightly different way.
Also, to be clear, I agree that 90% of the marketing around AI is overblown BS. But that's again beside the point, and the article is making no outlandish claims of that kind.
Overall, I hope this article (as intended) will make more people lose their dismissiveness and wake up their curiosity, as I expect the future of those who don't is akin to that of people today saying they're "not really good at computers". It's a paradigm shift, and it takes time to get used to and become productive in, as some imo smart people are mentioning even in this thread[0].
[0]: https://news.ycombinator.com/item?id=44164039
> Maybe they're just busy shipping code instead of proving stuff to you, which they stand nothing to gain from?
Well, in this case they’re busy writing articles trying to convince us, instead of proving stuff to us.
The greatest proof that it's a useful tool is how much people are using it. Almost everyone is using it. If you're using it, then you probably find it useful as well.
a) If “everyone is using it”, show me the numbers. Doing a developer survey to see how many are using it is not hard. Ask a random sample of developers if they ⅰ use LLMs to aid their standard coding or ⅱ completely replace their coding with LLM output.
b) Even if “everyone is using it” it doesn’t mean it is useful. The usage could be adequately explained by e.g. marketing, being forced on them by management/policy, etc. Not everything with high usage is useful. I can e.g. quickly think of chewing gum (which is also used by a lot of developers), or the ANSI standard keyboard (as opposed to the ISO standard keyboard).
And the greatest proof that people are using software is that people are complaining about it.
Have you used a language model to program yet?
Yes sure, I said so in the post, and have watched others try to do so too.
You think that the only code that is valuable is code that is written by a professional SWE.
There are LOADS of people who need "a program" but aren't equipped to write code or hire an SWE, and they are empowered by this. An example: last week, I saw a PM vibe-code several different applications to demo what might get built after it gets prioritized by SWEs.
I can share my experience. I am not a software engineer; I am the "other" type of engineer (chemical engineer). I tend to use AI the same way I use Google or Stack Overflow: mostly when I get stuck on syntax and my code editor isn't helping me - I know what I am attempting to do "should" be possible but I can't get the syntax right. For example, a recent question I asked was
"Python, create an xarray with two dimensions from a pandas df"
It gave me a few lines of example code which was enough for me to figure out where I had messed up the syntax in my own code.
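For anyone curious, the snippet it returns for that kind of prompt looks roughly like this - a minimal sketch, with column names made up for illustration (xarray needs to be installed for .to_xarray() to work):

```python
import pandas as pd
import xarray as xr  # required for .to_xarray()

# Toy DataFrame with two coordinate columns and one value column.
# The column names ("time", "depth", "temperature") are placeholders.
df = pd.DataFrame({
    "time": [0, 0, 1, 1],
    "depth": [10, 20, 10, 20],
    "temperature": [5.0, 4.2, 5.1, 4.0],
})

# Promote the two coordinate columns to a MultiIndex, then convert;
# the result is a 2-D DataArray with dims ("time", "depth").
da = df.set_index(["time", "depth"])["temperature"].to_xarray()
print(da)
```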
I have seen one of my junior coworkers copy and paste entire chunks of code from chatbot conversations, and to be honest, what he has produced is underwhelming: the code is poorly structured and difficult to reason about, I have low confidence he understands what the bot has produced (and why it did things the way it did), and I don't have high confidence we'd be able to trust the accuracy of the figures this code was outputting.
Not really. I’m fine with anyone knocking stuff together, but I think people should be aware of the limitations and dangers. Writing like this does nothing to inform and is overly positive IMO.
It’d be like insisting LLMs will replace authors of novels. In some sense they could, but there are serious shortcomings, and things like agents just don’t fix them.
So we replace the task of writing tedious boilerplate with the task of reading the AI's tedious boilerplate. Which takes just as long. And leaves you with less understanding. And is more boring.
All of these people advocating for AI software dev are effectively saying they would prefer to review code instead of write it. To each their own I guess but that just sounds like torture to me.
It's because these people don't know how to write it but think they know how to review it. They ship a todo-list app in a day and then write blog posts about how they are changing the world.
I'm not familiar with the author's work. What % of their time is spent writing code?
The thought alone makes me want to hang up my (professional) keyboard and open a nursery/gardening center.
well said!
You are either a very fast producer or a very slow reader. Claude and Gemini are much faster at producing code than I am, and reviewing their code - twice over, even - still takes less time than writing it myself.
Reviewing code is often slower than writing it. You don't have to be an exceptionally fast coder or slow reviewer for that to be true.
Are you, though? Reading code is harder, potentially much harder.[1]
And I suspect the act of writing it yourself imparts some lower level knowledge you don't get by skimming the output of an AI.
[1] https://www.joelonsoftware.com/2000/05/26/reading-code-is-li...
But you definitely don't understand it nearly as well as if you wrote it. And you're the one that needs to take responsibility for adding it to your codebase.
In this thread, we pretend that the difficult and time-consuming part of a code review is all the reading you have to do.
then you are a terribly sloppy reviewer
Indeed, instead of writing code to shave a yak, we're now reviewing how the yak was (most shittily) shaved.
>So we replace the task of writing tedious boilerplate with the task of reading the AI's tedious boilerplate. Which takes just as long. And leaves you with less understanding. And is more boring.
These all sound like your projected assumptions. No, it generally does not take longer to review sizable code changes than it does to write them. This is further alleviated if the code passes tests, either existing ones or new ones created by the AI.
> Which takes just as long.
This has never once been my experience. It's definitely less fun, but it takes way less time.
and probably results in a greater net energy consumption/carbon output
>If you were trying and failing to use an LLM for code 6 months ago †, you’re not doing what most serious LLM-assisted coders are doing.
Here’s the thing from the skeptic perspective: This statement keeps getting made on a rolling basis. 6 months ago if I wasn’t using the life-changing, newest LLM at the time, I was also doing it wrong and being a luddite.
It creates a never ending treadmill of boy-who-cried-LLM. Why should I believe anything outlined in the article is transformative now when all the same vague claims about productivity increases were being made about the LLMs from 6 months ago which we now all agree are bad?
I don’t really know what would actually unseat this epistemic prior at this point for me.
In six months, I predict the author will again think the LLM products of 6 month ago (now) were actually not very useful and didn’t live up to the hype.
An exponential curve looks locally the same at all points in time. For a very long period of time, computers were always vastly better than they were a year ago, and that wasn't because the computer you'd bought the year before was junk.
Consider that what you're reacting to is a symptom of genuine, rapid progress.
This is true for any curve...
If your curve is differentiable, it is locally linear.
There's no use in talking about the curve being locally similar without the context of your window. Without the window you can't differentiate an exponential from a sigmoid from a linear function.
Let's be careful with naive approximations. We don't know which direction things are going and we definitely shouldn't assume "best case scenario"
I don't think anyone's contesting that LLMs are better now than they were previously.
A flatline also looks locally the same at all points in time.
tptacek wasn't making this argument six months ago.
LLMs get better over time. In doing so they occasionally hit points where things that didn't work start working. "Agentic" coding tools that run commands in a loop hit that point within the past six months.
If your mental model is "people say they got better every six months, therefore I'll never take them seriously because they'll say it again in six months time" you're hurting your own ability to evaluate this (and every other) technology.
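For what it's worth, the "commands in a loop" part is less magic than it sounds. Here is a minimal sketch of what an agentic coding tool does, assuming a hypothetical chat() helper that wraps whatever model API you use and returns either a shell command to run or a final answer (the names and message format are made up for illustration):

```python
import subprocess

def chat(messages):
    """Hypothetical wrapper around a model API.
    Returns a dict like {"run": "pytest -x"} or {"done": "summary of what was changed"}."""
    raise NotImplementedError

def agent(task, max_steps=20):
    messages = [{"role": "user", "content": task}]
    for _ in range(max_steps):
        reply = chat(messages)
        if "done" in reply:          # model says it has finished the task
            return reply["done"]
        # Model asked to run a command; execute it and feed the output back.
        result = subprocess.run(
            reply["run"], shell=True, capture_output=True, text=True, timeout=120
        )
        messages.append({"role": "assistant", "content": str(reply)})
        messages.append({
            "role": "user",
            "content": f"exit={result.returncode}\n{result.stdout}{result.stderr}",
        })
    return "gave up after max_steps"
```

The point isn't this particular sketch; it's that letting the model see the results of compiling, testing, and grepping the codebase is what changed the experience compared with pasting snippets into a chat window.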
> tptacek wasn't making this argument six months ago.
Yes, but other smart people were making this argument six months ago. Why should we trust the smart person we don't know now if we (looking back) shouldn't have trusted the smart person before?
Part of evaluating a claim is evaluating the source of the claim. For basically everybody, the source of these claims is always "the AI crowd", because those outside the AI space have no way of telling who is trustworthy and who isn't.
But they say "yes, it didn't work 6 months ago, but it does now", and they say this every month. They're constantly moving the goalposts.
Today it works, it didn't in the past, but it does now. Rinse and repeat.
I stopped paying attention for a few days so I'm way out of date. What is the state of the art for agentic coding now?
I've been using Cline and it can do a few of the things suggested as "agentic", but I'd have no idea how to leave it writing and then running tests in a VM and creating a PR for me to review. Or let it roam around in the file tree and create new files as needed. How does that work? Are there better tools for this? Or do I need to configure Cline in some way?
Have the models significantly improved, or have we just developed new programs that take better advantage of them?
At what point would you be impressed by a human being if you asked it to help you with a task every 6 months from birth until it was 30 years old?
If you ask different people the above question, and if you vary it based on type of task, or which human, you would get different answers. But as time goes on, more and more people would become impressed with what the human can do.
I don't know when LLMs will stop progressing, but all I know is they continue to progress at what is, to me, an astounding rate similar to a growing child's. For me personally, I had never used LLMs for anything, but since o3 and Gemini 2.5 Pro, I use them all the time for all sorts of stuff.
You may be smarter than me and still not impressed, but I'd try the latest models and play around, and if you aren't impressed yet, I'd bet money you will be within 3 years max (likely much earlier).
> At what point would you be impressed by a human being if you asked it to help you with a task every 6 months from birth until it was 30 years old?
In this context, never. Especially because the parent knows you will always ask 2+2 and can just teach the child to say “four” as their first and only word. You’ll be on to them, too.
I saw this article and thought, now's the time to try again!
Using Claude Sonnet 4, I attempted to add some better configuration to my golang project. An hour later, I was unable to get it to produce a usable configuration, apparently due to a recent v1-to-v2 config format migration. It took less time to hand-edit one based on reading the docs.
I keep getting told that this time agents are ready. Every time I decide to use them they fall flat on their face. Guess I'll try again in six months.
If you share your conversation (with the share link in Claude) I'd be happy to see if there are any tweaks I can suggest to how you prompted it.
Yes.
I made the mistake of procrastinating on one part of a project thinking "Oh, that is easily LLMable". By God, was I proven wrong. It was quite the rush before the deadline.
On the flip side, I'm happy I don't have to write the code for a matplotlib scatterplot for the 10000th time, it mostly gets the variables in the current scope that I intended to plot. But I've really not had that much success on larger tasks.
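(For context, this is the level of boilerplate I mean - a stand-in example, not my actual code, with made-up variable names:)

```python
import matplotlib.pyplot as plt
import numpy as np

# Stand-in data; in practice these are whatever variables happen to be in scope.
rng = np.random.default_rng(0)
x = rng.normal(size=200)
y = 2.0 * x + rng.normal(scale=0.5, size=200)

fig, ax = plt.subplots(figsize=(6, 4))
ax.scatter(x, y, s=12, alpha=0.6)
ax.set_xlabel("x")
ax.set_ylabel("y")
ax.set_title("y vs. x")
fig.tight_layout()
plt.show()
```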
The "information retrieval" part of the tech is beautiful though. Hallucinations are avoided only if you provide an information bank in the context in my experience. If it needs to use the search tool itself, it's not as good.
Personally, I haven't seen any improvement from the "RLd on math problems" models onward (I don't care for benchmarks). However, I agree that deepseek-r1-zero was a cool result. Pure RL (plain R1 used a few examples) automatically leading to longer responses.
A lot of the improvements suggested in this thread are related to the infra around LLMs such as tool use. These are much more well organised these days with MCP and what not, enabling you to provide it the aforementioned information bank easily. But all of it is built on top of the same fragile next-token generator we know and love.
> It took less time to hand-edit one based on reading the docs.
You can give it the docs as an "artifact" in a project - this feature has been available for almost one year now.
Or better yet, use the desktop version + a filesystem MCP server pointing to a folder containing your docs. Tell it to look at the docs and refactor as necessary. It is extremely effective at this. It might also work if you just give it a link to the docs.
You can add links to docs to LLM agents instead of letting them work blindfolded with hardcoded assumptions.
In my experience it's less about the latest generation of LLMs being better, and more about the tooling around them for integration into a programmer's workflow being waaaay better.
The article doesn't explicitly spell it out until several paragraphs later, but I think what your quoted sentence is alluding to is that Cursor, Cline et al can be pretty revolutionary in terms of removing toil from the development process.
Need to perform a gnarly refactor that's easy to describe but difficult to implement because it's spread far and wide across the codebase? Let the LLM handle it and then check its work. Stuck in dependency hell because you updated one package due to a CVE? The LLM can (often) sort that out for you. Heck, did the IDE's refactor tool fail at renaming a function again? LLM.
I remain skeptical of LLM-based development insofar as I think the enshittification will inevitably come when the Magic Money Machine breaks down. And I don't think I would hire a programmer that needs LLM assistance in order to program. But it's hard to deny that it has made me a lot more productive. At the current price it's a no-brainer to use it.
It's great when it works, but half the time IME it's so stupid that it can't even use the edit/path tools properly, even when given inputs with line numbers prepended.
(I should know since I've created half-a-dozen tools for this with gptel. Cline hasn't been any better on my codebase.)
I think they just meant it hit an inflection point. Some people were copying and pasting into ChatGPT and saying it was crap, while others were using agents that could see the context of the code and worked much, much better. It's the workflow used, not just the specific LLM.
This isn't a particularly useful filter, because it applies to many very successful technologies as well. Early automobiles generated a lot of hype and excitement, but they were not very good (unreliable, loud, and dangerous, and generally still worse than horses). They got steadily better until eventually they hit an inflection point where the skeptics were dug in repeating the same increasingly old complaints, while Henry Ford was building the Model T.
name 5 tasks which you think current AIs can't do. then go and spend 30 minutes seeing how current AIs can do on them. write it on a sticky note and put it somewhere that you'll see it.
otherwise, yes, you'll continue to be irritated by AI hype, maybe up until the point where our civilization starts going off the rails
Well, I'll try to do a sticky note here:
- they can't be aware of the latest changes in the frameworks I use, and so force me to use older features, sometimes less efficient
- they fail at doing clean DRY practices even though they are supposed to skim through the codebase much faster than me
- they bait me into nonexistent APIs, or hallucinate solutions or issues
- they cannot properly pick the context and the files to read in a mid-size app
- they suggest to download some random packages, sometimes low quality ones, or unmaintained ones
The problem with AI hype is not really about whether a particular model can - in the abstract - solve a particular programming problem. The problem with AI hype is that it is selling a future where all software development companies become entirely dependent on closed systems.
All of the state-of-the-art models are online models - you have no choice, you have to pay for a black box subscription service controlled by one of a handful of third-party gatekeepers. What used to be a cost center that was inside your company is now a cost center outside your company, and thus it is a risk to become dependent on it. Perhaps the risk is worthwhile, perhaps not, but the hype is saying that real soon now it will be impossible to not become dependent on these closed systems and still exist as a viable company.
> name 5 tasks which you think current AIs can't do.
For coding, it seems to back itself into a corner and never recover from it until I "reset" it.
AI can't write software without an expert guiding it. I cannot open a non trivial PR to postgres tonight using AI.
1. create a working (moderately complex) ghidra script without hallucinating.
Granted, I was trying to do this 6 months ago, so maybe a miracle has happened. But in the past I had very bad experiences using LLMs for niche things (i.e. things that were never mentioned on Stack Overflow).
Everyone keeps thinking AI improvement is linear. I don't know if this is correct, but my basic impression is that the current AI boost came from adding the massive computing power of graphics cards instead of being limited to the CPU and its throughput.
But for each nine of reliability you want out of LLMs, everyone's assuming the cost grows linearly. I don't think it does. I think it's polynomial at least.
As for your tasks - and maybe it's just because I'm using ChatGPT - I asked it to port sed, something with full open-source code availability, tons of examples/test cases, and a fully documented user interface, to Java as a library.
And it failed pretty spectacularly. Yeah, it got the very, very, very basic functionality of sed.
If AI can do anything, why can't I just prompt "Here is sudo access to my laptop, please do all my work for me, respond to emails, manage my household budget, and manage my meetings".
I've tried everything. I have four AI agents. They still have an accuracy rate of about 50%.
Make me a million dollars
Tell me about this specific person who isn't famous
Create a facebook clone
Recreate Windows including drivers
Create a way to transport matter like in Star Trek.
I'll see you in 6 months.
Also, professional programmers have varying needs. These people are coding in different languages, with varying complexity, domains, existing code bases and so on.
People making arguments based on sweeping generalizations to a wide audience are often going to be perceived as delusional, as their statements do not apply universally to everyone.
To me, thinking LLMs can code generally because you have success with them and then telling others they are wrong in how they use them is making a gigantic assumptive leap.
I just assume every blog post in HN starts with “As a web dev, TITLE”
> Here’s the thing from the skeptic perspective: This statement keeps getting made on a rolling basis.
Dude, just try the things out. It's just undeniable in my day-to-day life that I've been able to rely on Sonnet (first 3.7 and now 4.0) and Gemini 2.5 to absolutely crush code. I've done 3 side projects in the past 6 months that I would have been way too lazy to build without these tools. They work. Never going back.
Why can't reviews of AI be somewhere in the middle between "useless" and "the second coming"?
I tried Copilot a few months ago just to give it a shot and so I could discuss it with at least a shred of experience with the tool, and yea, it's a neat feature. I wouldn't call it a gimmick--it deserves a little more than that, but I didn't exactly cream my pants over it like a lot of people seem to be doing. It's kind of convenient, like a smart autocomplete. Will it fundamentally change how I write software? No way. But it's cool.
Bullshit. We have absolute numbers, not just vibes.
The top of the SWE-bench Verified leaderboard was at around 20% in mid-2024, i.e. AI was failing at most tasks.
Now it's at 70%.
Clearly it's objectively better at tackling typical development tasks.
And it's not like it went from 2% to 7%.
Isn't SWE-bench based on public Github issues? Wouldn't the increase in performance also be explained by continuing to train on newer scraped Github data, aka training on the test set?
The pressure for AI companies to release a new SOTA model is real, as the technology rapidly becomes commoditised. I think people have good reason to be skeptical of these benchmark results.
>We imagine artists spending their working hours pushing the limits of expression. But the median artist isn’t producing gallery pieces. They produce on brief: turning out competent illustrations and compositions for magazine covers, museum displays, motion graphics, and game assets.
One of the more eye-opening aspects of this technology is finding out how many of my peers seemingly have no understanding or respect for the concept of art.
How do you mean?
Whole libraries have been written over millennia about the importance and purpose of art, and that specific quote reduced it all down to nothing more than the creation of a product with a specific and mundane function as part of some other product. I genuinely feel bad for people with that mindset towards art.
The argument that programmers are into piracy and therefore should shut up about theft is nonsensical. Not defending piracy, but at least an artist or creator is still credited and their work is unadulterated. Piracy != plagiarism.
It's also ignoring the fact that much plagiarized code is already under permissive licenses. If Star Wars or Daft Punk were CC-BY-SA nobody would need to pirate them, and there may even be a vibrant remix culture... which is kind of the whole point of open source, is it not?
These LLMs don't respect those permissive licenses, though. Especially the GPL, but even MIT requires attribution through inclusion of a copyright notice.
I'll add it's not true for programmers with morality. For instance, people who follow Jesus Christ are commanded to obey the law and treat people right. Many think respecting copyright is both. While I oppose copyright law, I do uphold it and like supporting content creators.
Also, I think there's an argument similar to cryptocurrency companies that run like pyramid schemes. I could've made easy money doing security work for them. Yet, I felt like I'd be participating in helping them rob people or advancing their con. (Some jobs, like building assurance tools, might be OK.) Likewise, using tools built on massive, copyright infringement might be supporting or promoting that.
So, I gotta use legally-trained models or wait for legal reforms that make LLM training legal. Especially for the data sets they distribute, which is currently illegal file sharing.
[flagged]
> "For art, music, and writing? I got nothing. I’m inclined to believe the skeptics in those fields."
You've already lost me, because I view programming as an art form. I would no more use AI to generate code than I would use it to paint my canvas.
I think the rest of the article is informative. It made me want to try some things. But it's written from the perspective of a CEO thinking all his developers are just salt miners; miners go into the cave and code comes out.
I think that's actually what my hangup is. It's the old adage of programmers simply "copying and pasting from stack overflow" but taken to the extreme. It's the reduction of my art into mindless labor.
People don't pay programmers to produce great art. No one sees that "art" and no one cares. They pay programmers to get shit done.
Functional code that is easy to maintain is art (but you have to be an experienced programmer to see it). A shoddy project isn’t, but the whole company feels the pain.
Wrong.
I do pay my general contractor for the unseen quality that goes into the structure of my home. Programs should be built the same way.
Woodworking is also an art form. But most people just need furniture, fixtures, and structures. Nobody would take seriously the idea that new construction all be done with sashimono joinery in order to preserve the art form, but somehow we're meant to take seriously the idea of hand-dovetailed CRUD apps.
> Woodworking is also an art form. But most people just need furniture, fixtures, and structures. Nobody would take seriously the idea that new construction all be done with sashimono joinery in order to preserve the art form, but somehow we're meant to take seriously the idea of hand-dovetailed CRUD apps.
How many furniture makers did you talk to in forming this opinion? The metaphor does not line up with either my software or furniture experience. I work with production furniture shops that choose not to use CNCs to avoid the soul being sucked out of the work. This is not a rare stance to take, and this is not "Japanese joinery" woodworking. This is real work, balancing the means of production with optimal quality. There are all sorts of arguments about whether CNCs or using a Domino or whatever is "real" woodworking, but the idea that this choice of quality does not exist in woodworking and so we shouldn't have it in software is not my experience.
I don't think that analogy matches very well. Most software is bespoke, the domain requirements, usage aspects, and architectural trade-offs are subtly, or often non-subtly, different each time, and take different trajectories over time. It's not like you're producing the same software 10,000 times, like a piece of furniture. And AI isn't able to produce the exact same thing reproducibly anyway. A better argument would be that AI is actually approaching the craftsmanship/artisanal capabilities.
I'm not sure why you're equating "all code" with "CRUD apps". If that were the case I'd be with you. But it isn't. Glue code between APIs is indeed pointless garbage that can be automated, but that's not what I, nor many other engineers, are writing day-to-day.
That's because there's an element of mindless labour to it. It's easier to spot that, so it gets more focus.
If you find that there's an element of mindless labor to coding, then you're probably doing it wrong.
I'm sure salt miners needed to make peace with their toil and also focused on tools and techniques to be more productive; how to remove the salt most elegantly in nice clean blocks, minimize waste, reduce burden on their physical bodies.
But to their bosses their output was salt.
I'm sorry but unless you're working in open source for the pure love of the tech/craft, the output of software engineering is PROBLEM SOLVING.
That's why "build vs. buy" exists - sometimes it's better to buy a solution than buy one. That's why a valid solution to a problem sometimes is to convince a customer that their ask is wrong or unreasonable, and something simpler or easier would get them 99% of what they need with 1% of the effort.
That's our job.
Sure, for a large portion of our industry, the goal is to hoover up as much user data as cheaply as possible. Being responsible with that data isn't part of that "PROBLEM SOLVING."
> If you were trying and failing to use an LLM for code 6 months ago, you’re not doing what most serious LLM-assisted coders are doing.
This sounds like the "No true Scotsman" fallacy.
> People coding with LLMs today use agents. Agents get to poke around your codebase on their own.
That's a nonstarter for closed source, unless everything is running on-device, which I don't think it is?
> Part of being a senior developer is making less-able coders productive
Speak for yourself. It's not my job.
You can run the agents on your own infrastructure (all the way down to a Mac Mini sitting on your desk), or Microsoft, OpenAI and I'm pretty sure Anthropic can sell you an Enterprise service that guarantees a certain level of confidentiality. I work in aerospace, one of the most paranoid industries, and even we got a Copilot subscription that met our needs...
> You can run the agents on your own infrastructure (all the way down to a Mac Mini sitting on your desk)
How does that work exactly? Do you have a link?
> Microsoft, OpenAI and I'm pretty sure Anthropic can sell you an Enterprise service that guarantees a certain level of confidentiality
These companies hoovered up all of our content without notice, permission, or compensation, to train their models. I wouldn't trust them one bit. My personal opinion is that it's foolish to trust them.
> I work in aerospace, one of the most paranoid industries
Paranoid about what exactly?
Who are these friends? Are they in the room with us right now? Look, maybe my experience is atypical but I’m an AI skeptic and I know plenty of others. I’ve never heard people claim that LLMs are a fad or going to go away.
I’ve seen lots of people:
* think that conflating LLMs and “AI” produces a lot of poorly reasoned arguments
* doubt the economic narratives being built around LLM technology
* think the current rate of progress in the technology is basically flat
* think most “AI companies” resemble most crypto companies
An addendum to the last point: very few crypto skeptics deny that Bitcoin is a thing or think it’s going away, either. It’s just strawmanning.
These are 4 arguments that have literally nothing to do with what I wrote.
* I don't care what "AI" is. There's a section in the post about this. Nobody needs to update their religious beliefs because of LLMs.
* I don't care about the economic futures of Nvidia or the frontier model companies. I'm not recommending you invest in their stock.
* The article opens with "all progress in LLMs could halt and this would still apply".
* The article doesn't care about "AI companies".
I'm a crypto skeptic and I do deny Bitcoin is a long-term thing, for what it's worth. I'm a hard skeptic of cryptofinance.
As I say, those are the arguments that I actually hear people make. And most of the arguments I'm hearing people make are from people who are using LLM tech on a daily basis (not the agent stuff so much, although it sounds promising). I remain surprised you're encountering so many people quite so skeptical.
The only real dispute about the usefulness of the current technology I hear is the concern that this will leave us without devs who understand the code at a deep level.
Like you said, nobody reasonable is denying that something real is happening - the question is more about how useful, how soon, and for whom.
Not all hype is just hot air.
Case in point: WWW and the “dot com” bubble. Dot com was a bubble, but the WWW (and Internet) was a big thing that transformed all major societies. It survived the bust that followed after the boom.
Are LLMs here to stay? We can guess, but as usual, time will tell.
It's like the problem of half-full vs half-empty. We see that LLMs can handle certain tasks, but counter-examples are far from rare. So "LLM can do A" is always followed by "LLM fails at A".
The problem is that LLMs never perform consistently. It works when it works. It doesn't when it doesn't. No one knows exactly why, and no one can tell when it's gonna fail. For example, even to this day, GPT sometimes gives me wrong calculations, even when it is instructed to use a calculator for that. Who knows why it ignores the instruction, or why it can't reliably perform the addition of two integers. That really screws up the automation.
Anyways, I'm really tired of skeptic-skeptics. I hate that some people believe "half-full" is genuinely better than "half-empty". I reject that idea completely. It's more about which context you're in. If you need exactly a cup of water, it's half-empty. If you are lucky to have some water, it's half-full. If you have a clear direction you want to go, you discover a lot of deficiencies. If you're there just enjoying the moment, yeah, you can just keep enjoying the new toy.
> Our jobs are just as much in tech’s line of fire as everybody else’s have been for the last 3 decades. We’re not East Coast dockworkers; we won’t stop progress on our own.
If you really believe in the power of LLMs, then it’s time to wake up. The writing is on the wall: automation will push the workforce further into precarious jobs.
The idea that tech workers can’t stop so-called “progress” is at best a dumb self-fulfilling prophecy. Our workplaces depend on us. We have the power to put the brakes on whatever our employers are up to by organizing and striking.
Tech workers should be organizing to prepare for the profit-taking moves management has in store for us as the tech gets better and better. If LLMs really live up to their potential, it’s just going to get worse from here.
They keep bragging about how many people are going to lose their jobs and they mean us. They’re bragging about firing us! It’s foolish for us to sit idly by while we are the ones who make them their profits.
> We have the power to put the brakes on whatever our employers are up to by organizing and striking.
You have the power to do that in your own isolated environment, but it's a delaying tactic at best.
This whole thing is as inevitable as TV, smartphones, airplanes, tanks or guns. Everyone will use them, because they're so much better than anything that came before that there's no competition for the same use cases.
The point isn’t to delay the tech from arriving but to have better negotiating power when it fully arrives.
> Tech workers should be organizing to prepare for the profit-taking moves management has in store for us
I think you think this is going to help tech workers, but raising the cost of employing humans is only going to incentivize companies to adopt AI faster.
You should do the opposite, and tax AI use. Though that's probably rife with potential issues too. I think as unemployment increases we're going to have to make a bad decision somewhere... and I'm guessing taxing AI is the lesser evil.
Historically many unions were formed during times of mass automation. Unions don’t unilaterally drive up wages. They increase workers’ negotiating power. This is needed to help workers best navigate the times ahead.
The time for tough decisions is today. There is no interest from the companies that control our politicians in supporting taxes, basic income, or whatever other policy proposal people think will occur after AI. Worker leverage will only be diminished after mass layoffs. We will end up in a situation of social unrest. To pull out of that workers will be forced to organize to increase negotiating power. Sooner is better than later, if we can realize it’s in our interests to do so.
This is the cost of getting used to outrageously high salaries as compared to other engineering fields. Market forces allowed you to inflate your income beyond reason, now the same forces are taking it away. Turns out decoupling your pay from value delivered wasn't such a good idea after all.
The fact that companies pay these prices shows the money is there. The money is there because of the value these workers create.
If the tech is there dragging your feet is not going to buy you too much time.
If work can meaningfully be done by a machine, why should we be fighting against that? For jobs? If life is all about doing pointless work a machine can do, we've got problems as a species.
The problem is capitalism not LLMs. Fighting against this progress just leaves us stagnant as a species.
Perhaps AI will lead to the revolution where automation frees the average person from having to toil in a factory or a cubicle farm, instead of trying to achieve infinite growth in a closed system.
I was recently "pair-vibe-coding" with someone who's much 'smarter' than I am, certainly when it comes to coding, especially Python. He's always been in the LLM skeptic camp, and it was fascinating to see that because of his extensive Python knowledge, his prompting was actually very minimal and weak, one might even say 'lazy'. The output he got from o3 for our given task was therefore really mediocre, and had a few hallucinations (which could've been avoided if he spent a few more seconds or minutes prompting).
I, knowing far less than him, would've had a much more elaborate prompt, and o3 would've proved a lot more competent/capable. Yet with my friend, since he knows so much already, and has such a high bar, he thinks the AI should be able to do a lot more with just a few basic words in a prompt... yet, for that same reason, he (understandably) doubts the inevitable sub-par output.
That's what makes all these debates about "Why are smart people doubting LLMs??" so pointless. The smarter you are, the less help you need, so the less prompting you do, the less context the model has, the less impressive the output, and the more the smart person thinks LLMs suck. With this logic, of course the smartest people are also the biggest skeptics!
I doubt this holds true generally. The smart coders I know who are also LLM users generally develop a decent intuition for what the models are good and bad at, and how to steer them into good performance.
Then perhaps my friend has remained a skeptic for so long that he's atrophied in this regard (which OP's post touches on). Either way, most of his day job is as a CTO/manager at a startup, so he's not in the weeds coding as much anymore in the first place. I should've seen how he prompts LLMs for managerial tasks, then I'd know whether his 'prompt laziness' was systemic or influenced by his coding knowledge.
Throwing my two cents in here...I think there's a disconnect between what AI advocates want, and what everyone else wants.
The arguments against genAI tend to point out things like:
1. Its output is unreliable at best
2. That output often looks correct to an untrained eye and requires expert intervention to catch serious mistakes
3. The process automates away a task that many people rely on for income
And the response from genAI advocates tends to be dismissive...and I suspect it is, in part, because that last point is a positive for many advocates of genAI. Nobody wants to say it out loud, but when someone on Reddit or similar claims that even a 10% success rate outweighs the 90% failure rate, what they mean is most likely "A machine that works 10% of the time is better than a programmer who works 60-80% of the time because the machine is more than 6-to-8-times cheaper than the programmer".
There's also the classic line about how automation tends to create more jobs in the future than it destroys now, which itself is a source of big disconnects between pro-genAI and anti-genAI crowds--because it ignores a glaring issue: Just because there's gonna be more jobs in the future, doesn't mean I can pay rent with no job tomorrow!
"You can write an effective coding agent in a week" doesn't reassure people because it doesn't address their concerns. You can't persuade someone that genAI isn't a problem by arguing that you can easily deploy it, because part of the concern is that you can easily deploy it. Also, "you’re not doing what the AI boosters are doing" is flat-out incorrect, at least if you're looking at the same AI boosters I am--most of the people I've seen who claim to be using generated code say they're doing it with Claude, which--to my knowledge--is just an LLM, albeit a particularly advanced one. I won't pretend this is anything but anecdata, but I do engage with people who aren't in the "genAI is evil" camp, and...they use Claude for their programming assistance.
"LLMs can write a large fraction of all the tedious code you’ll ever need to write" further reinforces this disconnect. This is exactly why people think this tech is a problem.
The entire section on "But you have no idea what the code is!" falls apart the moment you consider real-world cases, such as [CVE-2025-4143](https://nvd.nist.gov/vuln/detail/cve-2025-4143), where a programmer who is a self-described expert working with Claude--who emphasizes that he checked over the results with a fine-toothed comb, and that he did this to validate his own skepticism about genAI!--missed a fundamental mistake in implementing OAuth that has been common knowledge for a long while. The author is correct in that reading other people's code is part of the job...but this is difficult enough when the thing that wrote the code can be asked about its methods, and despite advances in giving LLMs a sort of train of thought, the fact remains that LLMs are designed to output things that "look truth-y", not things that are logically consistent. (Ah, but we're not talking about LLMs, even though kentonv tells us that he just used an LLM. We're talking about agentic systems. No true AI booster would "just" use an LLM...)
I actually agree with the point about how the language can catch and point out some of the errors caused by hallucination, but...I can generate bad function signatures just fine on my own, thank you! :P In all seriousness, this addresses basically nothing about the actual point. The problem with hallucination in a setting like this isn't "the AI comes up with a function that doesn't exist", that's what I'm doing when I write code. The problem with hallucination is that sometimes that function which doesn't exist is my RSA implementation, and the AI 'helpfully' writes an RSA implementation for me, a thing that you should never fucking do because cryptography is an incredibly complex thing that's easy to fuck up and hard to audit, and you really ought to just use a library...a thing you [also shouldn't leave up to your AI.](https://www.theregister.com/2025/04/12/ai_code_suggestions_s...) You can't fix that with a language feature, aside from having a really good cryptography library built into the language itself, and as much as I'd love to have a library for literally everything I might want to do in a language...that's not really feasible.
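To make the "just use a library" point concrete, here's roughly what reaching for a vetted library looks like instead of letting anyone (human or AI) hand-roll RSA. This is only a sketch using the widely used pyca/cryptography package, not a security recommendation:

```python
# pip install cryptography
from cryptography.hazmat.primitives import hashes
from cryptography.hazmat.primitives.asymmetric import rsa, padding

# Generate a keypair and sign a message with standard, audited primitives
# instead of a hand-rolled implementation.
private_key = rsa.generate_private_key(public_exponent=65537, key_size=2048)
message = b"hello"

signature = private_key.sign(
    message,
    padding.PSS(mgf=padding.MGF1(hashes.SHA256()), salt_length=padding.PSS.MAX_LENGTH),
    hashes.SHA256(),
)

# Verification raises InvalidSignature on failure.
private_key.public_key().verify(
    signature,
    message,
    padding.PSS(mgf=padding.MGF1(hashes.SHA256()), salt_length=padding.PSS.MAX_LENGTH),
    hashes.SHA256(),
)
```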
"Does an intern cost $20/month? Because that’s what Cursor.ai costs," says the blog author, as if that's supposed to reassure me. I'm an intern. My primary job responsibility is getting better at programming so I can help with the more advanced things my employer is working on (for the record, these thoughts are my own and not those of my employer). It does not make me happy to know that Cursor.ai can replace me. This also doesn't address the problem that, frankly, large corporations aren't going to replace junior developers with these tools; they're going to replace senior developers, because senior developers cost more. Does a senior engineer cost 20 dollars a month? Because that's what Cursor.ai costs!
...and the claim that open source is just as responsible for taking jobs is baffling. "We used to pay good money for databases" is not an epic own, it is a whole other fucking problem. The people working on FOSS software are in fact very frustrated with the way large corporations use their tools without donating so much as a single red cent! This is a serious problem! You know that XKCD about the whole internet being held up by a project maintained by a single person in his free time? That's what you're complaining about! And that guy would love to be paid to write code that someone can actually fucking audit, but nobody will pay him for it, and instead of recognizing that the guy ought to be supported, you argue that this is proof that nobody else deserves to be supported. I'm trying to steelman this blogpost, I really am, but dude, you fundamentally have this point backwards.
I hope this helps others understand why this blogpost doesn't actually address any of my concerns, or the concerns of other people I know. That's kind of the best I can hope for here.
> 1. Its output is unreliable at best
> 2. That output often looks correct to an untrained eye and requires expert intervention to catch serious mistakes
The thing is this is true of humans too.
I review a lot of human code. I could easily imagine a junior engineer creating CVE-2025-4143. I've seen worse.
Would that bug have happened if I had written the code myself? Not sure, I'd like to think "no", but the point is moot anyway: I would not have personally been the one to write that code by hand. It likely would have gone to someone more junior on the team, and I would have reviewed their code, and I might have forgotten to check for this all the same.
In short, whether it's humans or AI writing the code, it was my job to have reviewed the code carefully, and unfortunately I missed it here. That's really entirely on me. (It's particularly frustrating for me as this particular bug was on my list of things to check for and somehow I didn't.)
> 3. The process automates away a task that many people rely on for income
At Cloudflare, at least, we always have 10x more stuff we want to work on than we have engineers to work on it. The number of engineers we can hire is basically dictated by revenue. If each engineer is more productive, though, then we can ship features faster, which hopefully leads to revenue growing faster. Which means we hire more engineers.
I realize this is not going to be true everywhere, but in my particular case, I'm confident saying that my use of AI did not cause any loss of income for human engineers, and likely actually increased it.
I mean, fair. It's true that humans aren't that great at writing code that can't be exploited, and the blogpost makes this point too: between a junior engineer's output and an LLM's output, the LLM does the same thing for cheaper.
I would argue that a junior engineer has a more valuable feature--the ability to ask that junior engineer questions after the fact, and ideally the ability to learn and eventually become a senior engineer--but if you're looking at just the cost of a junior engineer doing junior engineer things...yeah, no, the LLM does it more efficiently. If you assume that the goal is to write code cheaper, LLMs win.
However, I'd like to point out--again--that this isn't going to be used to replace junior engineers, it's going to be used to replace senior engineers. Senior engineers cost more than junior engineers; if you want each engineer to be more productive per-dollar (and assume, like many shareholders do, that software engineers are fungible) then the smart thing to do is replace the more costly engineer. After all, the whole point of AI is to be smart enough to automate things, right?
You and I understand that a senior engineer's job is very different from a junior engineer's job, but a stockholder doesn't--because a stockholder only needs to know how finance works to be a successful stockholder. Furthermore, the stockholder's goal is simply to make as much money as possible per quarter--partly because he can just walk out if the company starts going under, often with a bigger "severance package" than any of the engineers in the company. The incentives are lined up not only for the stockholder to not know why getting rid of senior engineers is a bad idea, but to not care. Were I in your position, I would be worried about losing my job, not because I didn't catch the issue, but because I'm the expensive senior engineer the stockholders think the AI can replace.
Aside: Honestly, I don't really blame you for getting caught out by that bug. I'm by no means an expert on anything to do with OAuth, but it looks like the kind of thing that's a nightmare to catch, because it's misbehavior under conditions that are--well, only seen in maliciously crafted requests. If it hadn't been something known about since the RFC, it would probably have taken a lot longer for someone to find it.
Could you please stop breaking the site guidelines, as you did here? https://news.ycombinator.com/newsguidelines.html
It's also interesting how spectacularly and consistently these takes ignore the environmental cost as well.
I'm a bit on the fence myself, as I think it's very harmful, but I can also see ways it can be useful. But it's absolutely mindblowing how this is nearly always left completely out of the discussion, even though our current way of living and powering things is on a timer and we still haven't addressed it as a whole.
Thanks for posting this! I was also flabbergasted and it's nice to see at least one mention of the huge environmental impact!
Do you know what the environmental footprint of a software developer is?
I’m only half joking.
the “environmental costs” argument is based upon like fifteen false assumptions. by 2030, AI will be running almost entirely on solar+wind+battery, because that is by far the cheapest option
What's the environmental cost of a human doing the same thing?
What environmental cost? The uranium is going to decay on its own anyway, we might as well use that energy for useful things.
Maybe LLMs will help lead to a breakthrough in nuclear fusion or battery research.
Degrowth is the philosophy of poverty, and a poverty of philosophy.
Yes, pretty much spot on. And if he thinks that
> they’re smarter than me. And when they get over this affectation, they’re going to make coding agents profoundly more effective than they are today.
Then I really hope he's in for a surprise.
Rage bait and little more.
your opinion on that sidebar issue.
I for one am glad not everybody writes their op-eds while cosplaying as an HR-drone writing press releases.
False dichotomy. One can at the same time write not as an HR drone, and also not as an aggressive jerk.
in re: "but the plagiarism"
As someone who thinks modern copyright law is unfit for purpose, and more specifically that software shouldn't be copyrightable at all, my main complaint about LLMs is the same as my complaint about Chinese spying and IP theft: I only care because they don't share.
I'm perfectly fine using open-weights LLMs[0] to do things LLMs are good at. But I would never trust using OpenAI or Perplexity's SaaSS models as a replacement for my own skills. Half the reason they put things behind a server is so they can spy on you[1], the other half is so they can promise a revenue stream to the people they plan to sell the company to so they can enshittify it. The only enshittification-resistant software is software people can't charge you for.
All the other things this blog post is complaining about are real reactions to real harms, even in the FOSS community. People use LLMs as skilled labor substitute and get burned, while creating a spam problem for everyone else. In fact, that's the real replacement threat for artists: diffusion models don't so much outperform human artists as much as they convince people not to pay for art.
[0] I don't give two flying fucks about the license terms.
[1] The judge in the NYT lawsuit is even ordering spying for evidentiary reasons
We are beyond the point of trying to convince naysayers.
I will simply not hire anybody who is not good at using LLMs, and I don't think I would ever work with anybody who thinks they aren't very useful. It's like working with somebody who thinks compilers are useless. Obviously wrong, not worth spending time trying to convince.
To anyone who reads this article and disagrees with the central point: You are missing the most important thing that will happen in your career. You should reevaluate because you will be unemployable in a few years.
I don't think most people with mixed feelings about LLMs (or heretic naysayers, as you put it) would want to work in a place like that, so perhaps you are doing everyone a favour!
I think this is a reasonable response. But I also think it's worth taking the parent's compiler analogy seriously as a thought experiment.
Back when I was in college in the 00s, if I had developed a preference for not using compilers in my work, I might have been able to build a career that way, but my options would have been significantly limited. And that's not because people were just jerks who were biased against compiler skeptics, or evil executives squeezing the bottom line, or whatever. It's because the kind of software most people were making at that period of time would have been untenable to create without higher level languages.
In my view, we clearly aren't at this point yet with LLM-based tooling, and maybe we never will be. But it seems a lot more plausible to me that we will than it did a year or even six months ago.
It reminds me of many of the people I worked with early in my career.
They were opposed to C++ (they thought C was all you need), opposed to git (they used IBM clearcase or subversion), opposed to putting internal tools in a web browser (why not use Qt and install the tool), opposed to using python or javascript for web services (it's just a script kiddie language), opposed to sublime text/pycharm/vscode (IDEs are for people who don't know how to use a CLI).
I have encountered it over and over, and each time these people get stuck in late-career jobs making less than 1/3 of what most 23-year-old SWEs I know are making.
> You should reevaluate because you will be unemployable in a few years.
When this happens, then I'll figure out how to get good at prompting.
> Do you like fine Japanese woodworking? All hand tools and sashimono joinery? Me too. Do it on your own time.
Ughh. That "own time." Spoken like a true middle manager who thinks passion is a liability.
The author also wrote
“Steve Jobs was wrong: we do not need to carve the unseen feet in the sculpture. Nobody cares if the logic board traces are pleasingly routed.”
This tells you all you need to know about the kind of work the author values.
Next time you wonder why so much software is buggy, slow, half-considered slop, you’ll know exactly who it was made by.
As I recall, fly.io was going down an awful lot last time I peeked in.
And this goes at least part of the way towards explaining why Fly.io has been the second least reliable cloud provider I've ever used, after Azure.
"Don't be snarky."
"Please don't post shallow dismissals, especially of other people's work. A good critical comment teaches us something."
https://news.ycombinator.com/newsguidelines.html
I apologize. I genuinely did not mean snark, nor to be dismissive. It was a sincere honest-to-god "oh wow this actually explains the situation over there" mixed with an expression of frustration with their service quality.
A lot of their pages barely work and feel like first drafts, much of the UI is even labeled as a work in progress, and many tools regularly just fail to do what they're supposed to.
Deployments are routinely broken, with Twitter lit up like a light bulb about it while their status page still shows all services operational.
I deploy three completely separate applications with Fly.io and have regular unscheduled downtime.
I stick with them despite all this because they're cheap and easily scale regular Go apps down to zero which is wonderful for apps that see intermittent/irregular usage patterns.
The fact that they are adamant defenders of vibe coding, going so far in the post as to say "who cares about hallucinations", genuinely explains a fair deal about their unprofessional-feeling service.
quick, the address of a burn clinic
Numerous words to make a very small point. Also, a lot of colloquial language that makes the sentences difficult to comprehend.
This reads even more like an angry teenager than my angsty high school diary. I'm not sure how many more straw men and dismissive remarks I can handle in one article.
"Please don't post shallow dismissals, especially of other people's work. A good critical comment teaches us something."
"When disagreeing, please reply to the argument instead of calling names. 'That is idiotic; 1 + 1 is 2, not 3' can be shortened to '1 + 1 is 2, not 3."
https://news.ycombinator.com/newsguidelines.html
Is there a similar rule for submissions, or are submitters exempt from adopting HN culture? "Please don't submit shallow arguments, especially criticizing other people's work?"
Because we've recently been getting a series of low-quality submissions which obviously drive low-quality discussion, but for some reason it's only the commenters who get reprimanded, as if the submissions weren't ruining the mood one by one.
(And to clarify, I haven't been warned, I'm not writing this out of spite.)
The language in this post is just terrible.
tptacek has been on HN forever.
To your first point, I think it’s just a distinction our industry hasn’t figured out yet. There are some stages in projects that some people just don’t like. Similar to how you have rough carpentry vs. finish carpentry: some like building the structure, others like the details and polish.
> Interestingly, I've noticed that a lot of the people I meet that both find coding tedious and are extremely excited by AI coding, are generally over 40.
I don't think it's the coding that they find tedious.
I think you missed the part that's not exciting. When you write your 50th webapp and once again integrate auth flows that work just slightly differently in this new app than in your older one--that's what boils the blood, not writing new business logic.
Right, I guess that's what I mean: I've side-stepped that so far. I don't generally work on anything web related and haven't had to repeatedly work on similar projects over and over again. I've maybe just misunderstood what most other developers actually work on in the industry.
Anecdatum, but I'm over 40 and do not find coding tedious and am not excited by AI coding per se.
Likewise.
I love coding. I’m not excited by AI coding but I am curious. I find it useful in quite limited ways.
I have recently started an Ansible project where Aider helped me get off the ground and expand a little bit. Then it made a mess out of it so some other day I’ll have to find the time to start over.
It was useful though, and I’ll use some of the bits created by the LLM to kickstart it in anger next time around.
but I also work on interesting and varied stuff, and not web front end/backend du jour garbage
fly.io is of course closed source, right?
so we can't look at the code and see the demonstrable engineering excellence
trust me bro, you fool, you poltroon,
Their client application (written in Go) is open source: https://github.com/superfly/flyctl
They have a few other notable open source components, I think these two are particularly interesting:
- https://github.com/superfly/corrosion
- https://github.com/superfly/litefs
(Disclosure: they sponsor some of my work.)
I am pretty exhausted with reading the cynical takes on AI lately. Like, I get it, but ugh.
Yes, the tech isn't perfect yet. Yes, it means existing industries will be displaced. Yes, it threatens to undermine the identities we have built as useful artists, programmers, data analysts, whatever. It is a scary, uncertain future, and on some days I am overcome with dread about what it means for me.
But ffs, we will figure it out. This technology, if it continues to improve at the pace it has, will open the door to us solving previously intractable problems - problems like curing cancer, endless energy and nuclear fusion, making us 100x more productive at building great housing and infrastructure. Let's get on board and make this thing a success, not fight it every step of the way because we want to preserve some basically arbitrary state of society.
"AI is Amazing" - People who barely write any code.
A piece of writing for the ages. I've found most of the LLM skeptics are either being hypocritical or just being gate-keepy (we don't want everyone to write code).
I find the AI proponents have an insane level of egocentrism
They cannot possibly imagine someone has a different use case where the AI didn't work
"I crank out shitty webapps all day, therefore every single other dev does. Everyone obviously has the same use case as me because I am the center of the universe"
THANK YOU.
I was a 3-4x programmer before. Now I’m a 9-15x programmer when wrangling LLMs.
This is a sea change and it’s already into “incredible” territory and shows no signs of slowing down.
> Think of anything you wanted to build but didn’t. You tried to home in on some first steps. If you’d been in the limerent phase of a new programming language, you’d have started writing. But you weren’t, so you put it off, for a day, a year, or your whole career.
I have been banging out little projects that I have wanted to exist for years but always had on the back burner. Write a detailed README and ask the agent to interrogate you about the missing parts of the spec, then update the README. Then have it make a TODO and start implementing. Give it another code base as a style guide.
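For what it's worth, here is a sketch of the kind of seed README I mean; the project, sections, and wording are entirely made up for illustration:

```markdown
# linkbox (hypothetical project)

## Goal
A tiny self-hosted bookmarking service: save a URL with tags, search it later.

## Constraints
- Single binary, SQLite storage, no external services.
- Match the conventions of ../existing-project for error handling, logging, and tests.

## Open questions
(agent: interrogate me about these, then update this README with the answers)
- Auth: single user with a token, or none at all?
- Import/export format for existing bookmarks?

## TODO
(agent: generate this list once the open questions are resolved, then start implementing)
```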
I’ve made more good and useful and working code in the last month than I have in the last two years.
That’s nothing, I was a 4X programmer and now I’m a 500x programmer!
I don’t just run one agent, I run all of them!
My time to close tickets is measured in minutes!
I don’t even review code, I have a different agent review it for me!
And to make sure that agent doesn’t make mistakes, I have a different agent review that agent’s work!
I can’t tell if you’re being sarcastic or not, but if you are, the real world is not far behind. I can imagine a world where a mixture of AI agents (some doing hypercritical code review) can return you tested and idiomatic PRs faster than you can describe the new architecture in issues.
I think a lot of people are unfamiliar with the (expensive) SOTA.
I'm not sure about giving specific metrics or KPIs for efficiency or performance.
It definitely feels different to develop using LLMs, especially when starting things from scratch. At this point, you can't just have the LLM do everything. Sooner or later you need to start intervening more often, and as the complexity of the project grows, so does the attention you need to give to guiding the LLM. At that point the main gains are mostly in typing and quickly looking some things up, which are still really nice gains.
I hate how the discourse around LLM-assisted programing is so polarized. It's either detractors saying it's "a fad that's useless and going nowhere, wasting billions of megawatts every year" or it's true believers calling it "the most miraculous sea change technological advancement in my lifetime" or "more important than fire and electricity[1]." There just doesn't seem to be any room in the middle.
I tried out Copilot a few months back to see what all the fuss was about and so that I could credibly engage with discussions having actually used the technology. I'd rate it as "kind of neat-o" but not earth shattering. It was like the first time I used an IDE with auto-complete. Oh, cool, nice feature. Would I pay monthly for it? No way. Would I integrate it into my development workflow if it were free? Maybe, I guess? Probably wouldn't bother unless it came literally set up for me out of the box like autocomplete does nowadays.
Don't get me wrong--it's cool technology. Well done, AI people. Is it "the 2nd most important thing to happen over the course of my career" as OP wrote? Come on, let's come down to earth a little.
1: https://www.cnbc.com/2018/02/01/google-ceo-sundar-pichai-ai-...
I don't know. I think 9-12 months ago I'd agree with you. But I feel like the last 6 months my productivity has vastly improved. Not only that, it's also brought back a little bit of passion for the field.
It's easy to come up with some good ideas for new project, but then not want to do a lot of the garbage work related to the project. I offload all that shit to the LLM now.
Seriously, the LLMs have increased my productivity 2-4x.
Copilot is a bad yardstick. The article literally addresses exactly this. It’s not just “cool technology”, that’s the point. It enables things that were previously impossible.
I spent $600 on claude via cursor last month and it was easily worth 2-3x that.
> I was a 3-4x programmer before. Now I’m a 9-15x programmer
What the fuck does this mean?
It means cranking out hello world even faster, I guess. I wonder how complex all these projects are that people are proud to have completed with the help of AI.
Nerds got taken aside and talked to about how it's not nice or cool to brag about IQ score so they invented a new artificial metric to brag about.
It depends on the value of x. I think it's safe to assume x <= 0.75, else they'd contribute negatively to their teams (happens from time to time, but let's be generous). Previously they'd be anywhere from a 0/10 to 3/10 programmer, and now they get up to 9/10 on a good day but sometimes are a net negative, as low as -2.25/10 on a bad day. I imagine that happens when tired or distracted and unable to adequately police LLM output.
It’s a riff on the “10x programmer” concept. People who haven’t worked with 10x programmers tend to not believe they exist.
I’m nowhere near that, but even unaided I’m quite a bit faster than most people I’ve hired or worked with. With LLMs my high quality output has easily tripled.
Writing code may be easier than reading it - but reading it is FASTER than writing it. And that’s what matters.
0x3 and 0x15 is the same value.