The insecure evangelism of LLM maximalists

2 days ago (lewiscampbell.tech)

This doesn't feel completely right.

Simon Willison (known for Django) has been doing a lot of LLM evangelism on his blog these days. Antirez (Redis) wrote a blog post recently with the same vibe.

I doubt they are not good programmers. They are probably better than most of us, and I doubt they feel insecure because of the LLMs. Either I'm wrong, or there's something more to this.

edit: to clarify, I'm not saying Simon and Antirez are part of the hostile LLM evangelists the article criticizes. Although the article does generalize to all LLM evangelists at least in some parts and Simon did react to this here. For these reasons, I haven't ruled him out as a target of this article, at least partly.

  • The author claims it's not just that one evangelizes it, but that they become hostile when someone responds that they don't have the same experience. I don't recall either Willison or Antirez scaring people by saying they will be left behind or that they are just afraid of becoming irrelevant. Instead they just talk about their positive experiences using it. Willison and Antirez seem content to live and let live (maybe Antirez a bit less, but they're still not mean about it).

    • My gut says that is not a property of LLM evangelists, but a property of current internet culture in general. People with strong, divisive, and engaging opinions seem to do well (by some definition of well) online.

    • I think the actual problem is that everyone tries to assert how capable or not coding agents currently are, but how useful they are depends so much on what you are trying to get them to do and also on your communication with the model. And often it's hard to tell whether you're just prompting it wrong or whether it's incapable of doing the task.

      By now we at least agree that stochastic parrots can be useful. It would be nice if the debate were less polarized now, so we could focus on what makes them work better for some and worse for others, beyond just expectations.

    • Thanks for clarifying for people.

      And yeah, as I laid out in the article (that of course, very few people actually read, even though it was short...), I really don't mind how people make code. It's those that try so hard to convince the rest of us I find very suspect.

  • You seem to be mistaking set and members here. The piece's critique is against the set (LLM Evangelists), not against specific members of the set (the ones you mentioned). One can agree with the point of the piece while still acknowledging there are good programmers who are also LLM evangelists.

    • You are right, I went a bit too quickly, so let me expand on my chain of thought a bit.

      The article is against the set of LLM evangelists who are hostile towards the skeptics.

      I 100% agree with the part that basically says fuck you to them.

      However, explaining the hostile part with there being the feeling of insecurity (which is plausible but would need evidence) is not fully convincing and it seems dangerous to accept this conclusion and stop looking for the actual reasons this quickly.

      And the fact that there are actually good programmers persuaded that LLMs help them weakens the "insecurity" argument quite a bit, at least as the only explanation.

      As someone currently pretty much hostile to LLMs, I'm quite interested in what's currently at play but I'm suspicious of claims that initially feel good but are not strongly backed.

      Like, if these hostile people were actually shills, we would want to know that, and not have closed our eyes too early because some explanation felt good, right? The same goes for any other actual reason.

    • That is the basic argument for the utility of stereotypes too. I think the author is engaged in projection. I think the hecklers are the ones revealing their insecurities, which are justified. Even if AI progress were frozen today, the market conditions are going to change in ways that are hard to predict and a little bit of programming knowledge is not going to be the massive arbitrage opportunity that it used to be.

    • Look into the Motte-and-bailey fallacy. That set seriously doesn't exist. Even the people on youtube doing "vibe coding" benchmarks mostly say it's crap. (Well they probably exist on twitter/linkedin.)

      This article just functions as flamebait for people who use LLMs to implement whole features to argue the semantics of "vibe coding". All while everyone is ignoring the writing on the wall. That we will soon have boxes going through billions of tokens every second. At that point slopcoding WILL be productive, but only if you build up the skill to differentiate yourself from the top 10% of prompters.

  • Simon just explores stuff and writes about it. Doesn't mean he uses LLMs for everything. Antirez likes to question stuff and make it better. Doesn't mean he uses LLMs for everything.

    Also their experience is not my experience. I will make my own choices.

  • I don't think it's fair to compare people who are already seemingly at the peak of their career, a place to which they got by building skill in coding. And in fact what they have now that's valuable isn't mostly skill but capital. They've built famous software that's widely used.

    Also they didn't adopt the your-career-is-ruined-if-you-don't-get-on-board tone that is sickeningly pervasive on LinkedIn. If you believe that advice and give up on being someone who understands code, you sure aren't gonna write Redis or Django.

  • I suspect for truly talented people, they just like talking to LLMs. And they're also not 100% focused on programming anymore, so the async nature of it matters more than it does to people who write code full time for a living.

  • > They are probably better than most of us

    most top engineers will have their best work locked up in their employer's private repositories

    simonw and antirez have an advantage here, and at least the former is very good at self-promotion

    • It's a tactical error to disagree with influencers in public, much less actually criticize them, since it only arouses the mob. When there's a power difference you have to cede the platform (since they already have it), and try to ask good questions.

      Questions like.. did we really even need to invoke particular influencers to discuss this issue? Why does that come up at all, and why is it the top comment? If names and argument from authority can settle issues on HN now, does it work with all credentialed authorities, or only those vocal few with certain opinions?

> And then, inevitably, comes the character evaluation, which goes something like this:

I saw a version of this yesterday where a commenter framed LLM-skepticism as a disappointing lack of "hacker" drive and ethos that should be applied to making "AI" toolchains work.

As you might guess, I disagreed: The "hacker" is not driven just by novelty in problems to solve, but in wanting to understand them on more than a surface layer. Messing with kludgy things until they somehow work is always a part of software engineering... but the motive and payoff comes from knowing how things work, and perceiving how they could work better.

What I "fear" from LLMs-in-coding is that they will provide an unlimited flow of "mess around until it works" drudgery tasks with none of the upside. The human role will be hammering at problems which don't really have a "root cause" (except in a stochastic sense) and for which there is never any permanent or clever fix.

Would we say someone is "not really an artist" just because they don't want to spend their days reviewing generated photos for extra-fingers, circling them, and hitting the "redo" button?

  • I share your fear.

    We have a hard enough time finding juniors (hell, non-juniors) that know how to program and design effectively.

    The industry jerking itself off over Leetcode practice already stunted the growth of many by having them focus on rote memorization and gaming interviews.

    With ubiquitous AI and all of these “very smart people” pushing LLMs as an alternative to coding, I fear we’re heading into an era where people don’t understand how anything works and have never been pushed to find out.

    Then again, the ability of LLMs to write boilerplate may be the reset that we need to cut out all of the people that never really had an interest in CS that have flocked to the industry over the last decade or so looking for an easy big paycheck.

    • > to cut out all of the people that never really had an interest in CS

      I had assumed most of them had either filtered out at some stage (an early one being college intro CS classes), ended up employed somewhere that didn't seem to mind their output, or perpetually circle on LinkedIn as "Lemons" for their next prey/employer.

      My gut feeling is that messy code-gen will increase their numbers rather than decrease them. LLMs make it easier to generate an illusion of constant progress, and the humans can attribute the good parts of the output to themselves, while blaming bad-parts on the LLM.

  • > What I "fear" from LLMs-in-coding is that they will provide an unlimited flow of "mess around until it works" drudgery tasks with none of the upside.

    I feel like it's very true to the hacker spirit to spend more time customizing your text editor than actually programming, so I guess this is just the natural extension.

    • Even when 100% issue-oriented (that is, spending no time on editor customizations or developing other skills and toolkits), consider the difference between:

      1. This thing at work broke. Understand why it broke, and fix it in a way which stays and has preventative power. In the rare case where the cause is extremely shallow, like a typo, at least the fix is still reliable.

      2. This thing at work broke. The LLM zigged when it should have zagged for no obvious reason. There is plausible-looking code that is wrong in a way that doesn't map to any human (mis-)understanding. Tweak it and hope for the best.

  • There’s plenty of understanding we need to get in order to learn to steer the agents precisely, rather than, as you put it, mess around until it works. Some people are actively working on it, while others make a point of looking the other way.

Hearing people on tech twitter say that LLMs always produce better code than they do by hand was pretty enlightening for me.

LLMs can produce better code for languages and domains I’m not proficient in, at a much faster rate, but damn it’s rare I look at LLM output and don’t spot something I’d do measurably better.

These things are average text generation machines. Yes you can improve the output quality by writing a good prompt that activates the right weights, getting you higher quality output. But if you’re seeing output that is consistently better than what you produce by hand, you’re probably just below average at programming. And yes, it matters sometimes. Look at the number of software bugs we’re all subjected to.

And let’s not forget that code is a liability. Utilizing code that was “cheap” to generate has a cost, which I’m sure will be the subject of much conversation in the near future.

  • > These things are average text generation machines.

    Funny... seems like about half of devs think AI writes good code, and half think it doesn't. When you consider that it is designed to replicate average output, that makes a lot of sense.

    So, as insulting as OP's idea is, it would make sense that below-average devs are getting gains by using AI, and above-average devs aren't. In theory, this situation should raise the average output quality, but only if the training corpus isn't poisoned with AI output.

    I have an anecdote that doesn't mean much on its own, but supports OP's thesis: there are two former coworkers in my linkedin feed who are heavy AI evangelists, and have drifted over the years from software engineering into senior business development roles at AI startups. Both of them are unquestionably in the top 5 worst coders I have ever worked with in 15 years, one of them having been fired for code quality and testing practices. Their coding ability, transition to less technical roles, and extremely vocal support for the power of vibe coding definitely would align with OP's uncharitable character evaluation.

    • > it would make sense that below-average devs are getting gains by using AI

      They are certainly opening more PRs. Being the gate and last safety check on the PRs is certainly driving me in the opposite direction.

  • After a certain experience level though, I think most of us get to the point of knowing when that difference in quality actually matters.

    Some seniors love to bikeshed PRs all day because they can do it better but generally that activity has zero actual value. Sometimes it matters, often it doesn't.

    Stop with the "I could do this better by hand" and ask "is it worth the extra 4 hours to do this by hand, or is this actually good enough to meet the goals?"

    • LLM generated code is technical debt. If you are still working on the codebase the next day it will bite you. It might be as simple as an inconvenient interface, a bunch of duplicated functions that could just be imported, but eventually you are going to have to pay it.

    • "actually good enough to meet the goals?"

      There's "okay for now" and then there's "this is so crap that if we set our bar this low we'll be knee deep in tech debt in a month".

      A lot of LLM output in the specific areas _I_ work in is firmly in that latter category and many times just doesn't work.

    • I think this is an opportunity for that bell curve/enlightenment meme. Of course as you get a little past junior, you often get hung up on the best way of doing things without worrying about best being the enemy of good enough. But truly senior devs know the difference. And those are the ones that by and large still think LLMs are bad at generating code where quality (reliability, sustainability, security, etc) matters. Everyone admits that LLMs are good for low stakes code.

    • Perhaps writing code by hand will be considered micro optimisation in the future.

      Just like writing assembly is today.

    • now sometimes that's 4 hours, but I've had plenty of times where I'm "racing" people using LLMs and I basically get the coding done before them. Once I debugged an issue before the robot was done `ls`-ing the codebase!

      The shape of the problem is super important in considering the results here

  • I worked for a relatively large company (around 400 of the employees there are programmers). The people who embraced LLM-generated code clearly share one trait: they are feature pushers who love to say "yes!" to management. You see, management is always right, and these programmers are always so eager to put their requirements, however incomplete, into a Copilot session and open a pull request as fast as possible.

    The worst case I remember happened a few months ago when a staff (!) engineer gave a presentation about benchmarks they had done between Java and Kotlin concurrency tools and how to write concurrent code. There was a very large and strange difference in performance favoring Kotlin that didn't make sense. When I dug into their code, it was clear everything had been generated by an LLM (lots of comments with emojis, for example) and the Java code was just wrong.

    The competent programmers I've seen there use LLMs to generate some shell scripts, small python automations or to explore ideas. Most of the time they are unimpressed by these tools.

  • I've been playing with vibe coding a lot lately and I think in most cases, the current SOTA LLMs don't produce code that I'd be satisfied with. I kind of feel like LLMs are really, really good at hacking on a messy and fragile structure, because they can "keep track of many things in their head".

    BUT

    An LLM can write a PNG decoder that works in whatever language I choose in one or a few shots. I can do that too, but it will take me longer than a minute!

    (and I might learn something about the png format that might be useful later..)

    Also, us engineers can talk about code quality all day, but does this really matter to non-engineers? Maybe objectively it does, but can we convince them that it does?

    • > Maybe objectively it does, but can we convince them that it does?

      how long would you give our current civilisation if quality of software ceased to be important for:

        - medical devices
        - aircraft
        - railway signalling systems
        - engine management systems
        - the financial system
        - electrical grid
        - water treatment
        - and every other critical system
      

      unless "AI" dies, we're going to find out

    • Yet it is rare someone ever needs to write a PNG decoder.

      In the unlikely event you did, you would be doing something quite special to not be using an off-the-shelf library. Would an LLM be able to do whatever that special thing would be?

      It's true that quality doesn't matter for code that doesn't matter. If you're writing code that isn't important, then quality can slip, and it's true an LLM is good candidate for generating that code.

    • I kinda like Theo's take on it (https://www.youtube.com/watch?v=Z9UxjmNF7b0): there's a sliding scale of how much slop should reasonably be considered acceptable and engineers are well advised to think about it more seriously. I'm less sold on the potential benefits (since some of the examples he's given are things that I would also find easy by hand), but I agree with the general principle that having the option to do things in a super-sloppy way, combined with spending time developing intuition around having that access (and what could be accomplished that way), can produce positive feedback loops.

      In short: when you produce the PNG decoder, and are satisfied with it, it's because you don't have a good reason to care about the code quality.

      > Maybe objectively it does, but can we convince them that it does?

      I strongly doubt it, and that's why articles like TFA project quite a bit of concern for the future. If non-engineers end up accepting results from a low-quality, not-quite-correct system, that's on them. If those results compromise credentials, corrupt databases etc., not so much.

    • I tried vibe coding a BMP decoder not too long ago with the rationale being “what’s simpler than BMP?”

      What I got was an absolute mess that did not work at all. Perhaps this was because, in retrospect, BMP is not actually all that simple, a fact that I discovered when I did write a BMP decoder by hand. But I spent equal time vibe coding and real coding. At the end of the real coding session, I understood BMP, which I see as a benefit unto itself. This is perhaps a bit cynical but my hot take on vibe coders is that they place little value on understanding things.
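
      To make the "BMP is not actually all that simple" point concrete, here is a minimal sketch that parses only the headers of the most common variant. The function names are hypothetical. It assumes a 14-byte file header followed by a 40-byte BITMAPINFOHEADER, and it ignores palettes, compression, bit masks, and the other header versions that a real decoder would have to handle.

```python
import struct


def parse_bmp_header(data: bytes) -> dict:
    """Parse the 14-byte file header and a 40-byte BITMAPINFOHEADER.

    Sketch only: other DIB header sizes are rejected rather than handled.
    """
    if len(data) < 54:
        raise ValueError("too short for a file header plus BITMAPINFOHEADER")

    # File header: signature, file size, two reserved fields, pixel data offset.
    signature, _file_size, _res1, _res2, pixel_offset = struct.unpack_from(
        "<2sIHHI", data, 0
    )
    if signature != b"BM":
        raise ValueError("missing 'BM' signature")

    # First fields of the info header; width and height are signed.
    header_size, width, height, _planes, bits_per_pixel, compression = (
        struct.unpack_from("<IiiHHI", data, 14)
    )
    if header_size != 40:
        raise ValueError("only the 40-byte BITMAPINFOHEADER is handled here")

    # Each pixel row is padded to a multiple of 4 bytes.
    row_stride = ((width * bits_per_pixel + 31) // 32) * 4

    return {
        "width": width,
        "height": abs(height),
        "top_down": height < 0,  # negative height means rows are stored top-down
        "bits_per_pixel": bits_per_pixel,
        "compression": compression,
        "pixel_offset": pixel_offset,
        "row_stride": row_stride,
    }


def build_sample_bmp() -> bytes:
    """Build a tiny all-black 2x2, 24-bit, uncompressed BMP for demonstration."""
    width, height, bpp = 2, 2, 24
    stride = ((width * bpp + 31) // 32) * 4
    pixels = bytes(stride * height)
    file_header = struct.pack("<2sIHHI", b"BM", 54 + len(pixels), 0, 0, 54)
    info_header = struct.pack(
        "<IiiHHIIiiII", 40, width, height, 1, bpp, 0, len(pixels), 2835, 2835, 0, 0
    )
    return file_header + info_header + pixels


if __name__ == "__main__":
    print(parse_bmp_header(build_sample_bmp()))
```

      Even at this stage a decoder has to deal with rows padded to 4-byte boundaries and a signed height whose sign selects bottom-up or top-down row order.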

  • Claude Code is way better than I am at rummaging through Git history, handling merge conflicts, renaming things, writing SQL queries whose syntax I always forget (window functions and the like). But yeah. If I give it a big, non-specific task, it generates a lot of mediocre (or worse) code.

    • That's funny, those are all the things I don't trust it to do. I actually use it the other way around: give it a big non-specific task, see if it works, specify better, retry, throw away 60% - 90% of the generated code, fix bugs in a bunch of places, and out comes an implemented feature.

  • As someone who: 1.) Has a brain that is not wired to think like a computer and write code. 2.) Falls asleep at the keyboard while writing code for more than an hour or two. 3.) Has a lot of passion for sticking with an idea and making it happen, even if that means writing code and knowing the code is crap.

    So, in short, LLMs write better code than I do. I'm not alone.

  • LLMs are not "average text generation machines" once they have context. LLMs learn a distribution.

    The moment you start the prompt with "You are an interactive CLI tool that helps users with software engineering at the level of a veteran expert" you have biased the LLM such that the tokens it produces are from a very non-average part of the distribution it's modeling.

    • True, but nuanced. The model does not react to "you are an experienced programmer" kinds of prompts. It reacts to being given relevant information that needs to be reflected in the output.

      See examples in https://arxiv.org/abs/2305.14688; They certainly do say things like "You are a physicist specialized in atomic structure ...", but the important point is that the rest of the "expert persona" prompt _calls attention to key details_ that improves the response. The hint about electromagnetic forces in the expert persona prompt is what tipped off the model to mention it in the output.

      Bringing attention to key details is what makes this work. A great tip for anyone who wants to micromanage code with an LLM is to include precise details about what they wish to micromanage: say "store it in a hash map keyed by unsigned integers" instead of letting the model decide which data structure to use.
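
      A minimal sketch of that tip, assuming a hypothetical helper named `build_prompt` (no model API is called here; the helper only assembles the text you would send):

```python
def build_prompt(task: str, constraints: list[str]) -> str:
    """Join a task description with explicit implementation details.

    Hypothetical helper: it spells out the details you want to control
    instead of leaving them for the model to decide.
    """
    lines = [task.strip()]
    if constraints:
        lines.append("Requirements:")
        lines.extend(f"- {c.strip()}" for c in constraints)
    return "\n".join(lines)


if __name__ == "__main__":
    print(
        build_prompt(
            "Cache user sessions in memory.",
            ["store it in a hash map keyed by unsigned integers"],
        )
    )
```

      The point is only that the detail you care about appears verbatim in the prompt; how much it helps will depend on the model and the task.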

  • It'd be rather surprising if you could train an AI on a bunch of average code, and somehow get code that's always above average. Where did the improvement come from?

    We should feed the output code back in to get even better code.

    • AI generally can improve through reinforcement learning, but this requires it to be able to compare its output to some form of metric. There aren't a lot of people I'd trust to RLHF for code quality, and anything more automated than that is destined to collapse due to Goodhart's Law.

  • In my view they’re great for rough drafts, iterating on ideas, throwaway code, and pushing into areas I haven’t become proficient in yet. I think in a lot of cases they write ok enough tests.

  • How are you judging that you'd write "better" code? More maintainable? More efficient? Does it produce bugs in the underlying code it's generating? Genuinely curious where you see the current gaps.

    • For me the biggest gaps in LLM code are:

      - it adds superfluous logic that is assumed but isn’t necessary

      - as a result the code is more complex, verbose, harder to follow

      - it doesn’t quite match the domain because it makes a bunch of assumptions that aren’t true in this particular domain

      They’re things that can often be missed in a first pass look at the code but end up adding a lot of accidental complexity that bites you later.

      When reading an unfamiliar code base we tend to assume that a certain bit of logic is there for a good reason, and that helps you understand what the system is trying to do. With generative codebases we can’t really assume that anymore unless the code has been thoroughly audited/reviewed/rewritten, at which point I find it’s easier to just write the code myself.

  • This has been my experience as well.

    It's let me apply my general knowledge across domains, and do things in tech stacks or languages I don't know well. But that has also cost me hours debugging a solution I don't quite understand.

    When working in my core stack though it's a nice force multiplier for routine changes.

    • >When working in my core stack though it's a nice force multiplier for routine changes.

      what's your core stack?

  • > Hearing people on tech twitter say that LLMs always produce better code than they do by hand was pretty enlightening for me.

    That's hilarious. LLM code is always very bad. Its only merit is that it occasionally works.

    > LLMs can produce better code for languages and domains I’m not proficient in.

    I am sure that's not true.

  • > if you’re seeing output that is consistently better than what you produce by hand, you’re probably just below average at programming

    even though this statement does not mathematically / statistically make sense - vast majority of SWEs are “below average.” therein lies the crux of this debate. I’ve been coding since the 90’s and:

    - LLM output is better than mine from the 90’s

    - LLM output is better than mine from early 2000’s

    - LLM output is worse than any of mine from 2010 onward

    - LLM output (in the right hands) is better than 90% of human-written code I have seen (and I’ve seen a lot)
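
    One way to see how a majority can sit "below average": in a skewed distribution the mean is pulled up by a few large values, while the median still splits the group in half. A minimal sketch with made-up scores (the numbers and function names are illustrative assumptions, not data):

```python
from statistics import mean, median


def share_below_mean(values):
    """Fraction of values strictly below the arithmetic mean."""
    avg = mean(values)
    return sum(v < avg for v in values) / len(values)


def share_below_median(values):
    """Fraction of values strictly below the median."""
    mid = median(values)
    return sum(v < mid for v in values) / len(values)


# Hypothetical skill scores: many modest values and one very large outlier.
scores = [1, 2, 2, 3, 3, 3, 4, 4, 5, 50]

if __name__ == "__main__":
    print("mean:", mean(scores))      # 7.7
    print("median:", median(scores))  # 3.0
    print("share below mean:", share_below_mean(scores))      # 0.9
    print("share below median:", share_below_median(scores))  # 0.3
```

    With these numbers nine out of ten scores fall below the mean, so "most people are below average" is possible whenever "average" means the mean and the distribution has a long upper tail.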

  • I saw someone refer to it as future digital asbestos.

    • Asbestos is a miracle material with a serious and hard to really avoid downside.

      It's so good that we are genuinely left with crappy options to replace it, and people have died in fires that could have been saved with the right application of asbestos.

      Current AI hype is closer to the Radium craze back during the discovery of radioactivity. Yes it's a neat new thing that will have some interesting uses. No don't put it in everything and especially not in your food what are you doing oh my god!

  • The most prolific coders are also more competent than average. Their proliferations are what have trained these models. These models are trained on incredibly successful projects written by masters of their fields. This is usually where I find the most pushback: the most competent SWEs see it as theft, and also as useless to them, since they have already spent years honing skills to work relentlessly and efficiently towards solutions -- sometimes at great expense.

    • I'd assume most of the code visible on the web leans amateur. A huge portion of github repos seem to be from students these days. You'll see GitHub's Education page listing "5 million students" (https://github.com/education), which I assume is an under-estimate, as that's only the formal program.

    • > The most prolific coders are also more competent than average

      This is absolutely not true lol, as anyone who's worked with a fabled 10X engineer will tell you. It's like saying the best civil engineer is the one that builds the most bridges.

      The best code looks real boring.

  • If firing up old coal plants and skyrocketing RAM prices and $5000 consumer GPUs and violating millions of developers' copyrights and occasionally coaxing someone into killing themselves is the cost of Brian Who Got Peter Principled Into Middle Management getting to enjoy programming again instead of blaming his kids for why he watches football and drinks all weekend instead of cultivating a hobby, I guess we have no choice but to oblige him his little treat.

The anti-LLM side seems much more insecure. Pro-LLM influencers are sometimes corny, but it's sort of like any other influencer, they are incentivized to make everything sound exciting to get clicks. Nobody was complaining about 3d printer influencers raving about how printing replacement dishwasher parts was going to change everything.

LLMs have also become kind of a political issue, except only the "anti" side even really cares about it. Given that using and prompting them is very much a garbage in/garbage out scenario, people let their social and political biases cloud their usage, and instead of helping it succeed, they try to collect "gotcha" moments, which doesn't reflect the workflow of someone using an LLM productively.

  • re: 3d printer hype: nah, there were people out there saying "umm, yeah so FDM is great for rapid prototyping and some small parts, but kinda terrible for lots of applications. Being isotropic and lacking creep is way more important than you guys, printing little boats that don't even float, understand yet."

    You just couldn't hear them over all the hype.

    On the other hand, that hype did help bring a lot of investment, which begat things like desktop resin printers and prosumer-level SLS and other advanced tech that actually can replace much existing manufacturing tech.

    Current "AI"/LLM situation sure seems like a bubble, but in the end transformer-based ML is obviously insanely powerful for many domains and it's going to change and create industries. But just like there isn't a 3d printer in every house, and it doesn't seem like that will be a thing anytime soon, it doesn't seem likely that we're all going to just be prompt engineers.

  • > The anti-LLM side seems much more insecure.

    I can't agree with that.

    The pro-LLM side is relentlessly pushing everyone into using it, even when very few people want it (e.g. WhatsApp not even allowing people to turn it off). That smacks of insecurity to me.

    Anti-LLM people are happy to continue coding/writing/drawing without the new digital tools - that sounds like someone who knows what they're doing and feels secure about their abilities.

    > LLMs have also become kind of a political issue, except only the "anti" side even really cares about it

    The stock market would disagree to the tune of a large sum of money. What the "anti" side care most about is being forced to pay for (typically hidden costs such as a percentage of your pension investments) and often forced to use "AI" despite knowing that the service is substantially worse than getting people to do the same job (e.g. "AI" support service agents)

    • You cannot post anything positive about your experience with LLMs without being dogpiled by the anti-ai zealots. They are as predictable as the "But What About China!" crowd any time US emissions come up in a discussion. They are the "Nothing will work but nuclear!" folks who crop up in every single conversation about renewable energy. Their comments are exactly the same every single time regardless of the context and add nothing at all to the conversation.

"LLM evangelists - are you willing to admit that you just might not be that good at programming computers?"

No.

  • I do wonder if C programmers ever asked that of Python devs back in the day.

    • Back in the day, Python devs commonly were C programmers.

      Someone had to do the implementation, after all. And the C API was (and still is) kind of a big deal.

      There's a reason the standard library is full of direct ports of C libraries with unsightly, highly un-Pythonic names and APIs. (Of course, it's also full of direct ports of Java libraries with unsightly, highly un-Pythonic architecture.)

    • This is still a thing today. There have been multiple times I oneshot some project that leadership had been waiting on some team forever to finish, and 90% of it was them refusing to touch a "noob" lang like Python or JS.

  • Came here to comment on this line: it completely changes the tone of the article. It's fairly reasonable and neutral until we get here, upon which the antagonism is jarringly clear.

    In fact I would posit this is the central crux of the post: OP does not believe those LLM evangelists were ever good programmers.

    As others have already noted[1], many well-known excellent programmers - including yourself! and now even Linus! - would beg to differ.

    [1] https://news.ycombinator.com/item?id=46610143

  • You can stop questioning yourself early whether you are a good programmer just by realizing you code in python.

How much longer until we get to just... let the results speak for themselves and stop relitigating an open question with no clear answer.

We're well past ad nauseam now. Let's talk about anything else.

The tech industry seems to attract people who feel personally attacked when someone else makes different choices than they do.

"Why are you using Go? Rust is best! You should be using that!" "Don't use AWS CDK, use Terraform! Don't you know anything?"

  • Keep in mind that anyone posting on a forum (just like this one), or even more so anyone blogging about something, already reflects a huge selection bias toward people who believe that their opinion needs to be shared.

    You don't hear from all the people who don't feel that others must know their opinion.

    Lurkers always outweigh posters.

    Don't ever make the mistake of believing that a sample of posts is a sample of people.

  • Humans are tribal, which has both benefits and costs.

    In technology, the historical benefits of evangelizing your favorite technology might just be that it becomes more popular and better supported.

    Even though LLMs may or may not follow the same path, if you can get your fellow man on-board, then you'll have a shared frame of reference, and someone to talk through different usage scenarios.

  • You need to be fairly smart to be in tech. People who grew up smart and were told they were tend to view it as part of their self-worth. If someone disagrees with this person later on, their self-worth has been attacked, so of course they are going to lash out.

    The worst thing you can say to a dev is they are wrong. Most will do everything in their power to prove otherwise, even on the dumbest of topics.

Using LLM for programming work is a skill that takes practice and sometimes a little luck, just like Google searching for help with programming work.

It takes practice to figure out which things the LLM handles well and how best to present your problems to the LLM to get a good result.

It takes luck that the specific things you're trying to get results for are things the LLM actually can handle well.

5 anti-AI posts on the home page of Hacker News…yeah, plenty of insecure evangelism amongst the skeptics, too.

  • Is there enough new blood on HN? For me it was the best place, my favorite website, when I was entering the startup scene. Loved it. I don't think a lot of young founders I know ever go here...

    • > young founders

      While this place has always been attractive to people building startups, back in the day (my original account is from 2009) "Hacker" News was much more about Hackers. Most people posting here had read "On Lisp", respected Paul Graham as a programmer and were enthusiastic about programming and solving problems above all else.

      I'm honestly curious how many people that visit HN today even know what a "y combinator" is, and I have a pretty reasonable guess as to how many have implemented it for fun (though probably the applicative order version).

    • They are on LinkedIn now posting hustle bro posts about how they wake up at 4 am for yoga while Claude Code generates everything for them.
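
For readers who have not run into it, the applicative-order Y combinator mentioned a few replies up (often called the Z combinator) is a way to get recursion out of anonymous functions alone. Here is a minimal Python sketch; the names `Z` and `factorial` are just illustrative choices, and this is one common formulation rather than the only one.

```python
# Applicative-order Y combinator (Z combinator).
# The inner "lambda v: x(x)(v)" delays the self-application so that
# an eagerly evaluated language like Python does not recurse forever.
Z = lambda f: (lambda x: f(lambda v: x(x)(v)))(lambda x: f(lambda v: x(x)(v)))

# A recursive function written without referring to its own name:
# "self" is supplied by Z.
factorial = Z(lambda self: lambda n: 1 if n == 0 else n * self(n - 1))

print(factorial(5))  # 120
```

The plain (normal-order) Y combinator would loop forever in Python because arguments are evaluated before the call, which is why the eta-expanded version is the one people usually implement for fun.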

"LLM evangelists - are you willing to admit that you just might not be that good at programming computers?"

The people who were the best at something aren't necessarily the best at a new paradigm. Unlearning some principles and learning new ones might be a painful exercise for some masters.

Military history has shown that the masters of the new wave are not necessarily the masters of the previous wave. We see the rise and downfall of several civilizations, from Roman to Greek, that were too sure of their old methods, military equipment, and strategy.

I think the author slips into the same pattern he’s criticizing. He says LLM fans shouldn’t label skeptics as “afraid”, then he turns around and labels the fans as “insecure” or “not very good at programming.” It’s the same move: guessing what’s going on in someone’s head instead of sticking to what actually happened and what the tools can or can’t do. The simpler truth is LLMs are great in some cases and painful in others. They shine on boilerplate and tests. They struggle when the domain is unusual or the requirements are fuzzy; when mistakes are made, you pay a big babysitting tax.

Instead of psychoanalyzing each other, people should share concrete examples.

I have been through this before (wherever/whenever the money seems to flow): databases are bad, you should use Couchbase, etc. I was a db expert, the people advocating weren't, but they were very loud. The many, many evangelistic web development alternatives that come and go, all very loud. Now the latest: LLMs. Like Couchbase et al., they have their place, but the evangelists are not having any of it.

I work a lot with doctors (writing software for them), and I am very envious of their system of specialisation, e.g. this dude is such and such a specialist, he knows about it, listen to him. In IT, it seems anyone who talks the loudest has a podium, and separating the wheat from the chaff is difficult. One day we will have a system of qualifications, I hope, but it seems a long way off.

I feel no strong need to convince others. I've been seeing major productivity boosts for myself and others since Sonnet 3.5. Maybe for certain kinds of projects and use cases it's less good, maybe they're not using it well; I dunno. I do think a lot of these people probably will be left behind if they don't adopt it within the next 3 years, but that's not really my problem.

  • What's there to be left behind on? That's like arguing people who stick to driving cars with manual transmissions are going to get left behind when buses "inevitably get good."

    The whole point of the AI coding thing is that it lets inexperienced people create software. What skill moat are you building that a skilled software developer won't be able to pick up in 20 minutes?

    • (Don't take this as an attack on you personally, just on the sentiment you are giving in your comment)

      The attitude you present here has become my litmus test for who has actually given the agents a thorough shake rather than just a cursory glance. These agents are tools, not magic (even though they appear to be magic when they are really humming). They require operator skill to corral them. They need tweaking and iteration, often from people already skilled in the kinds of software they are trying to write. It's only then that you get the true potential, and it's only then you realize just how much more productive you can be _because of them_, not _in spite of them_. The tools are imperfect, and there are a lot of rough edges that a skilled operator can sand down to become huge boons for themselves rather than getting cut and saying "the tools suck".

      It's very much like Google. Anyone can google anything. But at a certain point, you need to understand _how to google_ to get good results rather than just slapping in any random phrase and hoping the top 3 results are magically going to answer you. And sometimes you need to combine the information from multiple results to get what you are going for.

    • Your analogy is the wrong way around :)

      Everyone now is driving automatic, LLMs are the manual transmission in a classic car with "character".

      Yes, anyone can step into one, start it and maybe get there, but the transmission and engine will make strange noises all the way and most people just stick to the first gear because the second gear needs this weird wiggle and a trick with the gas pedal to engage properly.

      Using (agentic) LLMs as coding assistants is a skill that (at the moment) can't really be taught as it's not deterministic and based a lot on feels and getting the hang of different base models (gemini, sonnet/opus, gpt, GLM etc). The only way to really learn is by doing.

      Yes, anyone can start up Google Antigravity or whatever and say "build me a todo app" and you'll get one. That's just the first gear.

  • > "...these people probably will be left behind if they don't adopt it..."

    And there it is, the insecure evangelism.

    • I'm not sure you understand what insecure means. Do you think it means people aren't able to have an opinion about other people's skills, values, attitudes, and behaviors, and what those might ultimately result in?

To be fair to both sides, it really is hard to tell if we're in the world of

"you'll be left behind if you don't learn crypto" with crypto

or

"you'll be left behind if you don't learn how to drive" with cars

One of those statements is made in good faith, and the other is made out of insecurity. But we'll probably only really be able to tell looking backwards.

  • I don't really see the point in worrying about prompting, agents, MCPs, skills and all this as a skill to learn. They're trivial if you're already a developer, and if it gets good enough that you don't need to know software engineering, there's not going to be anything to learn. Setting it up is a subset of software engineering, so it will be able to do that itself once it can solve problems reliably without someone who understands what's going on to check it.

  • As an aside, both those statements were wrong. People who learned to drive well after cars were widely adopted were at no particular disadvantage when and if they decided to adopt the technology. You can see this is true by noting that at this point, no one alive learned to drive when cars first came out.

> But doing "prompt-driven development" or "vibe coding" with an Agentic LLM was an incredibly disapointing experience for me. It required an immense amount of baby sitting, for small code changes, made slowly, which were often wrong. All the while I sat there feeling dumber and dumber, as my tokens drained away.

Yeah, I find they are useful for large sweeping changes, introducing new features and stuff, mostly because they write a lot of the boilerplate, granted with some errors. But for small fiddly changes they suck; you will have a much easier time doing these changes yourself.

I think it goes further than this. Some people - some developers, even - do not _like_ programming computers. In fact, many hate it. Those people welcome the LLM agent stuff because it delivers the end product without going through the necessary pain (from their pov) of programming.

  • While I believe this may be true, there are also just people that get more reward from building than from the act of writing code. That doesn't mean they hate writing code, but that the building comes first. I count myself in that camp.

    If I can build better/faster with reasonably equal quality, I'll trade off the joy of programming for the joy of more building, of more high level problem solving and thinking, etc.

    I've also seen the opposite: those that derive more joy from the programming and the cool engineering than from the product. And you see the opposite behavior from them, of course--such as selecting a solution that's cool and novel to build, rather than the simple, boring, but better alternative.

    I often find this type of engineer rather frustrating to work with, and coincidentally, they seem to be the most anti-AI type I've encountered.

    • Yeah, that's pretty much what I think too. I'm much more of the latter type you mention, but I think I have enough acumen to be practical most times.

      It's always been the case that engineers come in many flavors, some more and some less business-inclined. The difference with AI imo is that it will put (or already is putting) its trillion-dollar finger on the scale, such that there is less patience and space for people like me, and more for people like you.

  • I, for one, love programming (and passionately hate AI).

    Note I also think AI is bad for philosophical/ethical reasons.

"I find LLMs useful as a sort of digital clerk - searching the web for me, finding documentation, looking up algorithms. I even find them useful in a limited coding capacity; with a small context and clear guidelines."

I am curious why the author doesn't think this saves them time (i.e. makes them more productive).

I never had terribly high output as a programmer. I certainly think LLMs have helped increase the amount of code that I can write, net total, in a year. Not to superhuman levels or even super-me levels, just me++.

But, I think the total time spent producing code has gone down to a fraction and has allowed me more time to spend thinking about what my code is meant to solve.

I wonder about two things: 1. maybe added productivity isn't going to be found in total code produced, because there is a limit on how much useful code can be produced that is based on external factors 2. do some devs look at the output of an LLM and "get the ick" because they didn't write it and LLM-code is often more verbose and "ugly", even though it may work? (this is a total supposition and not an accusation in any way. i also understand that poorly thought out, overly verbose code comes with problems over time)

  • > "get the ick"

    > even though it may work?

    The first of those is about taste, and it's real, and engineers with bad taste write unstable buggy systems.

    The second of those is about priority. If all you want is functional code, any old thing will do. That's what I do for one-off scripts. But if you plan to support the code at 2am when exposed to production requests on the internet, you need to understand it, which is about legibility and coherence.

    I hope you do have taste, and I hope you value more than simple "it works" tests. But it might be worth looking there for why some struggle with LLM output.

    For what it's worth, I use coding agents all the time, but almost never accept their output verbatim outside of boilerplate code.

  • See for some, "over time" means "the next guy's problem". That is true before LLMs of course. And that's even the prevailing school of thought in most organizations. And to some extent it is correct to accept some tech debt because otherwise you'll never get anything done.

    For those who have been around a while, dealing with the "over time" of yesteryear is a daily occurrence. So naturally they are more averse to it. And LLMs seem to dramatically shorten the duration of "over time".

  • It seems you treat LoC as a measure of productivity. This would answer your question as to why the author does not find it makes them more productive. If total output increases, but quality decreases (which in terms of code means more bugs) then has productivity increased or has it stayed the same?

    To answer my own question, if you can pump out features faster but turn around and spend more time on bugs than you did previously then your productivity is likely net neutral.

    There is a reason LoC as a measure of productivity has been shunned from the industry for many, many years.

    • I didn't mean to imply LoC as a measurement of productivity. What I really mean is more "amount of useful code produced to a level the human-using-the-llm determines to be useful".

      To try and give an example, say that you want to make a module that transforms some data and you ask the LLM to do it. It generates a module with tons of single-layer if-else branches with a huge LoC. Maybe one human dev looks at it and says, "great this solves my problem and the LoC and verbosity isn't an issue even though it is ugly". Maybe the second looks at it and says, "there's definitely some abstraction I can find to make this easier to understand and build on top of."

      Depending on the scenario and context, either of them could be correct.

    • LoC is a terrible metric for comparing productivity of different developers, even before you get to Goodhart's Law.

      OTOH, for a given developer to implement a given feature in a given system, at the end of the day, some amount of code has to be written.

      If a particular developer finds that AI lets him write code comparable to what he would have written, in lieu of the code he would have written, but faster than he can do it alone, then looking at lines written might actually be meaningful, just in that context.

  • I feel like it makes me more productive, but I am not even sure it does even with my light usage. How do we even measure it?

    • I also feel like it makes me more productive but measuring software engineering productivity is famously difficult. If there was an easy way to measure it, managers at bigco would have employed it with abandon years ago.

> It required an immense amount of baby sitting, for small code changes, made slowly, which were often wrong.

Can’t speak for others but that’s not what I’d understand (or do) as vibecoding. If you’re babysitting it every inch of the way then yeah sure I can see how it might not be productive relative to doing it yourself.

If you’re constantly fighting the LLM because you have a very specific notion of what each line should look like it won’t be a good time.

Better to spec out some assumptions, some desired outcomes, tech to be used, maybe the core data structure, ask the LLM what else it needs to connect the dots, add that and then let it go.

I like OP's representation, but I feel like a lot of people aren't saying 'LLMs are the bomb dot com _right now_' (though some are), but rather that the trend is evident: these things will keep getting better, and the writing is on the wall.

Personally I think the rate of improvement will plateau: in my experience software inevitably becomes less about tech and more about the interpersonal human soup of negotiating requirements, needs, contradiction, and feedback loops, a lot of which is not signal accessible to a text-in-text-out engine.

  • It "becomes"? In a lot of areas, particularly enterprise, business stuff, it had been mostly about all of these things for decades.

    • "becomes" as in the locus of focus as my career progresses. Vibes-wise, i think we agree / are saying the same thing.

This is also bad evangelism, but on the opposite side.

Just because LLMs don't work for you outside of vibe-coding, doesn't mean it's the same for everyone.

> LLM evangelists - are you willing to admit that you just might not be that good at programming computers?

Productive usage of LLMs in large-scale projects becomes viable with excellent engineering (tests, patterns, documentation, clean code), so perhaps that question should also be asked of yourself.

I tend to share the sentiment of the author.

I think that coding assistants tend to be quite good as long as what you ask is close to the training data.

With anything novel, the quality falls off rapidly.

So, if you are like Antirez and ask for a linenoise improvement that has already been seen many times by the LLM at training time, the result will seem magical, but that is largely an illusion, IMO.

  • But how many times a week are you truly doing something novel? A thing that nobody in the world (or the LLM training data) has done before?

I generally agree with this article and appreciate the articulation of some points that were in my head but not well formed.

That said, practically speaking, you can just ignore the evangelists and even use it as a signal for people you should ignore.

The unfortunate thing is how it's affecting our careers, since it's apparently so alluring for Management types to buy the hype and start trying to micro manage devs.

> You see a lot of accomplished, prominent developers claiming they are more productive without it.

Demonstrably impossible if you’re actually properly trying to use them in non-esoteric domains. I challenge anyone to very honestly showcase a non-esoteric domain in which opus4.5 does not make even the most experienced developer more productive.

  • How would one set this sort of test up? I surely have example domains where LLMs routinely do poorly (for example, custom bazel rules and workspaces), but what would constitute a "showcase" here?

    • To change my mind I’ll be satisfied with a thorough description of the domain and ideally a theory on why it does poorly in that domain. But we’re not talking LLMs here, we’re talking opus4.5 specifically.

I feel any LLM discussion that doesn't mention concrete tooling choices is unproductive. Tooling is evolving so fast that many statements that were accurate two months ago are simply wrong today.

  • It's a lot different to "vibe code" by copypasting crap from a browser window to an editor vs using an agentic LLM with full access to the source and tools to search for documentation, run scripts etc.

    And one major thing is language.

    Some languages (Rust, React) are so complex and nuanced that LLMs struggle with them - as do humans. Agentic LLMs will eventually solve the problem you've given them but the solution might be a bit wonky.

    Compare that to LLMs writing Python or Go. With Go there's just one way to write a loop, it can't get confused with that. The way to write and format the language has been exactly the same since the beginning.

    Same with Python, it's pretty lenient on how you write it (objects vs functional) but there are well-established standards on how to do things and it's an old language (34 years btw). Most of Python 2.x is still valid Python 3.

I remember a similarly aggressive evangelism about self-driving cars several years ago. I suppose it's not so pleasant, when you feel like you've seen a prophetic glimpse of a brilliant future, to deal with skeptics who don't understand your vision and refuse to give your predictions the credit they deserve.

Of course we need a few people to get wildly overexcited about new possibilities, so they will go make all the early mistakes which show the rest of us what the new thing can and cannot actually do; likewise, we need most of us to feel skeptical and stick to what already works, so we don't all run off a cliff together by mistake.

> That doing this is it's own skill and I have not spent enough time with it.

Yeah, this.

I sucked (still suck?) at it too. I spent countless hours correcting them and throwing away hours of "work" they made, and even had them nuke the workplace a couple of times (thankfully, they were siloed). I still feel like I'm wasting too much time way too often, and I'm constantly trying new things.

But I always thought I can learn and improve on this tool and its associated ecosystem as much as the other programming tools and languages and frameworks I learned over the years.

"I am still willing to admit I am wrong. That I'm not holding the GPS properly. That navigating with real-time satellite data is its own skill and I have not spent enough time with it. I have changed how I get around before, and I'm sure I will do so again.

Map-reading evangelists, are you willing to admit that you just might not be that good at driving a car? Maybe you once were. Maybe you never were."

I just want some externally verifiable numbers. If AI is a 10x improvement, we should be seeing new operating systems. If it’s 5x we should see new game engines. If it’s 2x we should see massive amounts of new features for popular open source projects.

If it’s less than that, then it’s more like adding syntax highlighting or moving from Java to Ruby on Rails. Both of those were nice, but people weren’t breathlessly shouting about being left behind.

  • I don't think that will happen, at least not now. I don't even think AI would be suitable for creating an OS or a game engine. What seems to be increasing is the number of SaaS products asking for subscriptions, and most of the productivity, if any, is being used to add AI features to the apps.

Until some days / weeks ago, LLMs for coding were more hype than actual code production. That is gone now. They clearly leveled up, things will not be the same anymore. And of course this is not just for coding, this is just the beginning. A month ago it really seemed that the models were hitting a complexity wall and that the architecture would need to be improved. Not anymore.

  • I have seen people say something along these lines what feels like every month for the past year.

    • It's different this time. You can see that at the same time, they are finally, definitely solving hard mathematical problems. They passed the phase of just being like good search engines to being actual generators of new data from their generalizations. I can give you a simple example. Any code they generated before brought frustration and they would loop with feedback. Now they actually produce "human level" code.

Whatever standard of code quality you want, if it hasn't reached it yet, it will get there very soon.

Whatever your personal feeling, judgement, or conviction on this matter; do not dismiss the other side because of a couple wingnuts saying crazy stuff (you can find them on both extremes as well as the middle). Stay curious as to why people have their own conviction, and seek the truth!

I see it only as a threat to those who have a deep hook into their role as a SWE.

If as a SWE you see the oncoming change and adapt to it, no issue.

If as a SWE you see the enablement of LLMs as an existential threat, then you will find many issues, and you will fail to adapt and have all kinds of issues related to it.

  • If a company wants to cut a lot of SWE it wouldn't matter if you have adapted or not, they will just cut as much as they can. How can you adapt more than another SWE? These tools seem easy to learn and there isn't a massive learning curve if you're a programmer. I wonder if changing to a different role would work but I would be skeptical, because after SWEs they will try to cut the other jobs and if there is no one at the bottom, middle management doesn't make sense.

  • Read the article. The author wants LLMs to work for them, but they don’t. You’re confusing results and intention.

What I really like about LLMs is that you can do pair programming without having to deal with humans.

if A.I maximalist gospel was true - we would see a company raising $10M Series A | Seed (these days)

spend 60% on A.I, 30% on Humans and 10% on operations but I can bet you my sole penny that's not happening - so we know someone is tryna sell us a polished turd as a diamond

  • At no point in history has humanity ever cut back on spending after some constraint got alleviated. Exact opposite, we always ramp spending up to chase new possibilities.

    If the A.I maximalist gospel was true we would see companies raising absurd seed and A rounds in record numbers. Which is exactly what we’re seeing.

I see how cool and powerful they're getting, but agree there is a huge insecurity element in the evangelism. Everyone wants to be seen as the one who will get a seat running the llms when the music stops playing.

I see a lot more insecurity from people who refuse to use AI coding tools. My teammates and I use this stuff all the time, and it's not making a statement, it's just an easier path sometimes.

I somewhat agree with this poster. However, I think the unfortunate reality of programming for money is that a mediocre programmer that pumps out millions of lines of slop that seems to drive the business forward and manages to hide disastrous bugs until after the contract / promotion cycle is over will get further ahead than the more competent programmer that delivers better, less buggy, less spaghetti code.

  • I mean, isn't driving the business forward really what matters (outside of academia, open source, and other such endeavors)? We live in a hyper competitive market. All else being equal, if company A can produce "millions of lines of slop", constantly living on the knife-edge of disaster but not falling over it, they will beat company B that artificially slows themselves down. Up until the point company A implodes, but that's not necessarily a given if pre-LLM companies are any indication.

    • Sounds like you should go bundle sub-prime mortgages into some complex securities, if you like intentionally living on the knife's edge of disaster.

    • This is not reality for most companies. Some have billions in the bank but still produce slop. It's because their internal systems reward slop.

  • Most of us are paid to solve problems and deliver features, not craft the most perfect code known to man.

    If the slop-o-matic next to you is delivering 5 features a week without tripping up QA and you do one every two weeks - which one will the company pick when layoffs hit again?

    • Most developer jobs are like working at Ikea, but most developers on HN pretend they are fine furniture craftsmen and that's really what everyone needs or the whole world will fall apart. It turns out, the vast majority of the population is quite happy with a LACK side table and their carefully crafted dovetail joinery adds nothing but expense to the average use case.

I'm dubbing this "podcast driven development" because so many of them aren't building things to build things, they just want to _have built something_ so they can go on podcasts and talk about how great it is.

For what it's worth, I think most of them are genuine when they say they're seeing 10X gains, they just went from, like, a 0.01X engineer (centi-swe) to a 0.1X engineer (deci-swe).

This is a fun piece to dissect because it's self-aware about being uncharitable, yet still commits the very sin it's criticizing.

The author's central complaint is that LLM evangelists dismiss skeptics with psychological speculation ("you're afraid of being irrelevant"). Their response? Psychological speculation ("you're projecting insecurity about your coding skills").

This is tu quoque dressed up as insight. Fighting unfounded psychoanalysis with unfounded psychoanalysis doesn't refute anything. It just levels the playing field of bad arguments.

The author gestures at this with "I am still willing to admit I am wrong" but the bulk of the piece is vibes-based counter-psychoanalysis, not engagement with evidence.

It's a well-written "no u" that mistakes self-awareness ("I know this isn't charitable") for self-correction.

"LLM evangelists - are you willing to admit that you just might not be that good at programming computers? Maybe you once were. Maybe you never were."

lol. is this supposed to be like some sort of "gotcha"! yes? like maybe i am a really shitty programmer and always just wanted to hack things together. what it has allowed me to do is prevent burnout to some extent, outsource the "boring" parts and get back to building things i like.

also getting tired of these extreme takes but whatever, it's #1 so mission accomplished. llms are neither this nor that. just another tool in the toolbox, one that has been frustrating in some contexts and a godsend in others and part of that process is figuring out where it excels and doesn't.

  • Not really.

    I use LLMs for things I am not good at. But I also know I am not good at them.

    No one is good at everything. 100% fine.

    • hmm, maybe you are not as good at using llms as you think then? lol jk.

      i mean if you have imposter syndrome then this feeling will always be prevalent. how do you know what you are good at or not? i might be competent enough to have progressed this far in my career as in "results", but comparison to people i consider "good" devs always puts in that doubt.

      i guess it strikes a chord when someone, in the same breath as claiming to be open minded, makes a backhanded comment that people who like llms might just be shitty programmers or whatever. i get the point, but that line doesn't quite land the way you think it does.

Evangelism of a new technique and tool for doing work is insecure? I agree it’s been oversold but it’s natural to be pretty excited about the tech.

I don't get who is saying this dreaded "you'll be left behind." The only place I see that is from straight-up slop accounts in the Twitter algo feed. Surely you're not letting those people make you feel bad.

> You see a lot of accomplished, prominent developers claiming they are more productive without it.

You also see a lot of accomplished, prominent developers claiming they are more productive with it, so I don't know what this is supposed to prove. The inverse argument is just as easy to make and just as spurious.

> LLM evangelists - are you willing to admit that you just might not be that good at programming computers? Maybe you once were. Maybe you never were.

Or maybe the author is bad at programming AND bad at agentic coding.

That’s more likely than the possibility that all llm evangelists are terrible coders.

> You tried agentic coding. You realised it was better at programming than you are.

We need to drop this competition paradigm ASAP.

I don't mind weighing in as someone who could fairly be categorized as both an LLM evangelist and "not an experienced dev".

It's a lot like why I've been bullish on Tesla's approach to FSD even as someone who owned an AP1 vehicle that objectively was NOT "self-driving" in any sense of the word: it's less about where the technology is right now, or even the speed the technology is currently improving at, and more about how the technology is now present to enable acceleration in the rate of improvement of performance, paired with the reality of us observing exactly that. Like FSD V12 to V14, the last several years in AI can only be characterized as an unprecedented rate of improvement, very much like scientific advancement throughout human society. It took us millions of years to evolve into humans. Hundreds of thousands to develop language. Tens of thousands to develop writing. Thousands to develop the printing press. Hundreds to develop typewriters. Decades to develop computers. Years to go from the 8086 to the modern workstations of today. The time horizon of tasks AI agents can now reliably perform is now doubling every 4 months, per METR.

Do frontier models know more than human experts in all domains right now? Absolutely not. But they already know far more than any individual human expert outside that human's domain(s) of expertise.

I've been passionate about technology for nearly two decades, working in the technology industry for close to a decade. I'm a security guy, not a dev. I have over half a dozen CVEs and countless private vuln disclosures. I can and do write code myself - I've been writing scripts for various network tasks for a decade before ChatGPT ever came into existence. That said, it absolutely is a better dev than me. But specialized harnesses paired with frontier models are also better security engineers than I am, dollar for dollar versus my cost. They're better pentesters than me, for the relative costs. These statements were not true at all without accounting for cost two years ago. Two years from now, I am fully expecting them to just be outright better at security engineering, pentesting, SCA than I am, without accounting for cost, yet I also expect they will cost less then than they do now.

A year ago, OpenAI's o1 was still almost brand new and test-time compute was this revolutionary new idea. Everyone thought you needed tens of billions to train a model as good as o1; it was still a week before DeepSeek released R1.

Now, o1's price/performance seems like a distant bad dream. I had always joked that one quarter in tech saw as much change as like 1 year in "the real world". For AI, it feels more like we're seeing more change every month than we do every year in "the real world", and I'd bet on that accelerating, too.

I don't think experienced devs still preferring to architect and write code themselves are coping at all. I still have to fix bugs in AI-generated code myself. But I do think it's short sighted to not look at the trajectory and see the writing on the wall over the next 5 years.

Stanford's $18/hr pentester that outperforms 9/10 humans should have every pentester figuring out what they're going to be doing when it doubles in performance and halves in cost again over the next year, just like human Uber drivers should be reading Motortrend's (historically a vocal critic of Tesla and FSD) 2026 Best Driver Assistance System and figuring out what they're going to do next. Experienced devs should be looking at how quickly we came from text-davinci-003 to Opus 4.5 and considering what their economic utility will look like in 2030.

  • I agree with your POV. Technology has been improving exponentially, and from a human POV having systems smarter than us can be good (if we have control over them, though whether that makes us smarter and more powerful is another discussion).

    I agree with this, but I just don't believe that the "promised AI" will be an LLM. I think it will be some other technology.

    > But specialized harnesses paired with frontier models are also better security engineers than I am, dollar for dollar versus my cost. They're better pentesters than me, for the relative costs.

    Don't mind me saying so, but some parts of cybersecurity are more brute-force in nature, which is where AI excels, and using that to claim frontier models are better security engineers than humans is a bit of an oversimplification. You are BETTER because of your X years of expertise in cybersecurity, and I can assure you that there is no frontier model out there with the context capacity to challenge you on that. That makes you 10x or even 100x more cost effective (dollar-wise). And your (perceived) cost will go on increasing with each added year of experience. AI can do things faster than you, but your judgement on which subtree/direction to traverse while pentesting makes you better than an LLM, and the cost of this (human) judgement will increase day by day because humans need other humans to take accountability.

  • 5 years? That seems generous. We are being threatened this summer (in some companies it's gonna be even earlier)

    • Yeah, I'm being a little generous/conservative here, but also, that 2030 estimate is more along the lines of the "everyone unambiguously understands AI is better than the experts in their respective domains", not for the much sooner "it becomes more economically viable to have AI devs than human devs".

> It's projection. Their evangelism is born of insecurity.

It's fear, but of a different kind. Those who are most aggressive and pushy about it are those who invested too much [someone else's] money in it and are scared that angry investors will come for their hides when reality doesn't match their expectations.

This has been my experience as well.

What I use LLMs for

I don't use Claude Code, Codex or any other hyped-up, VC-approved, buzzword-compliant, productivity-theater SF tool for actual day-to-day coding. I just use ChatGPT outside of my editor in browser.

The only tasks that I feel LLMs right now can do somewhat-reliably without human supervision include:

1. Search: To search and compare different things, which is boring to do manually. As an example, I checked how the M1 Pro compares to the M4. ChatGPT crawls the web to compare both processors on similar specs and provides a conclusion, but I need to check and verify the claims it reached by briefly reading the sources.

2. Documentation: I have to do things on time, so I offload reading documentation for syntax or how-to-do-it-with-this-library to ChatGPT.

3. Single task scripts: I let ChatGPT create boring, one-time-use Python scripts for trivial tasks.

I don't spend ChatGPT's limited but free GPT-5-(latest-model) tokens for such trivial tasks - I have a special keybinding on my mac that fires up ChatGPT in incognito for me.

But don't get me wrong, I find LLMs pretty useful for PoC-level stuff and I do like instructing my agents to make me a half-baked PoC that makes me feel like a revolutionary SF-founder for an afternoon before I delete the repo and go back to writing real code like a responsible adult. Because nothing says ‘disruptive genius’ like ctrl+z and moving on.

---

Why I find vibe coding in production not useful

Like Sean Goedecke, I feel that using LLMs in production is not useful and (in fact) anti-productive because it gives junior engineers a free pass to push code without actually learning/reading it (humans often tend to choose the easiest path to solve a given problem). And due to this, it's a pain to review junior PRs these days.

I have two reasons why LLMs are not good at production coding:

1. Context: There are enough discussions out on the internet that say LLMs are limited by context. Humans are better because they can keep multiple contexts in mind at once. An LLM knows how to solve the user's problem, but the engineer knows what business problem this piece of code solves, in which context the code will run, alongside which other processes on the same service and in the context of which other services, what company code idioms to follow while writing the code, and what will make your pair-programmer think that you are smart.

2. Lack of deterministic output: I would not oversimplify by saying it's just a probabilistic system, but you can feel it, right? It sometimes gives correct, decent answers and sometimes wrong answers for the same prompt. Sometimes it enlightens you with a very interesting insight/perspective and other times it is as dumb as a toaster trying to file your taxes - consistently inconsistent (once you reach context rot), yet somehow still sold as “intelligent.”

---

A brief rant

My manager fear-mongers me nearly every day, saying LLMs are really good at coding, there are at most 10 years left to earn as a programmer, and stuff. The fact is he hasn't touched code in the last 8 years.

He thinks his half-baked trivial ideas, translated to code by an overpriced probabilistic system, are a breakthrough for his company.

I am a non-native English speaker and you have probably sensed that by now. I often get tasks to write documentation, and every time my manager doesn't understand something, he takes it as a chance to break my confidence, saying that I am bad at using LLMs for language-related tasks because my English is poor.

But (at least right now) I strongly feel this is just hype:

- CEOs want to get-rich-quick or they want to increase company profits by "ai-enabling" it and getting salary hikes.

- Companies are hyping this tech to recalibrate engineer salaries that have skyrocketed since covid.

- My manager wants to make me feel replaceable.

  • If you do not at least have a tool to feed the context of the LLM then you will always have a b*tch of an experience, I believe. We tried out an LLM in our company which had almost no context management. It sucked. Everyone hated it.

    Cue Copilot (which is arguably not the ultimate toolchain), which can at least read the files it needs, either explicitly or implicitly. It is a whole other ballgame and it works 100x better.

mitchellh talked about how he vibe coded the one off visualization code for some blog post of his recently, and he seems like a fairly good programmer

  • For good craftspeople bad tools are still tools. For bad craftspeople tools, good ones and bad ones, are just a way to produce more crap.

See the same thing in the bitcoin space. If you ask them to explain the value to you, you're a moronic, behind-the-times, luddite boomer who just doesn't understand. Not to mention poor!

I'll remain skeptical and let the technology speak for itself, if it ever does.

Can we just agree that both the pro- and anti-LLM factions mostly contribute noise? And go back to discussing actual achievements?

It's trivial to share coding sessions, be they horrific or great. Without those, you're hot air on the internet, independent of whatever specific opinions on LLMs you voice.

LLMs are really great at copy/pasting answers from stack overflow and fitting them to work in a given system. If your work is outside what is answerable on stack overflow you're going to end up fighting the results constantly.

Front end pages like a user settings page? Done. One shottable.

Nuanced data migration problems specific to your stack? You're going to be yelling at the agent.

> LLM evangelists - are you willing to admit that you just might not be that good at programming computers? Maybe you once were. Maybe you never were.

A bit harsh considering that many of us used knowledge bases like SO for so long to figure out new problems that we were confronting.

  • > Front end pages like a user settings page? Done. One shottable.

    This is only one shottable if you are a fast-paced startup or you don't care enough. In real world software, you would need to make it accessible, store data in a compliant way, hook up translations, make sure all inputs are validated and do some usability testing.

  • From my heavy experience using every frontier model for a year now, LLMs are actually probably much, much better at nuanced data migration problems specific to your stack than at a frontend user settings page. (Though still pretty good at both. And the user settings page will work, sure.)

"And it made me think - why are these people so insistent, and hostile? Why can't they live and let live? Why do they need to convince the rest of us?"

Same could be said about the anti-AI crowd.

I'm glad the author made the distinction that he's talking about LLMs, though, because far too many people these days like to shout from the rooftops about all AI being bad, totally ignoring (willfully or otherwise) important areas it's being used in like cancer research.