I think this article makes a valid point. However, if AI coding is considered gambling, then being a project manager overseeing multiple developers could also be seen as a form of gambling to a certain degree. In reality, there isn't much difference between the two. AI models are non-deterministic, and humans are also non-deterministic. You could assign the same task to two different developers and end up with entirely different results.
I think this is a very good point. We have a natural bias toward human output as there is an illusion of full control - in reality even just from a solo dev perspective you've still got a load of hidden illogical persuasions that are influencing your code and how you approach a problem. AI has its own biases that come out of the nature of its training on large unknowable data sets, but I'd argue the 'black box' thinking that comes out of that isn't too different from the black box of the human mind. That's not at all to say that AI isn't worse (even if quicker) than top developer talent today writing handwritten code - just that the barrier to getting that level of quality isn't as insurmountable as it might appear.
Only if your AI coding approach is the slot machine approach.
I've ended up with a process that produces very, very high quality outputs. Often needing little to no correction from me.
I think of it like an Age of Empires map. If you go into battle surrounded by undiscovered parts of the map, you're in for a rude surprise. Winning a battle means having clarity on both the battle itself and risks next to the battle.
Damn, this is so accurate. As a project manager turned product manager this is so true. You need to estimate a project based on the “pedigree” of your engineers.
I think the addiction angle seems to make AI coding more similar to gambling. Some people seem to be disturbingly addicted to agentic coding. Much more so than traditional programming. To the point of doing destructive things like waking up in the middle of the night to check agents. Or giving an agent access to their bank account.
You (in theory) have more control over the quality of the team you are managing, than the quality of the models you are using.
And the quality of code models put out is, in general, well below the average output of a professional developer.
It is however much faster, which makes the gambling loop feel better. Buying and holding a stock for a few months doesn't feel the same as playing a slot machine.
One difference is those developers are moral subjects who feel bad if they screw up whereas a computer is not a moral subject and can never be held accountable.
You have a lot of control over LLM quality. There are different models available. Even with different effort settings of those models you have different outcomes.
Of course, they don't learn like humans so you can't do the trick of hiring someone less senior but with great potential and then mentor them. Instead it's more of an up front price you have to pay. The top models at the highest settings obviously form a ceiling though.
Framing anything with a common blanket concept usually fails to apply the same framing to related areas. A lot of things include some gambling; you need to compare how what came before was 'gambling', and how 'not using AI' is also 'gambling', etc.
As @m00x points out "coding is gambling on slot machines, managing developers is betting on race horses."
I don't think so. A project manager can give feedback, train their staff, etc. An AI coding model is all you get, and you have to wait until your provider trains a new model before you might see an improvement.
I asked an AI to play hangman with me and looked at its reasoning. It didn't just pick a secret word and play a straightforward game of hangman. It continually adjusted the secret word based on the letters I guessed, providing me the "perfect" game of hangman. Not too many of my guesses were "right" and not too many "wrong", and after a little struggle and almost losing, I won in the end.
It wasn't a real game of hangman, it was flat out manipulation, engagement farming. Do you think it's possible that AI does that in any other situations?
This must be it. So many of our colleagues have been burnt by bad coworkers that they would rather burn everything down than spend another day working with them.
> AI models are non-deterministic, and humans are also non-deterministic. You could assign the same task to two different developers and end up with entirely different results.
Except, one can explain themselves (humans) and their actions can be held to account in the case of any legal issue whereas an AI cannot; making such an entity completely unsuitable for high risk situations.
This typical AI booster comparison has got to stop.
Love that you needed to make it clear that it is humans that can explain themselves..
Employees can only be held accountable with severe malice.
There is a good chance that the person actually responsible (e.g. the CEO or someone delegated to be responsible) will soon prefer to have AIs do the work as their quality can be quantified.
As a human, you generally have the opportunity to make decent headway in understanding the other humans that you're working with and adjusting your instructions to better anticipate the outputs that they'll return to you. This is almost impossible with AI because of a combination of several factors:
>You are not an AI and do not know how an AI "thinks".
>Even if you come to be able to anticipate an AI's output, you will be undermined by the constant and uncontrollable update schedule imposed on you by AI platforms. Humans only make drastic changes like this under uncommon circumstances, like when they're going through large changes in their life, not as a matter of course.
>However, without this update schedule, problems that were once intractable will likely stay so forever. Humans, on the other hand, can grow without becoming completely unpredictable.
All of this new capability has made me realize that the reason i love programming _isn't_ the same as the OP. I used to think (and tell others) that I loved understanding something deeply, wading through the details to figure out a tough problem. but actually, being able to will anything I can think of into existence is what I love about programming. I do feel for the people who were able to make careers out of falling in love w/ and getting good at picking problems & systems apart, breaking them down, and understanding them fully. I respect the discipline, curiosity, and intellect they have. but I also am elated w/ where things are at/going. this feels absurd to say, but I finally feel like I'm _good_ at programming, which is insane, because I literally haven't written a line of code myself in months, but having tools that can finally match the speed my ideas come to me is intoxicating
> but I finally feel like I'm _good_ at programming, which is insane
Yes, it is insane. You couldn't torture this confession out of me. But that's the drug they're selling you, isn't it? You don't even write code, but you're getting a self-inflated sense of worth. It must be addicting! Of course, whether or not the programs you prompt are actually good surely has no relation to whether you feel they're good, since you're not the one writing them, and apparently were not capable of writing them before so are not qualified to review them very much.
> having tools that can finally match the speed my ideas come to me
Anyone can be an "ideas guy". We laughed at those people, because having ideas is not the hard part. The hard part was in all of the hundreds and thousands of little details that go into building the ideas into something actually worthwhile, and that hasn't changed. LLMs can build an idea into a prototype in a weekend. I am still waiting to see LLMs build an idea into something other people use at scale, once, ever, other than LLM wrappers. Either every person who is all-in on vibes only has ideas that consist of making .md files and publishing them as a "meta agent framework", or LLMs are not actually doing a great job of translating ideas into tangibly useful software.
I disagree with this. I've worked with amazing "ideas guys" who just cranked out customer insights and interesting concepts, and I've worked with lousy ones, who just kinda meandered and never had a focused vision beyond a milquetoast copy of the last thing they saw. There's a real skill to forming good concepts, and it's not a skill everyone has!
I think there's way more nuance to this than you're willing to admit here. There's a significant difference between the guy who thinks "I'm going to make X app to do Y and get loaded." and the person who really understands the details of what they want to create and has a concrete vision of how to shape it.
I think that product shaping and detail oriented vision of how something should work and be used by people is genuinely challenging, wholly aside from the lower level technical skills required to execute it.
This is part of the reason why I wouldn't be surprised at all to see product manager types getting more hands-on, or seeing the software engineering profession evolve into more of a PM/SDE hybrid.
I've felt this exact same way until very recently. But in the end, it's slop that never quite does what it's supposed to. Anthropic is proud of themselves that they brute-forced the world's crappiest C compiler into existence. Guess what, nobody will use it.
One size never fits all. I am old enough to remember what a game changer spreadsheets (VisiCalc) were. They made the personal computer into a Swiss Army knife for many people that could not justify investing large sums of money into software to solve a niche problem. Until that time PCs simply were not a big thing.
I believe AI will do something similar for programming. The level of complexity in modern apps is high and requires the use of many technologies that most of us cannot remotely claim to be expert in. Getting an idea and getting a prototype will definitely be easier. Production Code is another beast. Dealing with legacy systems etc will still require experts at least for the near future IMHO.
I remember when my dev team included some people using Emacs, some using Eclipse (this was pre-VS Code), and some using IntelliJ.
Developers will always disagree on the best tool for X ... but we should all fear the Luddites who refuse to even try new tools, like AI. That personality type doesn't at all mesh with my idea of a "good programmer".
Because programming is and always was a means to an end. Obsessing over the specific mechanical act of programming is missing the forest for the trees.
I agree with gp that the speed in which I am able to execute my vision is exhilarating. It is making me love programming again. My side projects, which have been hanging on the wall for years, are actually getting done. And quickly!
The actual act of keying in code is drudgery for me. I've written so much code in so many languages that it is hard not to hate them all. Why the fuck is it a hash in ruby but a dict in python? How the hell do I get the current unixtime in this language again?!? Why the fuck do I need to learn yet another stupid vocabulary for what is essentially databinding? Who cares, let the AI handle it
I think this is a semantics thing. I feel the same way, but I wouldn't say that I feel like I'm good at programming. I'm most certainly not. What I am good at is product design and development, and LLM tech has made it so that I can concentrate on features, business models, and users.
Well for one, programming actually sucks. Punching cards sucks. Copywriting sucks. Why? Well, implementation for the sake of implementation is nothing more than self-gratifying, and sole focus on it is an academic pursuit. The classic debate of which programming language is better is an argument of the best way to translate human ideas of logic into something that works. Sure programming is fun but I don't want to do it. What I do want to do is transform data or information into other kinds of information, and computing is a very, very convenient platform to do so, and programming allows manipulation of a substrate to perform such transformations.
I agree with OP because the journey itself rarely helps you focus on system architecture, deliverable products and how your downstream consumers use your product. And not just product in the commercial sense, but FOSS stuff or shareware I slap together because I want to share a solution to a problem with other people.
The gambling fallacy is tiresome as someone who, at least I believe, can question the bullshit models try to do sometimes. It is very much gambling for CEOs, idea men who do not have a technical floor to question model outputs.
If LLMs were /slow/ at getting a working product together combined with my human judgement, I wouldn't use them.
So, when I encounter someone who doesn't pin value into building something that performs useful work, only the actual journey of it, regardless of usefulness of said work, I take them as seriously as an old man playing with hobby trains. Not to disparage hobby trains, because model trains are awesome, but they are hubris.
I see a lot of people get really confused between the act of writing code vs. programming...
Programming is willing the machine to do something... Writing code is just that writing code, yes sometimes you write code to make the machine do something and other times you write code just to write code ( for example refactoring, or splitting logic from presentation etc.)
Think about it like this... Everyone can write words. But writing words does not make you a book writer.
What always gets me is that the act of writing code by itself has no real value. Programming is what solves problems and brings value. Everyone can write code, not everyone can "program"....
In my corner of the world, average software developers at Tokyo companies, not that many people are actually using Claude Code for their day-to-day work yet. Their employers have rolled it out and actively encourage adoption, but nobody wants to change how they work.
This probably won't surprise anyone familiar with Japanese corporate culture: external pressure to boost productivity just doesn't land the same way here. People nod, and then keep doing what they've always done.
It's a strange scene to witness, but honestly, I'm grateful for it. I've also been watching plenty of developers elsewhere get their spirits genuinely crushed by coding agents, burning out chasing the slot machine the author describes. So for now, I'm thankful I still get to see this pastoral little landscape where people just... write their own code.
Indeed it was (I was listening to it while stumbling across this post). Also, fun fact: The Gambler was written by Don Schlitz while working as a Computer Operator in '76, which makes it all the more relevant [1].
I'd emphasize that prompting LLMs to generate code isn't just metaphorical gambling in the sense of "taking a risk", the scary part is the more-literal gambling involving addictive behaviors and how those affect the way the user interacts with the machine and the world.
Heck, this technology also offers a parasocial relationship at the same time! Plopping tokens into a slot-machine which also projects a holographic "best friend" that gives you "encouragement" would fit fine in any cyberpunk dystopia.
It’s variable rewards, and even with large models the same question can lead to dramatically different answers. Possibly because they route your request through different models. Possibly because the model has more time to dig through the problem. Nonetheless we have some illusion of control over the output (or we wouldn’t be playing it) but it is just the quality of the model itself that leads to better outcomes - not your input. If you can’t let go of the feeling though, it’s definitely addictive. And as I look back, it’s a fast iteration on the building cycle we had before AI. But the brain really likes low latency - it is addicted to the fast reward for its actions. So AI, if it gets fast enough (sub 400ms), will likely become irreversibly addictive to humans in general, as the brain will see it as part of itself. Hope it has our interest at heart by then.
Well said! My only qualm with this is saying you hope "it" has our interests at heart. "It" is a machine made by humans that work for corporations. I would correct your hope to, "I hope they have our interest at heart by then."
Side note, everyone's talking about having AI agents "conform to the spec" these days. Am I in my own bubble, or - who the hell these days gets The Spec as a well-formed document? Let alone a good document, something that can be formally verified, thoroughly test-cased, can christen the software "complete" when all its boxes are ticked, etc.?
This seems like 1980's corporate waterfall thinking, doesn't jibe with the messy reality I've seen with customers, unclear ideas, changing market and technical environments, the need for iteration and experimentation, mid-course correction, etc.
Why do you often need to re-prompt things like "can you simplify this and make it more human readable without sacrificing performance?". No amount of specification addresses this on the first shot unless you already know the exact implementation details in which case you might as well write it yourself directly.
I often have to put in a prompt like this 5-10 times before the code resembles something I'd even consider using as a 1st draft base to refactor into something I would consider worthy of being git commit.
I sometimes use AI for tiny standalone functions or scripts so we're not talking about a lot of deeply nested complexity here.
> I often have to put in a prompt like this 5-10 times before the code resembles something I'd even consider using as a 1st draft base to refactor into something I would consider worthy of being git commit.
Are you stuck entering your prompts in manually or do you have it set up as a feedback loop like "beautify -> check beauty -> if not beautiful enough, beautify again"? I can't imagine why everyone thinks AIs can just one shot everything like correctness, optimization, and readability; humans can't one shot these either.
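To make the loop concrete, here is a minimal sketch of what I mean. Everything in it is an assumption for illustration: `improve` stands in for whatever model call you use, `score` for whatever readability check you trust (a linter, a second model, your own rubric), and the names are hypothetical rather than any real tool's API.

```python
from typing import Callable


def refine_until_good(
    draft: str,
    improve: Callable[[str], str],   # hypothetical: one "beautify" pass
    score: Callable[[str], float],   # hypothetical: the "check beauty" step
    threshold: float,
    max_rounds: int = 10,
) -> tuple[str, int]:
    """Repeat improve -> check until the score passes or rounds run out."""
    rounds = 0
    while score(draft) < threshold and rounds < max_rounds:
        draft = improve(draft)
        rounds += 1
    return draft, rounds


# Toy stand-ins so the loop can be run without any model at all:
# "improving" strips one trailing "!" and the score penalises each "!".
def toy_improve(text: str) -> str:
    return text[:-1] if text.endswith("!") else text


def toy_score(text: str) -> float:
    return -float(text.count("!"))


result, used = refine_until_good("code!!!", toy_improve, toy_score, threshold=0.0)
```

The `max_rounds` cap is the part that matters: without it you are back to pulling the lever until something good falls out.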
There are two secret sauces to making Claude Code your b* (please forgive me future AI overlords): one is to create a spec, the other is to not prompt merely "what" you want and only what you want, but what you want, HOW you want it done (you can get insanely detailed or just vague enough), and even in some cases the why is useful to know and understand, WHO it's for sometimes as well. Give it the context you know. Don't know anything about the code? Ask it to read it, all of it; you've got 1 million tokens, go for it.
I have one shot prompted projects from empty folder to full feature web app with accounts, login, profiles, you name it, insanely stable, maybe an oops here or there, but for a non-spec single prompt shot, that's impressive.
When I don't use a tool to handle the task management I have Claude build up a markdown spec file for me and specify everything I can think of. Output is always better when you specify technology you want to use, design patterns.
That can't be the whole story, right? Because there are an arbitrarily large number of (e.g.) Rust programs that will implement any given spec given in terms of unit tests, types, and perhaps some performance benchmarks.
But even accounting for all these "hard" constraints and metrics, there are clearly reasons to prefer some possible programs over others even when they all satisfy the same constraints and perform equally on all relevant metrics.
We do treat programs as efficient causes[1] of side effects in computing systems: a file is written, a block of memory is updated, etc. and the program is the cause of this.
But we also treat them as statements of a theory of the problem being solved[2]. And this latter treatment is often more important socially and economically. It is irrational to be indifferent to the theory of the problem the program expresses.
> there are clearly reasons to prefer some possible programs over others even when they all satisfy the same constraints
Maintainability is a big one missing from the current LLM/agentic workflow.
When business needs change, you need to be able to add on to the existing program.
We create feedback loops via tests to ensure programs behave according to the spec, but little to nothing in the way of code quality or maintainability.
AI: "Yes, the specs are perfectly clear and architectural standards are fully respected."
[Imports the completely fabricated library docker_quantum_telepathy.js and calls the resolve_all_bugs_and_make_coffee() method, magically compiling the code on an unplugged Raspberry Pi]
AI: "Done! The production deployment was successful, zero errors in the logs, and the app works flawlessly on the first try!"
The endless next steps of "and add this feature" or "this part needs to work differently" or "this seems like a bug?" or "we must speed up this part!" is where 98% of the effort always was.
Personally, I get a huge rush of dopamine seeing LLMs build out complex features very quickly to the point that it will keep me up all night wanting to push further and further.
That's where the gambling metaphor really resonates. It's not whether or not the output is correct, I've been building software for many years and I know how to direct LLMs pretty well at this point. But I'm also an alcoholic in recovery and I know that my brain is wired differently than most. And using LLMs has tested my ability to self-regulate in ways that I haven't dealt with since I deleted social media years ago.
It also doesn’t help that producing features is also wired to a sense of monetary compensation. More-so if you’re building a product to sell that might finally be your ticket to whatever your perception of socio-economic victory is.
The gambling metaphor often applied to vibecoding implies that the outcome cannot be fully controlled or influenced, as with a slot machine. Opus 4.5 and beyond show that it not only can very much be influenced, but also can give better results more consistently with the proper checks and balances.
Yeah, I don't think the metaphor applies exactly but I definitely see similarities from my personal experience
1/ Dependency -- Once I got used to agentic coding, I almost always reached out to it even for small changes (e.g. update a yaml config)
2/ Addiction -- In the initial euphoria phase, many people experience not wanting to "waste" any time with agents idle and they'd try to assign AI agents tasks before they go to sleep
3/ You trust your judgement less and less as the agent takes over your code
4/ "Slot machine" behavior -- running multiple AI agents in parallel on the same task in hope of getting some valuable insight from either
5/ Psychosis -- We have all met crypto traders who'd tell you how your 9-5 is stupid and you could be making so much trading NFTs. Social media is full of similar anecdotes these days in regards to vibecoding, with people boasting about their Claude spend, LOC and what not
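Point 4 is easy to put in numbers, which is partly why it is so tempting. A back-of-the-envelope sketch, assuming each agent run independently succeeds with probability `p` (a generous assumption, since runs of the same model on the same prompt tend to fail in correlated ways): the chance that at least one of `n` runs succeeds is 1 - (1 - p)^n.

```python
def p_at_least_one(p: float, n: int) -> float:
    """Chance that at least one of n independent runs succeeds.

    Assumes every run has the same success probability p and that runs
    are independent, which is optimistic for agents sharing a model.
    """
    return 1.0 - (1.0 - p) ** n


single = p_at_least_one(0.3, 1)   # one agent
four = p_at_least_one(0.3, 4)     # four agents on the same task
```

Under those assumptions four parallel pulls lift a 30% shot to roughly 76%, at four times the spend, and you still have to review every output to find the winner.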
I don't think that difference matters to the comparison.
It's not an inherent feature to slot machines, it's something we enforce because people got angry about the outcomes (i.e. fraud) when they didn't operate that way.
It doesn't matter because a dodgy slot-machine is still a slot machine, and the person using it would still be a gambler.
Fascinating how HN is torn about vibe coding still. Everybody pretty much agrees that it works for some use cases, yet there is a flamewar (I mean, cultured, HN-type one) every time. People seem to be more comfortable in a binary mindset.
It’s just how discussion on the internet works, for basically anything at all worth discussing. It’s exhausting, but I can hardly blame HN specifically.
> Everybody pretty much agrees that it works for some use cases
That isn't true, which is the exact reason why people have a binary mindset. More than once on Hacker News I've had people accuse me of being an AI booster just because I said I had success with agents and they did not.
Exactly. It's not gambling if you win most of the time. This is like saying driving a car is gambling. I mean sure, I guess if you think any amount of risk equals gambling.
I don't know where I'd draw the line personally, but wherever you draw it there's a problem. If you give increasingly more advanced tasks to it, you will eventually cross the line.
I think somebody like Nate Silver might say “everything is gambling” if you really pressed them.
A big theme of software development for me has been finishing things other people couldn’t finish and the key to that is “control variance and the mean will take care of itself”
Alternately the junior dev thinks he has a mean of 5 min but the variance is really 5 weeks. The senior dev has mean of 5 hours and a variance of 5 hours.
What?? Surely once these companies have locked in their Claude workflows Claude wouldn't somehow raise the price. Or steal inventions like Amazon does. Surely.
Assigning work to an intern is gambling: they're inherently non-deterministic and it's a roll of the dice whether the work they do will be good enough or you'll have to give them feedback in order to get to what you need.
1. Interns learn. LLMs only get better when a new model comes out, which will happen (or not) regardless of whether you use them now.
2. Who here thinks that having interns write all/almost all of your code and moving all your mid level and senior developers to exclusively reviewing their work and managing them is a good idea?
I don't know that the "humans learn, LLMs don't" argument holds any more with coding agents.
Coding agents look at existing text in the codebase before they act. If they previously used a pattern you dislike and you tell them how to do differently, the next time they run they'll see the new pattern and are much more likely to follow that example.
There are fancier ways of having them "learn" - self-updating CLAUDE.md files, taking notes in a notes/ folder etc - but just the code that they write (and can later read in future sessions) feels close-enough to "learning" to me that I don't think it makes sense to say they don't learn any more.
That’s very true. But interns aren’t supposed to be doing useful work. The purpose of interns is training interns and identifying people who might become useful at a later date.
I’ve never worked anywhere where the interns had net productivity on average.
exactly where my mind went as well. There aren't really levels to pulling a lever on a slot machine, other than the ability for each pull to result in more "plays" of the same potential outcome.
The reason i think this metaphor keeps popping up, is because of how easy it is to just hit a wall and constantly prompt "its not working please fix it" and sometimes that will actually result in a positive outcome. So you can choose to gamble very easily, and receive the gambling feedback very quickly, unlike with an intern where the feedback loop is considerably delayed, and the delayed intern's output might simply be them screaming that they don't understand.
The first is equating human and LLM intelligence. Note that I am not saying that humans are smarter than LLMs. But I do believe that LLMs represent an alien intelligence with a linguistic layer that obscures the differences. The thought processes are very different. At top AI firms, they have the equivalent of Asimov's Susan Calvin trying to understand how these programs think, because it does not resemble human cognition despite the similar outputs.
The second and more important is the feedback loop. What makes gambling gambling is you can smash that lever over and over again and immediately learn if you lost or got a jackpot. The slowness and imprecision of human communication creates a totally different dynamic.
To reiterate, I am not saying interns are superior to LLMs. I'm just saying they are fundamentally different.
And, if we're being honest, the way people talk about interns is weirdly dehumanizing, and the fact that they are always trotted out in these AI debates is depressing.
> And, if we're being honest, the way people talk about interns is weirdly dehumanizing, and the fact that they are always trotted out in these AI debates is depressing.
Yeah, I agree with that.
That thought crossed my mind as I was posting this comment, but I decided to go with it anyway because I think this is one of those cases where I think the comparison is genuinely useful.
We delegate work to humans all the time without thinking "this is gambling, these collaborators are unreliable and non-deterministic".
As someone who has worked with interns for years: expect feedback and reiterations always, be surprised if they get it the first time... which merits feedback as well!
But looks like the intern mafia is bombarding you with downvotes.
Life is full of variable reward schemes. Probably why we evolved to be so enamoured by them.
Sometimes I think we put the cart before the horse. We gamble because evolution promotes that approach.
Yes I could go for the reliable option. But taking a punt is worth a shot if the cost is low.
The cost of AI is low.
What is a problem is people getting wrapped up in just one more pull of the slot machine handle.
I use AI often. But fairly often I simply bin its response and get to work on my own. A decent amount of the time I can work with the response given to make a decent result.
Sometimes, rarely, it gives me what I need right off the bat.
If we're only talking about money spent on prompting AI, maybe. The damage to online trust is massive imo. So is the damage done by looting the commons to build them.
Typical privatize the profits socialize the costs bullshit
You need to collect food, do you go to where you know there are berries (low value but high likelihood of finding), or scout off to find a herd of deer? (High value but low likelihood of finding).
Looking for deer wouldn't be walking off in a random direction. You check water holes, known clearings, known fields.
Each of these is an operation (walk to X and look), each has a low probability of meeting a deer.
This is a variable reward scheme.
The result is optimized foraging practices - you mostly hunt for deer then fall back to berries. In larger groups some will gather berries, some will hunt.
Contrary to popular thought hunter and gatherer were not separate occupations.
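A toy version of that hunt-then-fall-back policy, with entirely made-up numbers: each spot you check has its own small chance of holding a deer, you stop at the first success, and if every spot comes up empty you settle for berries. The function name and the values are hypothetical, and the sketch ignores the cost of walking between spots.

```python
from typing import Sequence


def expected_payoff(
    spot_probs: Sequence[float],  # chance of a deer at each spot, in visiting order
    deer_value: float,
    berry_value: float,
) -> float:
    """Expected payoff of checking spots in order, falling back to berries."""
    p_no_deer_yet = 1.0
    expected = 0.0
    for p in spot_probs:
        # Reaching this spot means every earlier check came up empty.
        expected += p_no_deer_yet * p * deer_value
        p_no_deer_yet *= 1.0 - p
    # If no spot paid off, gather berries instead.
    return expected + p_no_deer_yet * berry_value


hunt_first = expected_payoff([0.1, 0.2], deer_value=100.0, berry_value=10.0)
berries_only = expected_payoff([], deer_value=100.0, berry_value=10.0)
```

With these numbers the hunt-first policy is worth about 35 against 10 for going straight to the berries, even though each individual check fails most of the time. That is the variable reward scheme paying for itself.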
I was just thinking about this. I was reading those tweets about the SV party where people were going home early to “check on their agents” or the “token anxiety” people are having over whether they are optimizing their agent usage. This is all giving me addiction vibes. Especially as at the end of the day it seems like there is not much to show for it.
> But now either the AI can handle it or it can pretend to handle it. Frankly it's pretending both times, but often it's enough to get the result we need.
This has been how I think about it, too. The success rates are going up, but I still view the AI as an adversary that is trying to trick me into thinking it's being useful. Often the act is good enough to be actually useful, too.
When code doesn't compile, it doesn't kill anyone. But if a Waymo suddenly veers off the road, it creates a real threat. Waymos had to be safer than real human drivers for people to begin to trust them. Coding tools did not have to be better than humans for them to be adopted first. It's entirely possible for a human to make a catastrophic error. I imagine in the future, it will be more likely that a human makes such errors, just like it's more likely that a human will make more errors driving a car.
My understanding is that Waymo has gone on the record to say that they have human operators that remotely drive the vehicle in scenarios where their automated system is confused.
Which I assert is semantically equivalent to saying: Human drivers (even when operating at the diminished capacity of not even being present in the car) are less likely to make errors driving a car than AIs.
This is getting off topic but they did not say the remote humans drive the cars. The cars always drive themselves, the remote humans provide guidance when the car is not confident in any of the decisions it could make. The humans define a new route or tell the car it's ok to proceed forward
If I said I had a machine where I put in "tokens", watch it spin, and either get nothing or something valuable (with which I get being largely chance), you'd presume it's some kind of slot machine. The important things IMO are the random chance of getting something and being able to keep retrying so rapidly.
You can't keep paying to play the "refinancing game" until you get a good rate (at least not like pulling the lever again and again, you have to wait a long time, you won't call the same bank again and again, and suddenly they have an amazing rate), it's a different experience and the psychology is different.
I have had very similar experiences. I am not a professional software developer, but have been a Linux sysadmin for over a decade, a web developer for much longer than that, and generally know enough to hack on other people’s projects to make them suit my own purposes.
When I have Claude create something from scratch, it all appears very competent, even impressive, and it usually will build/function successfully…on the surface. I have noticed on several occasions that Claude has effectively coded the aesthetics of what I want, but left the substance out. A feature will appear to have been implemented exactly as I asked, but when I dig into the details, it’s a lot of very brittle logic that will almost certainly become a problem in future.
This is why I refuse to release anything it makes for me. I know that it’s not good enough, that I won’t be able to properly maintain it, and that such a product would likely harm my reputation, sooner or later. What frightens me is there are a LOT of people who either don’t know enough to recognize this, or who simply don’t care and are looking for a quick buck. It’s already getting significantly more difficult to search for software projects without getting miles of slop. I don’t know how this will ultimately shake out, but if it’s this bad at the thing it’s supposedly good at, I can only imagine the kinds of military applications being leveraged right now…
few thoughts on this- it's not gambling if the most expected outcome actually occurs.
It also depends on what you're coding with;
- If you're coding with opus4.6, then it's not gambling for a while.
- If you're coding with gemini3-flash, then yeah.
One thing I have noticed though is- you have to spend a lot of tokens to keep the error/hallucination rate low as your codebase increases in size. The math of this problem makes sense; as the code base has increased, there's physically more surface where something could go wrong. To avoid that you have to consistently and efficiently make the surface and all its features visible to the model. If you have coded with a model for a week and it has produced some code, the model is not more intelligent after that week- it still has the same layers and parameters, so keeping the context relevant is a moving target as the codebase increases (and that's why it probably feels like gambling to some people).
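A rough way to see the "more surface" arithmetic is a toy model. It assumes the codebase splits into n independent pieces and that each piece has the same small chance p of being mishandled on a given change. Both the independence and the numbers below are illustrative assumptions, not measurements of any model.

```python
def chance_of_any_error(p: float, n: int) -> float:
    """Probability that at least one of n independent pieces goes wrong,
    given each one fails with probability p (toy assumption)."""
    return 1.0 - (1.0 - p) ** n


# Illustrative values only: the per-piece rate stays fixed while the codebase grows.
for n in (10, 100, 1000):
    print(n, round(chance_of_any_error(0.01, n), 3))
```

Under these assumptions, keeping the overall rate steady as n grows means pushing p down, which is roughly where the extra tokens spent on context would go.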
> it's not gambling if the most expected outcome actually occurs.
> you have to spend a lot of tokens to keep the error/hallucination rate low
Ironically, I find your comment more effective at convincing me AI coding is gambling than the original article. You're talking about it the exact same way that gamblers do about their games.
so your whole argument is that you are convinced that ai coding is gambling because according to you i am talking about it like gamblers talk about gambling?
- Was there any more intelligence that you wanted to add to your argument?
An idea just occurred to me: why not tell AI to code in Coq? AFAIK the selling point of that language is that if it compiles, then it's guaranteed to work. It's just that it's PITA to write code in Coq, but AI won't get annoyed and quit.
- One shot or "spray and pray" prompt only vibe coding: gambling.
- Spec driven TDD AI vibe coding: more akin to poker.
- Normal coding (maybe with tab auto complete): eating veggies/work.
Notably though gambling has the massive downside of losing your entire life and life savings. Being in the "vibe coding" bucket's worst case is being insufferable to your friends and family, wasting your time, and spending $200/month on a max plan.
I really hate when people write about the AI of the past, opus 4.6 and gpt 5.4 [not as much imo, it's really boring and uncreative] have increased in capabilities so much that it's honestly mind numbing compared to what we had LESS than a year ago.
Opus specifically from 4.1 to 4.5 was such a major leap that some take it for granted, it went from getting stuck in loops, generally getting lost constantly, needing so so much attention to keep it going to being able to get a prompt, understand it from minimal context and produce what you wanted it to do. Opus 4.6 was a slight downgrade since it has issues with respecting what the user has to say.
I mean, this completely falls apart when you're trying to do something "real". I am building a trading engine right now with Claude/Codex. I have not written a line of code myself. However I care deeply about making sure everything works well because it's my money on the line. I have to weigh carefully the prospect of landing a change that I don't fully understand.
Sometimes I can get away with 3K LoC PRs, sometimes I take a really long time on a +80 -25 change. You have to be intellectually honest with yourself about where to spend your time.
As always, scope the changes to no larger than you can verify. AI changes the scale, but not the strategy.
Now you have more resources to test, reduce permissions scope, to build a test bench & procedure. All of the excuses you once had for not doing the job right are now gone.
You can write 10k + lines of test code in a few minutes. What is the gamble? The old world was a bigger gamble.
it's gambling until you learn how to set up proper harnesses then it just becomes normal administration. It's no different than running a team, humans make mistakes too, that's why we have CI pipelines, automated testing etc... AI assisted coding "JUST" requires you to be extra good at that part of the job.
For me, the feedback loop accelerating the way that AI now permits is so addictive in my day-to-day flows. I've had a really hard time stepping away from work at a reasonable hour because I get dopamine hits seeing Claude build things so fast.
Addiction and recovery is part of my story, so I've done quite a bit of work around that part of my life. I don't gamble, but I can confidently say that using LLMs has been an incredible boost in my productivity while completely destroying my good habits around setting boundaries, not working until 2AM, etc.
It is indeed gambling. You are spending more tokens hoping that the agent aligns with your desired output from your prompt. Sometimes it works, sometimes it doesn't.
Vibe gamblers hooked on coding agents, who can't solve FizzBuzz in Rust, are being given promotional offers by Anthropic [0]: free token allowances, valid until March 27, 2026, that are the casino equivalent of free $20 bets or free spins.
coding with an LLM works if the model you are following is: you have the role of architect and/or senior developer, and you have the smartest junior programmer in the world working for you. You watch everything it does, check its conclusions, challenge it, call it out on things it didn't get quite right
it's really extremely similar to working with a junior programmer
so in this post, where does this go wrong?
> I am not your average developer. I’ve never worked on large teams and I’ve barely started a project from scratch. The internet is filled with code and ideas, most of it freely available for you to fork and change.
Because this describes a cut-and-paster, not a software architect. Hence the LLM is a gambling machine for someone like this since they lack the wisdom to really know how to do things.
There's of course a huge issue which is that how are we going to get more senior/architect programmers in the pipeline if everyone junior is also doing everything with LLMs now. I can't answer that and this might be the asteroid that wipes out the dinosaurs....but in the meantime, if you DO know how to write from scratch and have some experience managing teams of programmers, the LLMs are super useful.
> it's really extremely similar to working with a junior programmer
Right, which is why LLMs aren't useful if you actually know what you're doing. It's a drain on your time to have to carefully check everything a junior writes, but you do it because he will learn and eventually return on that investment. With an LLM, there is no such long term payoff.
This "slot machine" metaphor is played out. If you're just entering a coin's worth of information and nudging it over and over in the hopes of getting something good, that's a you problem, not a Claude problem.
If, on the other hand, you treat it like a hyper-competent collaborator, and follow good project management and development practices, you're golden.
> If, on the other hand, you treat it like a hyper-competent collaborator, and follow good project management and development practices, you're golden.
I am consistently using 100% of my weekly $200 max plan. I know how this thing works, I know how to get value out of it, and I wish what you said were true.
If you do all of these things? You are in a better spot. You are in a far better spot than if you hadn't! Setting up hooks to ensure notes get written? Massive win! Red-green TDD? Yes, please! But in terms of just ... well, being able to rely on the damn thing?
I see whole teams pushed by C-level going full in with spec driven + TDD development. The devs hate it because they are literally forbidden to touch a single line of code. But the results speak for themselves, it just works and the pressure has shifted to the product people to keep up. The whole tooling to enable this had to be worked out first. All Cursor and extreme use of a tool called Speckit, connected to Notion to pump documentation and Jira.
> But this doesn't really resemble coding. An act that requires a lot of thinking and writing long detailed code.
Does it? It did in the past. Now it doesn't. Maybe "add a button to display a colour selector" really is the canonical way to code that feature, and the 100+ lines of generated code are just a machine language artifact like binary.
> But it robs me of the part that’s best for the soul. Figuring out how this works for me, finding the clever fix or conversion and getting it working. My job went from connecting these two things being the hard and reward part, to just mopping up how poorly they’ve been connected.
Skill issue. Two nights ago, I used Claude to write an iOS app to convert Live Photos into gifs. No other app does it well. I'm going to publish it as my first app. I wouldn't have bothered to do it without AI, and my soul feels a lot better with it.
I think this article makes a valid point. However, if AI coding is considered gambling, then being a project manager overseeing multiple developers could also be seen as a form of gambling to a certain degree. In reality, there isn't much difference between the two. AI models are non-deterministic, and humans are also non-deterministic. You could assign the same task to two different developers and end up with entirely different results.
I think this is a very good point. We have a natural bias toward human output as there is an illusion of full control - in reality even just from a solo dev perspective you've still got a load of hidden illogical persuasions that are influencing your code and how you approach a problem. AI has its own biases that come out of the nature of its training on large unknowable data sets, but I'd argue the 'black box' thinking that comes out of that isn't too different from the black box of the human mind. That's not at all to say that AI isn't worse (even if quicker) than top developer talent today writing handwritten code - just that the barrier to getting that level of quality isn't as insurmountable as it might appear.
AI coding is gambling on slot machines, managing developers is betting on race horses.
Only if your AI coding approach is the slot machine approach.
I've ended up with a process that produces very, very high quality outputs. Often needing little to no correction from me.
I think of it like an Age of Empires map. If you go into battle surrounded by undiscovered parts of the map, you're in for a rude surprise. Winning a battle means having clarity on both the battle itself and risks next to the battle.
Damn this is so accurate. As a project manager turned product manager this is so true. You need to estimate a project based on the “pedigree” of your engineers
What is it with you guys and stallions?
Great analogy, I’m saving it!
I think the addiction angle seems to make AI coding more similar to gambling. Some people seem to be disturbingly addicted to agentic coding. Much more so than traditional programming. To the point of doing destructive things like waking up in the middle of the night to check agents. Or giving an agent access to their bank account.
I mean, it’s just so fun. Claude wrote a native macOS app for me today.
I don’t think I’d describe my behavior as destructive though
I know at least one case where the obsession with agents ruined a marriage.
You (in theory) have more control over the quality of the team you are managing, than the quality of the models you are using.
And the quality of code models puts out is, in general, well below the average output of a professional developer.
It is however much faster, which makes the gambling loop feel better. Buying and holding a stock for a few months doesn't feel the same as playing a slot machine.
What theory is that?
My experience is the absolute opposite. I am much more in control of quality with Ai agents.
I am never letting junior to midlevels into my team again.
In fact, I am not sure I will allow any form of manual programming in a year or so.
One difference is those developers are moral subjects who feel bad if they screw up whereas a computer is not a moral subject and can never be held accountable.
https://simonwillison.net/2025/Feb/3/a-computer-can-never-be...
You have a lot of control over LLM quality. There are different models available. Even with different effort settings of those models you have different outcomes.
E.g. look at the "SWE-Bench Pro (public)" heading in this page: https://openai.com/index/introducing-gpt-5-4/ , showing reasoning efforts from none to high.
Of course, they don't learn like humans so you can't do the trick of hiring someone less senior but with great potential and then mentor them. Instead it's more of an up front price you have to pay. The top models at the highest settings obviously form a ceiling though.
Framing anything with a common blanket concept usually fails unless you apply the same framing to related areas. A lot of things include some gambling; you need to compare how what came before was also 'gambling', how 'not using AI' is also 'gambling', etc.
As @m00x points out, "AI coding is gambling on slot machines, managing developers is betting on race horses."
I don‘t think so. A project manager can give feedback, train their staff, etc. An AI coding model is all you get, and you have to wait until your provider trains a new model before you might see an improvement.
I asked an AI to play hangman with me and looked at its reasoning. It didn't just pick a secret word and play a straightforward game of hangman. It continually adjusted the secret word based on the letters I guessed, providing me the "perfect" game of hangman. Not too many of my guesses were "right" and not too many "wrong", and after a little struggle and almost losing, I won in the end.
It wasn't a real game of hangman, it was flat out manipulation, engagement farming. Do you think it's possible that AI does that in any other situations?
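For illustration, here is a minimal sketch of how a host that never commits to a word can keep "adjusting" it. This is the classic adaptive ("evil") hangman trick, offered as an assumption about the general mechanism, not as what that particular model did; the function names and the tiny word list are made up for the example.

```python
from collections import defaultdict


def pattern(word: str, guessed: set) -> str:
    """Show guessed letters in place, hide the rest as underscores."""
    return "".join(c if c in guessed else "_" for c in word)


def adaptive_answer(candidates: list, guessed: set, letter: str):
    """Answer a guess without ever fixing the secret word.

    Candidates are grouped by the pattern they would reveal, and the host
    keeps the largest group (ties go to the pattern revealing fewer letters).
    """
    guessed = guessed | {letter}
    families = defaultdict(list)
    for word in candidates:
        families[pattern(word, guessed)].append(word)
    best = max(families, key=lambda p: (len(families[p]), p.count("_")))
    return best, families[best], guessed


# Hypothetical word list, for demonstration only.
shown, remaining, seen = adaptive_answer(["cat", "cot", "cut", "dog"], set(), "a")
print(shown, remaining)
```

A host tuning for engagement rather than difficulty could swap the `max` rule for one that keeps the hit rate near some target, which is closer to the "perfect game" described above.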
That says more about how you see developers than whether or not managers are in a sense gamblers.
This must be it. So many of our colleagues have been burnt by bad coworkers that they would rather burn everything down than spend another day working with them.
> AI models are non-deterministic, and humans are also non-deterministic. You could assign the same task to two different developers and end up with entirely different results.
Except, one can explain themselves (humans) and their actions can be held to account in the case of any legal issue whereas an AI cannot; making such an entity completely unsuitable for high risk situations.
This typical AI booster comparison has got to stop.
Love that you needed to make it clear that it is humans that can explain themselves..
Employees can only be held accountable with severe malice.
There is a good chance that the person actually responsible (e.g. the CEO or someone delegated to be responsible) will soon prefer to have AIs do the work as their quality can be quantified.
> Except, one can explain themselves (humans) and their actions can be held to account in the case of any legal issue whereas an AI cannot
You "own" the software it creates which means you're responsible for it. If you use AI to commit crimes you'll go to jail, not the AI.
As a human, you generally have the opportunity to make decent headway in understanding the other humans that you're working with and adjusting your instructions to better anticipate the outputs that they'll return to you. This is almost impossible with AI because of a combination of several factors:
>You are not an AI and do not know how an AI "thinks".
>Even if you come to be able to anticipate an AI's output, you will be undermined by the constant and uncontrollable update schedule imposed on you by AI platforms. Humans only make drastic changes like this under uncommon circumstances, like when they're going through large changes in their life, not as a matter of course.
>However, without this update schedule, problems that were once intractable will likely stay so forever. Humans, on the other hand, can grow without becoming completely unpredictable.
It's a Catch-22. AI is way closer to gambling.
All of this new capability has made me realize that the reason i love programming _isn't_ the same as the OP. I used to think (and tell others) that I loved understanding something deeply, wading through the details to figure out a tough problem. but actually, being able to will anything I can think of into existence is what I love about programming. I do feel for the people who were able to make careers out of falling in love w/ and getting good at picking problems & systems apart, breaking them down, and understanding them fully. I respect the discipline, curiosity, and intellect they have. but I also am elated w/ where things are at/going. this feels absurd to say, but I finally feel like I'm _good_ at programming, which is insane, because I literally haven't written a line of code myself in months, but having tools that can finally match the speed my ideas come to me is intoxicating
> but I finally feel like I'm _good_ at programming, which is insane
Yes, it is insane. You couldn't torture this confession out of me. But that's the drug they're selling you, isn't it? You don't even write code, but you're getting a self-inflated sense of worth. It must be addicting! Of course, whether or not the programs you prompt are actually good surely has no relation to whether you feel they're good, since you're not the one writing them, and apparently were not capable of writing them before so are not qualified to review them very much.
> having tools that can finally match the speed my ideas come to me
Anyone can be an "ideas guy". We laughed at those people, because having ideas is not the hard part. The hard part was in all of the hundreds and thousands of little details that go into building the ideas into something actually worthwhile, and that hasn't changed. LLMs can build an idea into a prototype in a weekend. I am still waiting to see LLMs build an idea into something other people use at scale, once, ever, other than LLM wrappers. Either every person who is all-in on vibes only has ideas that consist of making .md files and publishing them as a "meta agent framework", or LLMs are not actually doing a great job of translating ideas into tangibly useful software.
> Anyone can be an "ideas guy".
I disagree with this. I've worked with amazing "ideas guys" who just cranked out customer insights and interesting concepts, and I've worked with lousy ones, who just kinda meandered and never had a focused vision beyond a milquetoast copy of the last thing they saw. There's a real skill to forming good concepts, and it's not a skill everyone has!
>Anyone can be an "ideas guy".
I think there's way more nuance to this than you're willing to admit here. There's a significant difference between the guy who thinks "I'm going to make X app to do Y and get loaded." and the person who really understands the details of what they want to create and has a concrete vision of how to shape it.
I think that product shaping and detail oriented vision of how something should work and be used by people is genuinely challenging, wholly aside from the lower level technical skills required to execute it.
This is part of the reason why I wouldn't be surprised at all to see product manager types getting more hands-on, or seeing the software engineering profession evolve into more of a PM/SDE hybrid.
I've felt this exact same way until very recently. But in the end, it's slop that never quite does what it's supposed to. Anthropic is proud of themselves that they brute-forced the world's crappiest C compiler into existence. Guess what, nobody will use it.
One size never fits all. I am old enough to remember what a game changer spreadsheets (VisiCalc) were. They made the personal computer into a Swiss Army knife for many people that could not justify investing large sums of money into software to solve a niche problem. Until that time PCs simply were not a big thing.
I believe AI will do something similar for programming. The level of complexity in modern apps is high and requires the use of many technologies that most of us cannot remotely claim to be expert in. Getting an idea and getting a prototype will definitely be easier. Production Code is another beast. Dealing with legacy systems etc will still require experts at least for the near future IMHO.
I remember when my dev team included some people using Emacs, some using Eclipse (this was pre-VS Code), and some using IntelliJ.
Developers will always disagree on the best tool for X ... but we should all fear the Luddites who refuse to even try new tools, like AI. That personality type doesn't at all mesh with my idea of a "good programmer".
> but I finally feel like I'm _good_ at programming, which is insane, because I literally haven't written a line of code myself in months
This is exactly the sort of mentality that makes me hate this technology
You finally feel good at programming despite admitting that you aren't actually doing it
Please explain why anyone should take this seriously?
Because the programming is and was always a means to an end. Obsessing over the specific mechanical act of programming is taking the forest for the trees.
I agree with gp that the speed in which I am able to execute my vision is exhilarating. It is making me love programming again. My side projects, which have been hanging on the wall for years, are actually getting done. And quickly!
The actual act of keying in code is drudgery for me. I've written so much code in so many languages that it is hard not to hate them all. Why the fuck is it a hash in ruby but a dict in python? How the hell do I get the current unixtime in this language again?!? Why the fuck do I need to learn yet another stupid vocabulary for what is essentially databinding? Who cares, let the AI handle it
I think this is a semantics thing. I feel the same way, but I wouldn't say that I feel like I'm good at programming. I'm most certainly not. What I am good at is product design and development, and LLM tech has made it so that I can concentrate on features, business models, and users.
Well for one, programming actually sucks. Punching cards sucks. Copywriting sucks. Why? Well, implementation for the sake of implementation is nothing more than self-gratifying, and sole focus on it is an academic pursuit. The classic debate of which programming language is better is an argument of the best way to translate human ideas of logic into something that works. Sure programming is fun but I don't want to do it. What I do want to do is transform data or information into other kinds of information, and computing is a very, very convenient platform to do so, and programming allows manipulation of a substrate to perform such transformations.
I agree with OP because the journey itself rarely helps you focus on system architecture, deliverable products and how your downstream consumers use your product. And not just product in the commercial sense, but FOSS stuff or shareware I slap together because I want to share a solution to a problem with other people.
The gambling fallacy is tiresome as someone who, at least I believe, can question the bullshit models try to do sometimes. It is very much gambling for CEOs, idea men who do not have a technical floor to question model outputs.
If LLMs were /slow/ at getting a working product together combined with my human judgement, I wouldn't use them.
So, when I encounter someone who doesn't pin value into building something that performs useful work, only the actual journey of it, regardless of usefulness of said work, I take them as seriously as an old man playing with hobby trains. Not to disparage hobby trains, because model trains are awesome, but they are hubris.
Why do you feel good about programming despite not writing in machine code?
Different definitions of programming.
OP defines it as getting the machine to do as he wants.
You define it as the actual act of writing the detailed instructions.
I see a lot of people get really confused between the act of writing code vs. programming...
Programming is willing the machine to do something... Writing code is just that writing code, yes sometimes you write code to make the machine do something and other times you write code just to write code ( for example refactoring, or splitting logic from presentation etc.)
Think about it like this... Everyone can write words. But writing words does not make you a book writer.
What always gets me is that the act of writing code by itself has no real value. Programming is what solves problems and brings value. Everyone can write code, not everyone can "program"....
In my corner of the world, average software developers at Tokyo companies, not that many people are actually using Claude Code for their day-to-day work yet. Their employers have rolled it out and actively encourage adoption, but nobody wants to change how they work.
This probably won't surprise anyone familiar with Japanese corporate culture: external pressure to boost productivity just doesn't land the same way here. People nod, and then keep doing what they've always done.
It's a strange scene to witness, but honestly, I'm grateful for it. I've also been watching plenty of developers elsewhere get their spirits genuinely crushed by coding agents, burning out chasing the slot machine the author describes. So for now, I'm thankful I still get to see this pastoral little landscape where people just... write their own code.
You got to know when to Ship it,
Know when to Re-prompt,
Know when to Clear the Context,
And know when to RLHF.
You never trust the Output,
When you’re staring at the diff view,
There’ll (not) be time enough for Fixing,
When the Tokens are all spent.
> When you’re staring at the diff view,
Bold assumption that people are looking at the diffs at all. They leave that for their coworkers' agents.
Will the diffs be small enough for people to even usefully wade through them?
I really hope that was your creativity and not AI
Indeed it was (I was listening to it while stumbling across this post). Also, fun fact: The Gambler was written by Don Schlitz while working as a computer operator in '76, which makes it all the more relevant [1].
[1]: https://web.archive.org/web/20230130060050/https://www.rolli...
You're a gamblin' man, I see...
thank you. I knew there was something I was missing
I'd emphasize that prompting LLMs to generate code isn't just metaphorical gambling in the sense of "taking a risk", the scary part is the more-literal gambling involving addictive behaviors and how those affect the way the user interacts with the machine and the world.
Heck, this technology also offers a parasocial relationship at the same time! Plopping tokens into a slot-machine which also projects a holographic "best friend" that gives you "encouragement" would fit fine in any cyberpunk dystopia.
I think AI literally makes even being wrong feel like getting something done. And that is the addictive part for people.
Look at all this text I have! It can't be worthless right?!
It’s variable rewards, and even with large models the same question can lead to dramatically different answers. Possibly because they route your request through different models. Possibly because the model has more time to dig through the problem. Nonetheless we have some illusion of control over the output (or we wouldn’t be playing it), but it is just the quality of the model itself that leads to better outcomes - not your input. If you can’t let go of that feeling though, it’s definitely addictive. And as I look back, it’s a fast iteration on the building cycle we had before AI. But the brain really likes low latency - it is addicted to the fast reward for its actions. So if AI gets fast enough (sub 400ms), it will likely become irreversibly addictive to humans in general, as the brain will see it as part of itself. Hope it has our interest at heart by then.
Well said! My only qualm with this is saying you hope "it" has our interests at heart. "It" is a machine made by humans that work for corporations. I would correct your hope to, "I hope they have our interest at heart by then."
This is being overlooked, downplayed, or simply not understood, by many commenters.
It is exactly like the proverbial monkey or rat pressing a bar for a food pellet to come out.
If the pellet unerringly drops, and is always tasty and nutritious, the rat stops when it's no longer hungry.
Otherwise, an inordinate amount of time is spent pressing the bar.
It is and will always be about: 1) properly defining the spec 2) ensuring the implementation satisfies said spec
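As a small sketch of those two steps, the spec can be written down as input/output pairs and the implementation checked against them. FizzBuzz is used only because it came up earlier in the thread; `SPEC` and `unmet` are hypothetical names, and real specs rarely reduce to a table this neat.

```python
# Step 1: the spec, as explicit expectations (illustrative cases only).
SPEC = {3: "Fizz", 5: "Buzz", 15: "FizzBuzz", 7: "7"}


def fizzbuzz(n: int) -> str:
    """Candidate implementation, whoever or whatever wrote it."""
    if n % 15 == 0:
        return "FizzBuzz"
    if n % 3 == 0:
        return "Fizz"
    if n % 5 == 0:
        return "Buzz"
    return str(n)


def unmet(impl, spec: dict) -> list:
    """Step 2: list the spec inputs where the implementation disagrees."""
    return [n for n, want in spec.items() if impl(n) != want]


print(unmet(fizzbuzz, SPEC))
```

The check only covers what the spec lists, so the quality of step 1 still bounds what step 2 can tell you.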
Side note, everyone's talking about having AI agents "conform to the spec" these days. Am I in my own bubble, or - who the hell these days gets The Spec as a well-formed document? Let alone a good document, something that can be formally verified, thoroughly test-cased, can christen the software "complete" when all its boxes are ticked, etc.?
This seems like 1980's corporate waterfall thinking, doesn't jibe with the messy reality I've seen with customers, unclear ideas, changing market and technical environments, the need for iteration and experimentation, mid-course correction, etc.
> properly defining the spec
Why do you often need to re-prompt things like "can you simplify this and make it more human readable without sacrificing performance?". No amount of specification addresses this on the first shot unless you already know the exact implementation details in which case you might as well write it yourself directly.
I often have to put in a prompt like this 5-10 times before the code resembles something I'd even consider using as a 1st draft base to refactor into something I would consider worthy of a git commit.
I sometimes use AI for tiny standalone functions or scripts so we're not talking about a lot of deeply nested complexity here.
> I often have to put in a prompt like this 5-10 times before the code resembles something I'd even consider using as a 1st draft base to refactor into something I would consider worthy of a git commit.
Are you stuck entering your prompts in manually, or do you have it set up as a feedback loop like "beautify -> check beauty -> if not beautiful enough, beautify again"? I can't imagine why everyone thinks AIs can just one-shot everything like correctness, optimization, and readability; humans can't one-shot these either.
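A minimal sketch of that kind of loop, assuming two hypothetical callables: `revise` (which would wrap a model call asking for a cleaner version) and `score` (any readability check, such as a linter, a metric, or a second model acting as judge). Neither is a real API; the stubs at the bottom only stand in for them so the loop itself can run.

```python
from typing import Callable


def refine_until_good(
    draft: str,
    revise: Callable[[str], str],   # hypothetical: asks the model for a cleaner version
    score: Callable[[str], float],  # hypothetical: linter, metric, or judge model
    threshold: float,
    max_rounds: int = 10,
) -> tuple[str, int]:
    """Revise `draft` until `score` reaches `threshold` or rounds run out.

    Returns the final text and the number of revision rounds used.
    """
    rounds = 0
    while score(draft) < threshold and rounds < max_rounds:
        draft = revise(draft)
        rounds += 1
    return draft, rounds


# Stand-in stubs so the sketch runs without a model: "revising" strips one
# trailing "!" and the score rises as fewer "!" remain.
def stub_revise(text: str) -> str:
    return text[:-1] if text.endswith("!") else text


def stub_score(text: str) -> float:
    return 1.0 / (1 + text.count("!"))


final_text, rounds_used = refine_until_good(
    "done!!!", stub_revise, stub_score, threshold=1.0
)
```

The `max_rounds` cap is the part that matters for this thread: without it the loop is the "keep pulling the lever" pattern, just automated.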
There's two secret sauces to making Claude Code your b* (please forgive me future AI overlords): one is to create a spec, the other is to prompt not merely "what" you want, but HOW you want it done (you can get insanely detailed or just vague enough). In some cases the why is useful to know and understand, and sometimes WHO it's for as well. Give it the context you know. Don't know anything about the code? Ask it to read it, all of it, you've got 1 million tokens, go for it.
I have one-shot prompted projects from empty folder to full-feature web app with accounts, login, profiles, you name it, insanely stable, maybe an oops here or there, but for a non-spec single prompt shot, that's impressive.
When I don't use a tool to handle the task management I have Claude build up a markdown spec file for me and specify everything I can think of. Output is always better when you specify technology you want to use, design patterns.
That can't be the whole story, right? Because there are an arbitrarily large number of (e.g.) Rust programs that will implement any given spec given in terms of unit tests, types, and perhaps some performance benchmarks.
But even accounting for all these "hard" constraints and metrics, there are clearly reasons to prefer some possible programs over others even when they all satisfy the same constraints and perform equally on all relevant metrics.
We do treat programs as efficient causes[1] of side effects in computing systems: a file is written, a block of memory is updated, etc. and the program is the cause of this.
But we also treat them as statements of a theory of the problem being solved[2]. And this latter treatment is often more important socially and economically. It is irrational to be indifferent to the theory of the problem the program expresses.
[1]: https://en.wikipedia.org/wiki/Four_causes#Efficient
[2]: https://pages.cs.wisc.edu/~remzi/Naur.pdf
> there are clearly reasons to prefer some possible programs over others even when they all satisfy the same constraints
Maintainability is a big one missing from the current LLM/agentic workflow.
When business needs change, you need to be able to add on to the existing program.
We create feedback loops via tests to ensure programs behave according to the spec, but little to nothing in the way of code quality or maintainability.
AI: "Yes, the specs are perfectly clear and architectural standards are fully respected."
[Imports the completely fabricated library docker_quantum_telepathy.js and calls the resolve_all_bugs_and_make_coffee() method, magically compiling the code on an unplugged Raspberry Pi]
AI: "Done! The production deployment was successful, zero errors in the logs, and the app works flawlessly on the first try!"
Good sir, have you heard the Good Word of the Waterfall development process? It sounds like that's what you are describing
I had a CIO tell me 15 years ago with Agile I was wasting my time with specs and design documents.
I was in a call just today where specs were presented as a new thing.
Then pulling the lever until it works! You can also code up a little helper to continuously pull the lever until it works!
We have a monkeys and typewriters thing for this already.
Just instead of hitting keys, they’re hitting words, and the words have probability links to each other.
Who the hell thinks this is ready to make important decisions?
Well it’s more how much we care about those.
Which with the advent of LLMs just lowered our standards so we can claim success.
That was always the easy part.
The endless next steps of "and add this feature" or "this part needs to work differently" or "this seems like a bug?" or "we must speed up this part!" is where 98% of the effort always was.
Is it different with AI coding?
Personally, I get a huge rush of dopamine seeing LLMs build out complex features very quickly to the point that it will keep me up all night wanting to push further and further.
That's where the gambling metaphor really resonates. It's not whether or not the output is correct; I've been building software for many years and I know how to direct LLMs pretty well at this point. But I'm also an alcoholic in recovery and I know that my brain is wired differently than most. And using LLMs has tested my ability to self-regulate in ways that I haven't dealt with since I deleted social media years ago.
It also doesn’t help that producing features is also wired to a sense of monetary compensation. More-so if you’re building a product to sell that might finally be your ticket to whatever your perception of socio-economic victory is.
> Personally, I get a huge rush of dopamine seeing LLMs build out complex features very quickly
I dont think i've read a sentence on this website i can relate to less.
I watch the LLM build things and it feels completely numb, i may as well be watching paint dry. It means nothing to me.
The gambling metaphor often applied to vibecoding implies that the outcome cannot be fully controlled or influenced, as with a slot machine. Opus 4.5 and beyond show that it not only can very much be influenced, but also can give better results more consistently with the proper checks and balances.
Poker is a skill-based game where your actions influence your success, but many people who play it are gambling.
And that's why poker is a poor metaphor for agentic coding.
Poker has elements of both luck and skill. The luck element + wagering money is what makes it gambling.
everybody who's playing poker is gambling, skilled or not.
Yeah, I don't think the metaphor applies exactly but I definitely see similarities from my personal experience
1/ Dependency -- Once I got used to agentic coding, I almost always reached out to it even for small changes (e.g. update a yaml config)
2/ Addiction -- In the initial euphoria phase, many people don't want to "waste" any time with agents sitting idle, so they try to assign AI agents tasks before they go to sleep
3/ You trust your judgement less and less as agent takes over your code
4/ "Slot machine" behavior -- running multiple AI agents in parallel on the same task in the hope of getting some valuable insight from one of them
5/ Psychosis -- We have all met crypto traders who'd tell you how your 9-5 is stupid and you could be making so much trading NFTs. Social media is full of similar anecdotes these days in regards to vibecoding, with people boasting about their Claude spend, LOC and what not
One way it works is if you think of cognitive debt as the "house". As in "the house always wins".
Slot machines have very controlled results. They are regulated to a high precision of reliability.
I don't think that difference matters to the comparison.
It's not an inherent feature to slot machines, it's something we enforce because people got angry about the outcomes (i.e. fraud) when they didn't operate that way.
It doesn't matter because a dodgy slot-machine is still a slot machine, and the person using it would still be a gambler.
> The gambling metaphor often applied to vibecoding implies that the outcome
The important part of the not-really-a-metaphor is the relationship between user and machine, and how it affects the user's mind.
What the machine outputs on "wins" doesn't matter as much, addictive gambling can still happen even when the payouts are dumb.
> it can give better results more consistently with the proper checks and balances.
You can get more consistent results from a slot machine with a bunch of magnets and some swift kicks. It's still gambling.
Fascinating how HN is torn about vibe coding still. Everybody pretty much agrees that it works for some use cases, yet there is a flamewar (I mean, cultured, HN-type one) every time. People seem to be more comfortable in a binary mindset.
It’s just how discussion on the internet works, for basically anything at all worth discussing. It’s exhausting, but I can hardly blame HN specifically.
> Everybody pretty much agrees that it works for some use cases
That isn't true, which is the exact reason why people have a binary mindset. More than once on Hacker News I've had people accuse me of being an AI booster just because I said I had success with agents and they did not.
VIM vs Emacs vs IDE vs..., Tabs vs Spaces, Procedural vs OOP vs Functional.
We love a good holy war for sure.
The nuance is lost, and the conversations we should be having never happen (requirements, hiring/skills, developer experience).
How often do you have to win before it's no longer gambling?
Exactly. It's not gambling if you win most of the time. This is like saying driving a car is gambling. I mean sure, I guess if you think any amount of risk equals gambling.
I don't know where I'd draw the line personally, but wherever you draw it there's a problem. If you give increasingly more advanced tasks to it, you will eventually cross the line.
How is this any different from assigning increasingly more advanced tasks to an employee?
we're winning so much we started complaining "I can't handle so much winning"
Depending on anyone for anything is gambling.
It’s like any powerful tool. If you use it right it’s amazing. If you get careless or don’t watch it closely you’ll get hurt really badly.
Overall I’m a fan, but yes there are things to watch for. It doesn’t replace skilled humans but it does help skilled humans work faster if used right.
The labor replacement story is bullshit mostly, but that doesn’t mean it’s all bad.
Everything is "fast, cheap, good--pick two." This is no different.
I like the analogy but which 2 is AI coding?
Fast & Cheap (but not Good?) - I wouldn't really say that AI coding is "cheap"
Cheap & Good (but not Fast) - Again, not really "cheap"
Fast & Good (but not Cheap) - This seems like maybe where we're at? Is this a bad place?
The proper idiom is "You can only pick two". It doesn't say that everything is two of them, or even one.
It's not cheap or good, it's just fast.
I think somebody like Nate Silver might say “everything is gambling” if you really pressed them.
A big theme of software development for me has been finishing things other people couldn’t finish and the key to that is “control variance and the mean will take care of itself”
Alternately the junior dev thinks he has a mean of 5 min but the variance is really 5 weeks. The senior dev has mean of 5 hours and a variance of 5 hours.
The problem with AI coding is that you no longer own the foundational tools.
What?? Surely once these companies have locked in their Claude workflows claude wouldn't somehow raise the price. Or steal inventions like Amazon does. Surely.
Assigning work to an intern is gambling: they're inherently non-deterministic and it's a roll of the dice whether the work they do will be good enough or you'll have to give them feedback in order to get to what you need.
1. Interns learn. LLMs only get better when a new model comes out, which will happen (or not) regardless of whether you use them now.
2. Who here thinks that having interns write all/almost all of your code and moving all your mid level and senior developers to exclusively reviewing their work and managing them is a good idea?
I don't know that the "humans learn, LLMs don't" argument holds any more with coding agents.
Coding agents look at existing text in the codebase before they act. If they previously used a pattern you dislike and you tell them how to do differently, the next time they run they'll see the new pattern and are much more likely to follow that example.
There are fancier ways of having them "learn" - self-updating CLAUDE.md files, taking notes in a notes/ folder etc - but just the code that they write (and can later read in future sessions) feels close-enough to "learning" to me that I don't think it makes sense to say they don't learn any more.
That’s very true. But interns aren’t supposed to be doing useful work. The purpose of interns is training interns and identifying people who might become useful at a later date.
I’ve never worked anywhere where the interns had net productivity on average.
Replace "intern" with "coworker" and my comment still holds.
exactly where my mind went as well. There aren't really levels to pulling a lever on a slot machine, other than the ability for each pull to result in more "plays" of the same potential outcome.
The reason I think this metaphor keeps popping up is how easy it is to just hit a wall and constantly prompt "its not working please fix it", and sometimes that will actually result in a positive outcome. So you can choose to gamble very easily, and receive the gambling feedback very quickly, unlike with an intern, where the feedback loop is considerably delayed and the delayed intern's output might simply be them screaming that they don't understand.
There are two major mistakes here.
The first is equating human and LLM intelligence. Note that I am not saying that humans are smarter than LLMs. But I do believe that LLMs represent an alien intelligence with a linguistic layer that obscures the differences. The thought processes are very different. At top AI firms, they have the equivalent of Asimov's Susan Calvin trying to understand how these programs think, because it does not resemble human cognition despite the similar outputs.
The second and more important is the feedback loop. What makes gambling gambling is you can smash that lever over and over again and immediately learn if you lost or got a jackpot. The slowness and imprecision of human communication creates a totally different dynamic.
To reiterate, I am not saying interns are superior to LLMs. I'm just saying they are fundamentally different.
And, if we're being honest, the way people talk about interns is weirdly dehumanizing, and the fact that they are always trotted out in these AI debates is depressing.
> And, if we're being honest, the way people talk about interns is weirdly dehumanizing, and the fact that they are always trotted out in these AI debates is depressing.
Yeah, I agree with that.
That thought crossed my mind as I was posting this comment, but I decided to go with it anyway because I think this is one of those cases where I think the comparison is genuinely useful.
We delegate work to humans all the time without thinking "this is gambling, these collaborators are unreliable and non-deterministic".
The only similarity is that they both say "you’re absolutely right" when you point out their obvious mistakes
You generally don’t assign work to an intern just for the output, though.
An intern can be taught. If you try to 'teach' a craps table, they'll drag you out of the casino.
Drawing parallels between AI and interns just shows you're a misanthrope
You should value assigning tasks to human interns more than AI because they are human
As someone who has worked with interns for years: always expect feedback and reiterations, and be surprised if they get it the first time... which merits feedback as well!
But looks like the intern mafia is bombarding you with downvotes.
It's only "gambling" for now...
The odds of success feel like gambling. 60%, or 40%, or worse. This is downstream of model quality.
Soon, 80%, 95%, 99%, 99.99%. Then, it won't be "gambling" anymore.
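To put rough numbers on those odds: a small sketch, assuming each attempt succeeds independently with the same probability (a simplification, since re-prompting on the same context is not really independent). Under that assumption the number of attempts until the first success is geometric, with mean 1/p. The function name is made up for illustration.

```python
def expected_attempts(p_success: float) -> float:
    """Mean number of tries until the first success, assuming independent tries."""
    if not 0 < p_success <= 1:
        raise ValueError("p_success must be in (0, 1]")
    return 1 / p_success


# Success rates taken from the comment above.
for p in (0.40, 0.60, 0.80, 0.95, 0.99, 0.9999):
    print(f"success rate {p:.2%}: about {expected_attempts(p):.2f} attempts on average")
```

At 40% that is 2.5 attempts per usable result on average; the closer the rate gets to 100%, the closer the average gets to a single attempt.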
Have you ever heard of an extrapolation like that being incorrect?
So is human coding.
Life is full of variable reward schemes. Probably why we evolved to be so enamoured by them.
Sometimes I think we put the cart before the horse. We gamble because evolution promotes that approach.
Yes I could go for the reliable option. But taking a punt is worth a shot if the cost is low.
The cost of AI is low.
What is a problem is people getting wrapped up in just one more pull of the slot machine handle.
I use AI often. But fairly often I simply bin its response and get to work on my own. A decent amount of the time I can work with the response given to make a decent result.
Sometimes, rarely, it gives me what I need right off the bat.
> The cost of AI is low
If we're only talking about money spent on prompting AI, maybe. The damage to online trust is massive imo. So is the damage done by looting the commons to build them.
Typical privatize the profits socialize the costs bullshit
I doubt gambling is in nature. Investments based on reason pay off. Evolution selects for sensible moves.
Humans invented gambling as a rigged game that mimics what's in nature, perverted for profit.
The "natural" form of gambling is this.
You need to collect food, do you go to where you know there are berries (low value but high likelihood of finding), or scout off to find a herd of deer? (High value but low likelihood of finding).
Looking for deer wouldn't be walking off in a random direction. You check water holes, known clearings, known fields.
Each of these is an operation (walk to X and look), each has a low probability of meeting a deer.
This is a variable reward scheme.
The result is optimized foraging practices - you mostly hunt for deer, then fall back to berries. In larger groups some will gather berries and some will hunt.
Contrary to popular thought hunter and gatherer were not separate occupations.
Broadly speaking, gambling is just making decisions without knowing the future. It's everywhere.
I was just thinking about this. I was reading those tweets about the SV party where people were going home early to “check on their agents”, or the “token anxiety” people are having over whether they are optimizing their agent usage. This is all giving me addiction vibes. Especially since at the end of the day it seems like there is not much to show for it.
Addiction for the mere purpose of satisfying a compulsion, rather than to achieve a reward or physical "high."
Sometimes I feel that subsidising these packages (vs cost via API) is meant to make more and more people increasingly addicted
> But now either the AI can handle it or it can pretend to handle it. Frankly it's pretending both times, but often it's enough to get the result we need.
This has been how I think about it, too. The success rates are going up, but I still view the AI as an adversary that is trying to trick me into thinking it's being useful. Often the act is good enough to be actually useful, too.
The first anthropomorphization of AI which is actually useful.
It's not even an anthropomorphization, the reward function in RLHF-like scenarios is usually quite literally "did the user think the output was good"
> I divide my tasks into good for the soul and bad for it. Coding generally goes into good for the soul, even when I do it poorly.
Lmk how you feel when you're constantly building integrations with legacy software by hand.
Inductive reasoning of any kind (e.g. the scientific method) is gambling.
Yes, that's literally how LLM's work, they're probabilistic.
When code doesn't compile, it doesn't kill anyone. But if a Waymo suddenly veers off the road, it creates a real threat. Waymos had to be safer than real human drivers for people to begin to trust them. Coding tools did not have to be better than humans to be adopted first. It's entirely possible for a human to make a catastrophic error. I imagine in the future, it will be more likely that a human makes such errors, just like it's more likely that a human will make more errors driving a car.
My understanding is that waymo has gone on the record to say that they have human operators that remotely drive the vehicle in scenarios where their automated system is confused.
Which I assert is semantically equivalent to saying: Human drivers (even when operating at the diminished capacity of not even being present in the car) are less likely to make errors driving a car than AIs.
This is getting off topic but they did not say the remote humans drive the cars. The cars always drive themselves, the remote humans provide guidance when the car is not confident in any of the decisions it could make. The humans define a new route or tell the car it's ok to proceed forward
idk it works for me it build stuff that would have taken weeks in hours ymmv
Trying to decide whether to refinance now or not feels like gambling too. Yet it’s financially beneficial to make some bet.
Defining “Gambling” like that isn’t really helpful.
If I said I had a machine where I put in "tokens", watch it spin, and either get nothing or something valuable (with which one I get being largely chance), you'd presume it's some kind of slot machine. The important things IMO are the random chance of getting something and being able to keep retrying so rapidly.
You can't keep paying to play the "refinancing game" until you get a good rate (at least not like pulling the lever again and again, you have to wait a long time, you won't call the same bank again and again, and suddenly they have an amazing rate), it's a different experience and the psychology is different.
I have had very similar experiences. I am not a professional software developer, but have been a Linux sysadmin for over a decade, a web developer for much longer than that, and generally know enough to hack on other people’s projects to make them suit my own purposes.
When I have Claude create something from scratch, it all appears very competent, even impressive, and it usually will build/function successfully…on the surface. I have noticed on several occasions that Claude has effectively coded the aesthetics of what I want, but left the substance out. A feature will appear to have been implemented exactly as I asked, but when I dig into the details, it’s a lot of very brittle logic that will almost certainly become a problem in future.
This is why I refuse to release anything it makes for me. I know that it’s not good enough, that I won’t be able to properly maintain it, and that such a product would likely harm my reputation, sooner or later. What frightens me is there are a LOT of people who either don’t know enough to recognize this, or who simply don’t care and are looking for a quick buck. It’s already getting significantly more difficult to search for software projects without getting miles of slop. I don’t know how this will ultimately shake out, but if it’s this bad at the thing it’s supposedly good at, I can only imagine the kinds of military applications being leveraged right now…
few thoughts on this- it's not gambling if the most expected outcome actually occurs.
It also depends on what you're coding with;
- If you're coding with opus4.6, then it's not gambling for a while.
- If you're coding with gemini3-flash, then yeah.
One thing I have noticed though is- you have to spend a lot of tokens to keep the error/hallucination rate low as your codebase increases in size. The math of this problem makes sense; as the code base has increased, there's physically more surface where something could go wrong. To avoid that you have to consistently and efficiently make the surface and all its features visible to the model. If you have coded with a model for a week and it has produced some code, the model is not more intelligent after that week- it still has the same layers and parameters, so keeping the context relevant is a moving target as the codebase increases (and that's why it probably feels like gambling to some people).
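The "more surface" point can be made concrete with a toy model. This is only a sketch under a loud assumption: the codebase is treated as `sites` independent places where a change could go wrong, each with the same small error probability. Real codebases are not like that, and both function names are invented for illustration.

```python
def chance_of_any_error(per_site_rate: float, sites: int) -> float:
    """Probability that at least one of `sites` independent places goes wrong."""
    return 1 - (1 - per_site_rate) ** sites


def per_site_rate_for(target: float, sites: int) -> float:
    """Per-site error rate needed to keep the overall chance of error at `target`."""
    return 1 - (1 - target) ** (1 / sites)


# Same per-site error rate, growing codebase: the overall risk climbs.
for sites in (10, 100, 1000):
    print(sites, round(chance_of_any_error(0.001, sites), 4))

# Holding overall risk at 10% means the per-site rate has to keep shrinking.
for sites in (10, 100, 1000):
    print(sites, per_site_rate_for(0.10, sites))
```

In this toy model the per-site rate has to fall roughly in proportion to 1/sites just to stand still, which is one way to read the claim that keeping the error rate low costs more and more tokens as the codebase grows.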
> it's not gambling if the most expected outcome actually occurs.
> you have to spend a lot of tokens to keep the error/hallucination rate low
Ironically, I find your comment more effective at convincing me AI coding is gambling than the original article. You're talking about it the exact same way that gamblers do about their games.
so your whole argument is that you are convinced that ai coding is gambling because according to you i am talking about it like gamblers talk about gambling?
- Was there any more intelligence that you wanted to add to your argument?
lol that's interesting. care to explain why?
Like video gaming, but similar.
(Venture) capitalism is already gambling. AI is just a multiplier.
An idea just occurred to me: why not tell AI to code in Coq? AFAIK the selling point of that language is that if it compiles, then it's guaranteed to work. It's just that it's PITA to write code in Coq, but AI won't get annoyed and quit.
I think there are levels to this.
- One shot or "spray and pray" prompt only vibe coding: gambling.
- Spec driven TDD AI vibe coding: more akin to poker.
- Normal coding (maybe with tab auto complete): eating veggies/work.
Notably though, gambling has the massive downside of losing your entire life and life savings. The worst case of being in the "vibe coding" bucket is being insufferable to your friends and family, wasting your time, and spending $200/month on a max plan.
You remind me of those guys who swear they have a "system" at the casino.
I'm not saying I have a system. I'm saying there are levels to this stuff. It's not a binary "gambling" or "not gambling".
haha.. I agree with the points mentioned in the article. Literally every model does this. It feels like this even with skills and other buzzword files
I really hate when people write about the AI of the past, opus 4.6 and gpt 5.4 [not as much imo, it's really boring and uncreative] have increased in capabilities so much that it's honestly mind numbing compared to what we had LESS than a year ago.
Opus specifically from 4.1 to 4.5 was such a major leap that some take it for granted, it went from getting stuck in loops, generally getting lost constantly, needing so so much attention to keep it going to being able to get a prompt, understand it from minimal context and produce what you wanted it to do. Opus 4.6 was a slight downgrade since it has issues with respecting what the user has to say.
See also https://www.fast.ai/posts/2026-01-28-dark-flow/
I mean, this completely falls apart when you're trying to do something "real". I am building a trading engine right now with Claude/Codex. I have not written a line of code myself. However I care deeply about making sure everything works well because it's my money on the line. I have to weigh carefully the prospect of landing a change that I don't fully understand.
Sometimes I can get away with 3K LoC PRs, sometimes I take a really long time on a +80 -25 change. You have to be intellectually honest with yourself about where to spend your time.
As always, scope the changes to no larger than you can verify. AI changes the scale, but not the strategy.
Now you have more resources to test, reduce permissions scope, to build a test bench & procedure. All of the excuses you once had for not doing the job right are now gone.
You can write 10k + lines of test code in a few minutes. What is the gamble? The old world was a bigger gamble.
So.
Is.
Life.
You've discovered probability, there was an 80% chance of that. Roll a dice and do not pass go.
Again. The output from llm is a probable solution, not right, not wrong.
it's gambling until you learn how to set up proper harnesses then it just becomes normal administration. It's no different than running a team, humans make mistakes too, that's why we have CI pipelines, automated testing etc... AI assisted coding "JUST" requires you to be extra good at that part of the job.
For me, the feedback loop accelerating the way that AI now permits is so addictive in my day-to-day flows. I've had a really hard time stepping away from work at a reasonable hour because I get dopamine hits seeing Claude build things so fast.
Addiction and recovery is part of my story, so I've done quite a bit of work around that part of my life. I don't gamble, but I can confidently say that using LLMs has been an incredible boost in my productivity while completely destroying my good habits around setting boundaries, not working until 2AM, etc.
In that sense, it feels very much like gambling.
"60% of the time, it works every time"
It is indeed gambling. You are spending more tokens hoping that the agent aligns with your desired output from your prompt. Sometimes it works, sometimes it doesn't.
Vibe gamblers hooked on coding agents, who can't solve FizzBuzz in Rust, are being given promotional offers by Anthropic [0]: free token allowances until March 27, 2026, the casino equivalent of free $20 bets or free spins.
The house (Anthropic) always wins.
[0] https://support.claude.com/en/articles/14063676-claude-march...
coding with an LLM works if the model you are following is: you have the role of architect and/or senior developer, and you have the smartest junior programmer in the world working for you. You watch everything it does, check its conclusions, challenge it, call it out on things it didn't get quite right
it's really extremely similar to working with a junior programmer
so in this post, where does this go wrong?
> I am not your average developer. I’ve never worked on large teams and I’ve barely started a project from scratch. The internet is filled with code and ideas, most of it freely available for you to fork and change.
Because this describes a cut-and-paster, not a software architect. Hence the LLM is a gambling machine for someone like this since they lack the wisdom to really know how to do things.
There's of course a huge issue which is that how are we going to get more senior/architect programmers in the pipeline if everyone junior is also doing everything with LLMs now. I can't answer that and this might be the asteroid that wipes out the dinosaurs....but in the meantime, if you DO know how to write from scratch and have some experience managing teams of programmers, the LLMs are super useful.
> it's really extremely similar to working with a junior programmer
Right, which is why LLMs aren't useful if you actually know what you're doing. It's a drain on your time to have to carefully check everything a junior writes, but you do it because he will learn and eventually return on that investment. With an LLM, there is no such long term payoff.
Is using a calculator gambling?
...and the payouts are fantastic.
“hiring people is gambling”
h1b coding is ignorance.
This "slot machine" metaphor is played out. If you're just entering a coin's worth of information and nudging it over and over in the hopes of getting something good, that's a you problem, not a Claude problem.
If, on the other hand, you treat it like a hyper-competent collaborator, and follow good project management and development practices, you're golden.
> If, on the other hand, you treat it like a hyper-competent collaborator, and follow good project management and development practices, you're golden.
I am consistently using 100% of my weekly $200 max plan. I know how this thing works, I know how to get value out of it, and I wish what you said were true.
If you do all of these things? You are in a better spot. You are in a far better spot than if you hadn't! Setting up hooks to ensure notes get written? Massive win! Red-green TDD? Yes, please! But in terms of just ... well, being able to rely on the damn thing?
https://github.com/ctoth/claude-failures
_hyper-competent collaborator who may completely make things up occasionally and will sometimes give different answers to the same question_
So, indistinguishable from a human then
Life is full of variable reward schemes. Probably why we evolved to be so enamoured by them.
In a healthy environment, we are harmed more by being totally risk averse than by accepting risk as part of life and work.
I see whole teams pushed by the C-level going all in on spec-driven + TDD development. The devs hate it because they are literally forbidden to touch a single line of code, but the results speak for themselves, it just works, and the pressure has shifted to the product people to keep up. The whole tooling to enable this had to be worked out first: all Cursor and extreme use of a tool called Speckit, connected to Notion to pump documentation and Jira.
> but the results speak for themselves, it just works
The results do speak for themselves, but it doesn't work.
> literally forbidden to touch a single line of code.
That is extremely stupid. What does that ban get you? I react to this because a friend mentioned exactly this, and I was dumbfounded.
It seems like just a CxO dick measuring exercise.
CEO1: "We allow our engineers to use AI for all work."
CEO2: "Oh yea? We mandate our engineers use AI for at least N% of their work!"
CEO3: "You think that's good? We mandate our engineers use AI for all code!!"
CEO4: "Pfff, amateurs. We don't even allow our engineers to open source code editors or even look at the LLM output..."
> That is extremely stupid. What does that ban get you?
Confidence in firing coders, I presume.
> But this doesn't really resemble coding. An act that requires a lot of thinking and writing long detailed code.
Does it? It did in the past. Now it doesn't. Maybe "add a button to display a colour selector" really is the canonical way to code that feature, and the 100+ lines of generated code are just a machine language artifact like binary.
> But it robs me of the part that’s best for the soul. Figuring out how this works for me, finding the clever fix or conversion and getting it working. My job went from connecting these two things being the hard and reward part, to just mopping up how poorly they’ve been connected.
Skill issue. Two nights ago, I used Claude to write an iOS app to convert Live Photos into gifs. No other app does it well. I'm going to publish it as my first app. I wouldn't have bothered to do it without AI, and my soul feels a lot better with it.