Comment by dkobia
6 hours ago
Zitron is begging for a collapse at this point. Yes, his macro analysis correctly identifies a massive financial risk but his incessant pessimism completely misses the incredible ground-level utility that many of us on HN celebrate every day through undeniable, massive productivity gains.
At this point I'm trying to believe there's a middle ground where the level of individual capability this unlocks leads to major discoveries.
> undeniable, massive productivity gains.
Take any stock index, remove AI stocks, what do you see? That's right! Nothing...
So where is all the productivity going? Where is the value? Where are the massive unemployment stats or the millions of new startups making big $$$?
Writing about AI, destroying the planet for data centers, there's a lot of money to be made.
That being said, AI seems kind of miraculous sometimes.
Similar to cars. So enticing that we make everything else in the world worse in order to maximize the profit, make it indispensable, subsidize it, and make the dependency on it irreversible.
And it's not even something to blame individual people for.
Driving away from all the other cars to spend a weekend feels like freedom.
Using AI to answer a question feels like a "bicycle for the mind".
But in fact it's more like a car. It requires massive resources and creates perverse incentives, and the result is ineffective and corrupt.
Both cars and AI are amazing technology and extremely useful, but using them is not an individual responsibility. It requires societal subsidy.
The environmental impact of answering a question on an obscure topic with an AI model is less than the impact of answering the question with an hour-long Google search hunting for references or a drive to the public library.
Vonnegut said in his last living work that the greatest addiction modern people face is the drug of cheap oil.
We got addicted to the convenience and overuse, and have started a mass extinction event because of it.
The perverse incentives will come for us all.
I agree with your message but I'm not sure about the conclusion. Cars themselves are a commodified luxury available (in the US pretty much required) to everyone, and they do need to be subsidized, both in terms of infrastructure and the lifestyle they require.
But with AI what is the exact price? My understanding is that R&D is extremely expensive, but running non-SOTA models is not that bad. We are getting pretty close to models which can be useful locally in many applications.
Or do you mean that at scale running them locally is not possible and hence the infrastructure price is in data centers, which will be expensive to maintain and scale for demand?
The original point of the stock market was to fund gigantic society-level projects (like railroads). Modern VC has replaced some of that at smaller scales but not all of it at the largest scales. So this could just be the stock market performing the function it was designed to perform -- helping fund something transformative on a societal level.
> Take any stock index, remove AI stocks, what do you see? That's right! Nothing...
Where did all the stock gains go before AI?
FAANG / MAG-7.
Was everything from 2012-2020 fake, too?
They went from ~9% of the sp500 to ~35% over your timeframe...
Not sure what your point is. Stock markets are based on money going into securities based on estimated future value. Even if AI were doubling productivity at a non-AI company, there is more leverage to that money going into an AI company.
The question is, is AI leading to massive productivity gains in companies that implement it? AI productivity gains take time to diffuse, but so far companies in the S&P 500 are seeing very high growth. YOY earnings growth rate for the S&P 500 is 21.7% https://advantage.factset.com/hubfs/Website/Resources%20Sect...
> YOY earnings growth rate for the S&P 500 is 21.7%
Now remove the companies selling the AI shovels: https://pbs.twimg.com/media/HIAjbZxacAARHwD.png
> Not sure what your point is.
My point is that they're selling us Skynet and the end of employment as we know it, things that we shouldn't even have to measure to perceive the results of, yet no one is able to measure any of it
Pointing a finger at nvidia, google, and the other few companies stuck in circular investment schemes that shouldn't even be legal and saying "OOGA BOOGA line go UP, UP GOOD!" doesn't count in my book
He has also consistently demonstrated, at least to me, that he doesn't really understand how inference works from a technical perspective, which weakens much of his core thesis for why there should be a collapse.
I do value having some naysayers in the mix generally, because we do need balanced critique in what is otherwise a very frothy hype cycle. I just don't think he's making sound arguments, and that's assuming you even agree with his premises in the first place.
My biggest gripe with his napkin math is that he treats inference gross margins as something novel that you can't compare to normal SaaS margins. He's right in part: the constant carousel of R&D costs from model training, related infrastructure buildout, and other adjacent costs required to stay competitive do change the analysis a bit.
But he takes this way too far when he says this is structurally different from normal SaaS margins. The business model definitely doesn't look like Dropbox, but it absolutely looks a lot like AWS, especially early AWS, CDNs, telecom, etc. I can speak to the telecom bit personally, since it's been over half of my professional career as an engineer and, in this specific case, also as a founder. You can have a brutally capital-intensive infra business where profitability depends on utilization, oversubscription, peak-capacity planning, segmentation, and recovering capex over time.
The math he presents gets even more questionable as we see explicit segmentation happening for cost-saving reasons. Many forward-thinking orgs are waking up to the fact that they don't need to use the best, most expensive model for every task. They can route easier tasks to cheaper models, use caching, batch non-urgent workloads, and reserve frontier models for the subset of work that actually needs frontier intelligence. That directly undermines his claim that providers always need to chase frontier intelligence in order to maintain current demand, utilization, and pricing curves.
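To make the routing argument concrete, here is a minimal Python sketch of cost-aware model selection. Everything in it is an assumption for illustration: the tier names, the per-token prices, the difficulty score, and the thresholds are invented and do not describe any real provider.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Tier:
    name: str
    usd_per_million_tokens: float  # hypothetical price, not a real quote


# Hypothetical tiers, cheapest first.
SMALL = Tier("small", 0.50)
MID = Tier("mid", 3.00)
FRONTIER = Tier("frontier", 15.00)


def route(difficulty: float) -> Tier:
    """Pick the cheapest tier assumed adequate for a task.

    `difficulty` is a made-up score in [0, 1]; how to estimate it
    is outside the scope of this sketch.
    """
    if not 0.0 <= difficulty <= 1.0:
        raise ValueError("difficulty must be in [0, 1]")
    if difficulty < 0.5:
        return SMALL
    if difficulty < 0.85:
        return MID
    return FRONTIER


def cost_usd(tasks: list[tuple[float, int]], always_frontier: bool = False) -> float:
    """Total cost for a list of (difficulty, tokens) pairs."""
    total = 0.0
    for difficulty, tokens in tasks:
        tier = FRONTIER if always_frontier else route(difficulty)
        total += tokens / 1_000_000 * tier.usd_per_million_tokens
    return total


# One easy, one medium, one hard task, one million tokens each.
tasks = [(0.2, 1_000_000), (0.6, 1_000_000), (0.9, 1_000_000)]
print(f"routed:          ${cost_usd(tasks):.2f}")
print(f"always frontier: ${cost_usd(tasks, always_frontier=True):.2f}")
```

Under these invented prices the routed bill is well under half of the always-frontier bill. Whether that is good or bad for a provider's revenue is a separate question that the sketch does not settle.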
I think he doesn't need to understand the technology to point out the books are cooked. A business can sink either way: the technology flops or the finances flop. He's arguing the /finances/ would flop. He doesn't argue that the /technology/ would flop, only that they can't come up with the money to pay their creditors.
There is a piece of this I agree with. That you do not need to be a deep technical expert to notice that a company is burning cash by overcommitting to capex, or relying on heroic revenue projections that may or may not come to pass.
But that is not the full argument he is making. If the claim is that the labs will not be able to pay their creditors because inference is structurally incapable of becoming profitable, then he absolutely needs to be right about the technical economics of inference.
One part of that is the balance-sheet argument (which already shows insanely good margins). But it also depends on how inference-time compute actually works: routing, batching, KV cache reuse, model segmentation, different latency tiers, etc. On many of those details he's just been straight up wrong in his writing, so as a result I have to call into question the rest of his reasoning as well (in part to avoid Gell-Mann amnesia).
> That directly undermines his claim that providers always need to chase frontier intelligence in order to maintain current demand, utilization, and pricing curves.
But does it also not mean that they will make less money given that there is already brutal competition for that lower tier from openrouter, Deepseek, Amazon, etc.?
You can't on the one hand say "customers are beginning to understand they can spend less" and on the other hand suggest that this is good for forecasts of revenue.
> that he doesn't really understand how inference works from a technical perspective
Could you share what tells about it? I.e. where he was wrong about it?
There are examples both in his writing and in his appearances on podcasts, interviews, etc.
I'll cherry pick a couple:
“When these new models ‘reason,’ they break a user’s input and break into component parts, then run inference on each one of those parts.” [1]
This is not at all how test-time compute works. At best, this is a very loose metaphor that he may have used out of convenience. This might sound a bit pedantic to point out, but this is a very basic thing that he's getting wrong (presumably at least, again it could be that he just used a poor metaphor).
A less pedantic example would be his claims related to gpt-5/chatgpt auto-routing. He argued that having a router means OpenAI can no longer cache static prompts, because the user prompt has to come before the hidden instructions [2]. This is just not at all how this works at inference-time. There is no evidence that the standard approach of system>developer>user instruction hierarchy has changed, the public API and caching docs maintain this.
But even more broadly, it suggests he is reasoning about kv/prefix caching at the wrong level of abstraction. It's true that conventional prefix caching does require a stable prefix, so yes, if you literally put variable user content before the static prompt, you would destroy the cacheability of that static prompt.
But that is exactly why inference systems are designed to preserve reusable prefixes where possible (via checkpointing or similar), and why serving systems care so much about prefix caching. This is also a big part of how disaggregated prefill/decode infra works where cache-aware routing is critical. His argument treats a bad prompt layout as if it were a necessary consequence of routing, rather than an avoidable implementation choice.
A router can read the user request, decide which model path to use, and then construct a normal downstream model call with stable static instructions first and user content later. Treating that as impossible implies a fundamental architectural misunderstanding.
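A toy Python sketch of that flow, under loud assumptions: the routing rule, the instruction text, and the `PrefixCache` class are all hypothetical, and a hash of the static text stands in for the token-prefix attention state that a real serving system would reuse. It only shows that routing first and building a static-first prompt second leaves the prefix identical across users.

```python
import hashlib

# Placeholder text standing in for hidden/static instructions.
STATIC_INSTRUCTIONS = "You are a helpful assistant. Follow the policy."


def pick_model(user_text: str) -> str:
    """Toy router (hypothetical rule): long requests go to the larger model."""
    return "large" if len(user_text) > 200 else "small"


def build_prompt(user_text: str) -> list[str]:
    """Static instructions first, variable user content last."""
    return [STATIC_INSTRUCTIONS, user_text]


def prefix_key(model: str, prompt: list[str]) -> str:
    """Key over the model name and every segment except the final user segment."""
    h = hashlib.sha256()
    h.update(model.encode())
    for segment in prompt[:-1]:
        h.update(b"\x00")
        h.update(segment.encode())
    return h.hexdigest()


class PrefixCache:
    """Records which prefixes were seen; a stand-in for real KV reuse."""

    def __init__(self) -> None:
        self._seen: set[str] = set()
        self.hits = 0
        self.misses = 0

    def lookup(self, key: str) -> bool:
        if key in self._seen:
            self.hits += 1
            return True
        self._seen.add(key)
        self.misses += 1
        return False


def handle(cache: PrefixCache, user_text: str) -> str:
    model = pick_model(user_text)      # the router reads the user request first
    prompt = build_prompt(user_text)   # the downstream call is still static-first
    cache.lookup(prefix_key(model, prompt))
    return model


cache = PrefixCache()
handle(cache, "hi")
handle(cache, "what is 2 + 2?")
print(cache.hits, cache.misses)
```

The second request hits the cache even though its user text differs, because the router's decision happens before prompt construction and does not change the order of segments.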
[1] https://www.wheresyoured.at/how-to-argue-with-an-ai-booster/
[2] https://www.wheresyoured.at/how-does-gpt-5-work/
Productivity is not value. It's quite possible for you to experience productivity improvements, and actual value to not be created. That is what I think the most robust data is showing.
https://unessays.substack.com/p/talk-is-cheap
From an economic perspective, productivity is defined as the creation of value, isn't it? So if you "improve productivity" and do not create value in the end, you're not improving productivity at all.
It does depend on how you define productivity. But the way it's commonly used is "I'm going faster, personally, with these tools."
The thing people I think have a hard time seeing is that "I go faster" does not mean "more features get finished".
It's a scale issue, and one scale is better than the other. People only pay for finished features, they do not pay for how much code you emit.
Productivity is defined as revenue per worker hour. And we know worker hours are going down as there are fewer workers with the layoffs.
Also, supposed productivity gains are dubious. I personally experience at best no productivity gains when using LLMs to write code, and sometimes it's an active drain on my productivity. There was that one study a year or so ago showing similar results. People are trying to say the productivity gains are there and undeniable, but that is not true. It is very much a subject of controversy whether AI helps productivity.
I can see an argument that the productivity gains are illusory / don’t translate to economic productivity. I’m not denying the possibility.
However, most of the engineers I respect have gone from being skeptics a year ago to convinced today. I don’t personally know any true holdouts any more. If there are studies that disprove productivity gains more than six months ago, I’m happy to believe that it was true of the AIs that were available at the time. But I’m going to need something much more recent before I disbelieve my lyin’ eyes where it pertains to the AIs available today.
That's possible, sure. But I think the answer is more likely in the numbers, not in just qualitatively saying AI isn't worth anything. Like if I pay $30k for an ounce of gold, I got value. Gold is worth something. But that amount of gold wasn't worth what I spent.
EDIT: In fact, parent comment has a link to some numbers.
[EDIT: Most] people don't want to go through the numbers. Ok. But there's a history here. When people don't want to see the numbers, certain kinds of things tend to happen.
I've posted numbers that indicate that productivity is becoming decoupled from value delivery. If you follow the link in my comment it reviews a pretty robust study of 4000 teams over 2 years. There is no product throughput increase.
>undeniable, massive productivity gains.
How can something so undeniable have zero scientific evidence? Are there any large peer reviewed or meta studies confirming your claim?
It’s a very hard experiment to run. You have a population that’s already “treated”. You can’t blind them to the fact that they’re using AI tools. It’s hard to imagine a study that wouldn’t have serious flaws that people would then use to dismiss and form their own conclusions. Sure you have METR but that was very low n with a very old model.
I think the surest sign of productivity gains is the sheer volume of adoption. If you look beyond headlines, adoption is just incredible. Of course adoption does not necessarily point to productivity gains, but if this was some sort of FOMO or smoke and mirrors you would not see this much retention and this feverish a pace of adoption. You would not see a large segment of the profession using coding agents exclusively. All of these companies track productivity, again with imperfect proxies, yet everything points to a pretty consistent picture. Same with benchmarks, again a lot of crappy benchmarks but a lot of high quality ones too and a very diverse collection of tasks and capabilities they probe.
Your second paragraph appears to be 3 different instances of saying "X does not necessarily point to productivity gains... but in the case of AI, X definitely means productivity" without really saying why that is true or why other explanations do not fit.
Adoption meaning productivity supposes there are no other dominant factors for the AI push nor AI retention. It is possible for practices to be picked up or continued in spite of causing productivity DROPS. What studies have suggested are factors that make for productive work environments and what is actually enforced in the workplace are different things.
Because even in a field like software engineering where the output of our work is saved in version control, measuring baseline productivity is hard.
LoC: people argue it’s not what’s important
PRs/day: same as LoC
Getting projects done faster: oh but what about the quality.
Solve the technical problems and actually be more productive, and the social systems built around the old way of doing things will hold you back.
Finishing a PR in 10 minutes doesn’t matter if you’re waiting days for a human review.
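The review-wait point is simple arithmetic, and it has the same shape as Amdahl's law: speeding up one stage helps only in proportion to that stage's share of the total. A small Python sketch, with made-up numbers (4 hours of coding cut to 10 minutes, a fixed 48-hour wait for review):

```python
def cycle_time_hours(coding_hours: float, review_wait_hours: float) -> float:
    """End-to-end time for one PR: write it, then wait for review."""
    return coding_hours + review_wait_hours


def end_to_end_speedup(coding_before: float, coding_after: float,
                       review_wait: float) -> float:
    """How much faster the whole PR lands when only coding gets faster."""
    before = cycle_time_hours(coding_before, review_wait)
    after = cycle_time_hours(coding_after, review_wait)
    return before / after


# Hypothetical numbers, chosen only to illustrate the shape of the problem.
coding_before = 4.0       # hours
coding_after = 10 / 60    # ten minutes, in hours
review_wait = 48.0        # two days, in hours

speedup = end_to_end_speedup(coding_before, coding_after, review_wait)
print(f"coding step: {coding_before / coding_after:.0f}x faster")
print(f"whole PR:    {speedup:.2f}x faster")
```

With these assumed inputs the coding step gets about 24x faster while the PR as a whole lands only about 8% sooner.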
He’s been continuously predicting that the collapse was just around the corner, that progress was slowing, and that there was no market for inference, since 2024.
The fact he’s never reflected on the glaring failures in his analysis tells us what we need to know about his intellectual integrity. There’s truth in some of his words about financial risk, but if you can’t acknowledge that there’s upside too, you can’t evaluate risk properly either.
I find it difficult to take him seriously.
Progress is slowing, in an important way.
Have a muck about with what Qwen 3.6 or Gemma 4 can do and you'll see. I mean this as an illustration but Qwen just isn't as far behind as I expected, and compared to the data centre hardware it will run on a potato.
The frontier models are losing their undeniable edge over that which is unmetered.
And even putting aside my optimism for the smaller open weights models, there's a huge amount of scope for the larger, hosted open weights models that are only just behind the cutting edge and which cost, what, 1/25th of the price on opencode go, openrouter etc.
Commodification is coming, and with it slimmer profit margins; it's hard to see them making anywhere near the kind of money they need to in a commodified market.
> progress was slowing
Do you think it's not slowing? Do I miss anything really important?
My understanding is that what we have now is incremental improvement on thinking models which appeared more than a year ago. Of course, a breakthrough might happen, but I don't see one yet.
The most important thing I would point to is Mythos et al. and the wave of vulnerabilities that have been discovered in the past couple of months. It’s a completely unprecedented event, brought forth almost entirely by improvements in the models themselves.
That said, keep in mind, I’m talking about the past two years. With Claude Code and the capabilities gained since December of last year, there have been incredible gains in the capabilities that are now available. Demand for inference is higher now than it was a year ago, because capability has improved. A specific criticism that I would hold is that claiming that progress with LLMs is slowing, prior to that point, is embarrassingly wrong in my view.
One could argue that the model capability improvements are slowing, and all the improvements were in harnesses. I think that’s a stronger argument, but I have a few problems with it.
1. Utility is utility. Whether that comes from the model or the harness is irrelevant when making claims about utility. I don’t think that’s a useful distinction most of the time, but especially when talking about the technology as a whole.
2. Marginal intelligence gain is different from marginal utility gain. It’s estimated that intelligence grows logarithmically relative to investment. However, the utility of a marginally more intelligent model may grow exponentially, because once behavior crosses a reliability threshold, it unlocks new capabilities.
3. Even on those terms, it’s not clear to me that frontier capabilities are slowing down. With Mythos and its contemporaries, we have been seeing a vast change in the security industry as vulnerabilities are discovered at an unprecedented rate. OpenBSD vulnerabilities, more Firefox vulnerabilities found in a single month than in the past two years, critical Linux vulnerabilities. It’s hard for me to look at the effects there, radical new capabilities baked into the model itself, and see stagnation.
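One way to read the reliability-threshold point is through compounding: if a task needs many steps to all succeed, small gains in per-step reliability produce large gains in whole-task success. The Python sketch below is one interpretation, not the commenter's own model. It assumes steps fail independently, and the reliabilities and the 20-step task length are invented.

```python
def task_success_rate(per_step_reliability: float, steps: int) -> float:
    """Chance that every step of a multi-step task succeeds.

    Assumes steps fail independently, which is a simplification.
    """
    if not 0.0 <= per_step_reliability <= 1.0:
        raise ValueError("per_step_reliability must be in [0, 1]")
    if steps < 0:
        raise ValueError("steps must be non-negative")
    return per_step_reliability ** steps


# Hypothetical per-step reliabilities for a 20-step agentic task.
for p in (0.90, 0.95, 0.99):
    print(f"per-step {p:.2f} -> 20-step task {task_success_rate(p, 20):.2f}")
```

Under these assumptions, moving per-step reliability from 0.90 to 0.99 takes a 20-step task from roughly a one-in-eight success rate to roughly four-in-five, which is the kind of jump that can turn a demo into something usable unattended.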
A part of the reason it might feel like it’s slowing down is because we plebs don’t have access to the top models.
> He’s been continuously predicting that the collapse was just around the corner, that progress was slowing, and that there was no market for inference, since 2024.
Old WSB saying: The market can remain irrational for (far) longer than one can remain solvent.
And unfortunately, a lot of the market on the "buyer" side has been acting irrationally. When you see CEOs telling their employees that they don't care about token cost, only about "how much AI do you use" because that is what the stock market wants to hear - that's when you know we're all getting cooked, the question is how long it takes until the bubble bursts.
anyone that takes him seriously at this point... I don't want to say very bad words here...
I do not disagree with what you are saying, but I honestly still believe that most of the utility we experience is gonna become very boring very soon, to the point that we can just run it locally... Even if it's a bit slower, who cares; it can just run in the background while you work on other stuff yourself, read up on things, review other work...
It's not that the utility of it is put in question. What is, however, a giant question mark is how the heck any of the big AI companies are ever gonna get that ROI, given how many of us are becoming more and more fine with local models that run just fine, especially on a good enough computer, which most developers have anyway...
Even more dangerous to the big 2 AI companies is the fact that the 20 different Chinese companies are catching up fast and for a lot lower cost.
Why should someone pick Opus 4.8 when Qwen3.7 Plus produces similar results for about 1/20th the cost?
That sort of pricing disparity is across the board. But further, it's becoming more and more apparent that they are doing more with fewer parameters. That's what's giving the local models their superpowers.
Because it doesn't. Not for the tasks where using Opus instead of a lower tier model is appropriate, at any rate. Benchmarks show this, as do revealed preferences of actual users. To believe that Qwen is as capable as Opus at 1/20 the cost you have to believe that every person who does not make the choice to use Qwen over Opus for a given task is some mix of ignorant or delusional. This is certainly an opinion you can hold about other engineers, but it's definitely a questionable one at best.
They are absolutely deniable. Huge swathes of people deny them.
He has recently made the very good point that actually, the FAANG companies are struggling to put any ROI numbers on that incredible ground-level utility.
Uber, for example, is so unclear there is any ROI, they are cutting their exposure pretty radically.
He points out that one single Anthropic customer — a payments provider — accidentally had to pay Anthropic $500M for one month of token spend.
That is half what Apple is reportedly paying Google for the supply side of their entire consumer AI strategy.
Even if we assume that everything you said holds true, how is it that we as a crowd can make viable a service that eats some $300bn annually in infrastructure costs? Where would that money come from? Most tech companies these days are cutting their AI budgets because the per-token pricing is killing them.
Cite a real source for that last bit; I don’t think that is true. Also, the budgets should be cut: the spend at some places goes beyond any reasonable amount. The strategy there is to hook everything in and find the right processes, then cut the rest. Things then get better and better with each model release.
The way you make a viable service that eats 300bn annually is to have enough demand to service that. Anthropic underbought compute. That tells you something.
https://www.theverge.com/tech/930447/microsoft-claude-code-d...
https://finance.yahoo.com/sectors/technology/articles/ai-bin...
https://blog.pragmaticengineer.com/the-pulse-token-spend-bre...
> undeniable, massive productivity gains.
The jury is still out on that.
Yeah, they're very much deniable. Raw LOC/hr is much higher, and so is the speed of putting together an MVP, but I've yet to see any evidence that an LLM is capable of doing anything unsupervised, and if you need a human supervising everything it does... why bother having an LLM in the first place?
Because it can perform much faster? Monitoring allows you to multitask more effectively. I would also disagree that you can’t one shot anything…claims like this are weak and I have enough counter examples in my own life that it’s trivially false. The question is more: can it one shot the right things with a low enough failure rate for it to be a good replacement. It’s hard to figure that out a priori.
Agreed that he has an extreme POV (or more accurately that he trolls for views/subscriptions). But his central argument is valid: if AI underdelivers financially, this bubble will burst and this bubble is magnitudes larger than what we've seen before, so there could be very rough seas ahead.
The question is: what does "underdeliver" mean here? The pro-AI arguments I am seeing in this thread are equating mass adoption with agentic coding. Er, I don't know of any trillion-dollar-cap companies that sell dev tools. The point is Zitron doesn't have to be 100% right for his central prediction to come true.
I don’t get this. We already have insane demand. And yes, exactly, this is primarily just with coding agents, but are you aware of what’s coming down the pipeline? It’s not hard to be; you just have to find a decent way to keep up with the literature.
* robotics (need to close data gap and release first viable product to get a data flywheel)
* conversational ai (no one is ready for this and we’re getting closer and closer to natural speech. The quality still isn’t good enough but it’ll be soon).
* other agentic use cases, openclaw adoption was crazy and that had a ton of barriers to entry
* ai products, like the one OpenAI is working on with Jony Ive
Anyone thinking it’s unreasonable to hit whatever revenue requirements is just not that aware of what’s happening. Not to mention we’re capacity constrained already!! This is barely speculation at this point.
I don't think the issue with robotics is a data gap. maybe somewhat, but the real issues are that:
- RL is extraordinarily sample-inefficient.
- distribution shift/catastrophic forgetting aren't solved. only off-policy learning with giant decorrelated batches works.
- the breakout success of transformers as an architecture doesn't neatly translate to robot motion policy models.
the field is missing fundamental breakthroughs.
I also find it very interesting that conversational AI has taken this long. where are the models with good turn-taking? passive listening? the ability not to respond in paragraphs? has Anthropic simply not gotten around to it?
I quite like my mechanical spider from Wild Wild West and the coffee it makes with a 50% success rate
Every day people here debate whether or not there are any actual productivity gains from LLM, and it's only in the limited context of software development. While I understand that this place obviously skews heavily towards the software industry, the notion that LLMs are anywhere near as useful in other industries is hubristic (at best).
Perhaps they aren't, but not currently viable !== always unviable.
Is it really worth it to cause a global economic collapse and harm society's well-being to an unimaginable degree just to find out if it is viable?
Why can't it naturally grow and prove its worth?
Just 5 more years and $500 billion more, bro. We're still so early.
And?
> through undeniable, massive productivity gains.
And where are those? They seem particularly hard to actually observe and only appear in anecdotes.
> I'm trying to believe
For every exponential increase in compute capacity you see linear gains in output accuracy. This is a death spiral. Anyways, you see "massive productivity gains" so why is "belief" a function of your viewpoint?
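Taking the commenter's premise at face value (it is a claim from this thread, not an established result), "exponential compute for linear accuracy" means accuracy grows with the logarithm of compute. A short Python sketch of what that implies for cost; the baseline and the points-per-decade coefficient are invented for illustration.

```python
import math


def accuracy(compute: float, base: float = 50.0,
             points_per_decade: float = 5.0) -> float:
    """Hypothetical curve: accuracy rises linearly in log10(compute).

    `base` and `points_per_decade` are made-up coefficients.
    """
    if compute <= 0:
        raise ValueError("compute must be positive")
    return base + points_per_decade * math.log10(compute)


def compute_multiplier_for_gain(gain_points: float,
                                points_per_decade: float = 5.0) -> float:
    """How many times more compute the assumed curve needs for a given gain."""
    return 10 ** (gain_points / points_per_decade)


for gain in (5, 10, 15):
    factor = compute_multiplier_for_gain(gain)
    print(f"+{gain} points needs about {factor:,.0f}x the compute")
```

On this assumed curve each additional 5 points costs ten times as much compute as the previous 5. Whether real models follow such a curve, and whether linear accuracy gains map to linear value, are exactly the things the thread is disputing.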
I really like some good drama slop that reads like a thriller; it is entertaining. I don't take any of it THAT seriously, but lately, with the IPOs that are about to hit the indices, he has gained a lot of attention. If you look around the internet, most people publish a negative angle on something and then extrapolate it into some grand conspiracy, which is really captivating. It's crazy when you enter some echo chamber you never engage with (movies, gaming, art/comics) and they have their own headcanon for why the world is bad and collapsing. It puts your own echo chamber into perspective to see the same patterns of argumentation and presentation spin out in a different way.
> undeniable, massive productivity gains.
Just because you keep repeating something doesn't make it an undeniable truth.
Yes. Zitron has been predicting and begging for collapse since 2024. It's not just his brand at this point. It's his entire identity. As such, he cannot back down, he cannot question himself, and he cannot accept any other viewpoint. And he will keep moving his goal posts until something happens that can make him go "aha! I told you guys!!"
This, combined with his extreme ignorance, makes him unreadable. The only reason people read his stuff is because it validates and confirms their own anti-AI beliefs. It's why every time he publishes an article, it reaches the front page in an hour or less.
> This, combined with his extreme ignorance,
Extreme ignorance?
> undeniable, massive productivity gains
How are they undeniable? They're very deniable. One example is the (seemingly) increasing maintenance costs for AI-generated code[1]. Another is the cost incurred by everybody reading AI slop instead of actual communication.
I don't have hard data as to whether these cancel out the benefits, but it's not as rosy as some seem to think.
[1] After years of people understanding that LOC is not only a poor productivity metric but also a negative indicator of code quality (shorter code for the same thing is better), we now have people touting how many LOC their LLM agent is generating. It's like everyone forgot what LOC actually represents and what it means for long term maintenance costs.
> Zitron is begging for a collapse at this point
No, he's not; he's making tons of money every month from his Substack subscriptions. In fact, the AI bubble popping would be the worst thing ever for him: he would be out of a job.
Just like those who have predicted the US dollar will collapse any moment now and have pushed gold for decades.
Funny how people always say "oh, you are an AI lab, of course you are going to hype AI", but never "oh, you make sooo much money from predicting the collapse of the AI bubble..."
[flagged]
i don't think this comment contributes much to the discussion. can you elaborate more than saying "no"?