Comment by trjordan

1 day ago

They've got, ballpark, $5t to $10t to make back in the next 5 years, or the hardware buildouts will start getting written down.

This means we're going to need $1t+ per year in spending on tokens. 200m knowledge workers in the world, 30m developers. We're talking about a world where you need 5% of every knowledge worker's salary to go into tokens. 20% if you're a developer.
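
As a rough sanity check on that arithmetic, here is a minimal sketch. The $1t figure, the headcount and the 5% share come from the comment above; the implied salary is derived from them, not a sourced number, and it assumes the spend is spread evenly across all knowledge workers.

```python
# Back-of-envelope check of the token-spend figures in the comment.
ANNUAL_TOKEN_SPEND = 1e12   # $1t per year (from the comment)
KNOWLEDGE_WORKERS = 200e6   # 200m knowledge workers (from the comment)
SALARY_SHARE = 0.05         # 5% of salary going to tokens (from the comment)


def spend_per_worker(total_spend: float, workers: float) -> float:
    """Annual token spend per worker if spread evenly (an assumption)."""
    return total_spend / workers


def implied_salary(per_worker_spend: float, share: float) -> float:
    """Average salary at which `per_worker_spend` equals `share` of pay."""
    return per_worker_spend / share


per_worker = spend_per_worker(ANNUAL_TOKEN_SPEND, KNOWLEDGE_WORKERS)
salary = implied_salary(per_worker, SALARY_SHARE)
print(f"${per_worker:,.0f} per worker per year")
print(f"implied average salary: ${salary:,.0f}")
```

Under those assumptions the spend works out to $5,000 per worker per year, which only equals 5% of salary if the average knowledge worker earns about $100k.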

That's a _huge_ shift. Most people I know cite +20%-40% velocity with these tools, against the actual work their company cares about doing. +20% speed for +20% spend isn't going to motivate a trillion dollars a year in spending.

We're not there yet. This is still the upswing of the hype cycle, and unless we figure out how to make developers 2x, 5x, 10x as productive on stuff that matters, this isn't going to play out well.

Here are a few thoughts:

- The publicly available information about how inference costs compare to training costs is conflicted. EEs involved in datacenters talk about power usage spikes during training runs as if they were a major factor in the designs, but academic papers discussing cost-optimal scaling confidently treat inference-time compute as a major factor.

- One piece of evidence that training is more compute-intensive than inference, even after amortization: Chinese providers, constrained primarily by access to compute, offer nearly unlimited token availability at a lower price than US providers (inference), but poorer model capabilities (training). That would make sense only if US providers are inflating inference prices by 20-30x to cover amortized training costs that overseas providers were not able to take on (there are other factors too).

- If training >> inference, they're in a prisoner's dilemma that far exceeds the ordinary zero-marginals model of competition between firms (due to its huge discrete stepwise nature). On the other hand, if inference>>training, the high-level analysis popularized by certain thought leaders, that it's like a utility, would be true. You'd tend to count this as a vote for inference>>training, but the CEOs saying it at least have a huge incentive to agree because the alternative, the prisoner's dilemma, would stop investment very fast.

- The only voice in the story I just told you that had anything to do with fact (as opposed to high-level analysis and ivory tower armchair management of a secretive business) was the rumors from facilities engineers. That shows you the state of our understanding...

- If we don't even know the ratio between amortized capital expenses and operational costs, outside investor analysis is impossible. It doesn't matter how finely they divide the accounting buckets for office ferns and indoor ferns if the single biggest part of their business is obscured for trade secret reasons.

  • I'm about to leave a shallow comment, but I am a bit skeptical of the supposed drop in inference costs. If AI labs saw a lot of potential there, they'd surely be bragging about it non-stop? So the fact that publicly available information is conflicted is probably a sign that at the very least, the numbers aren't amazing.

    Yes I know there's no evidence and this is lazy reasoning. But there's probably a bit of truth to this line of thought.

    • Why on earth would AI labs be bragging about how little the product they sell actually costs them to make? You don't want to do anything that reduces its perceived value to the user, that might make them less willing to pay for it.

      Also, inference costs are bound to go way down with more optimized architectures. GPUs are fundamentally not great at inference. No platform where the weights are streamed from a large pool of memory is. If the models ever quiet down, there will be massive step changes in cost/token, energy/token and tokens/second, as models are etched into silicon ala https://chatjimmy.ai/

    • Inference has traditionally been far less expensive than training. One public example is the fact that hobbyists can run StableDiffusion ($600k training costs[1]) on their personal computers.

      Speaking to your point, inference being dramatically less costly than training would not be seen as a delta from the norm. The model of providing inference for anything near the operational costs (like a utility would) would be the delta from the norm if it were true.

      [1] https://x.com/emostaque/status/1563870674111832066

    • For equal capability tokens, there has been about a 10x drop in cost every 6 months.

      We are still chasing the best because the best is moving rapidly, but it’s a simple thought experiment to work out what the cost to serve an 8B model from 2 years ago is in a world of 2T models.

      Note: parameter counts are illustrative. Concretely, qwen3.6 27B delivers opus 4.5 capability at 1/27th the cost on openrouter. Single chip llama3 8b performance can exceed 17k tokens/sec.
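
      The thought experiment above can be sketched by compounding the claimed trend. This is only an illustration: the "10x every 6 months" rate is the comment's figure, not a measured one, and the function names are made up for this example.

```python
# Compounding a "10x cheaper every 6 months" trend for equal-capability tokens.
DROP_FACTOR = 10.0     # cost divides by this much per period (the comment's claim)
PERIOD_MONTHS = 6.0    # length of one period in months (the comment's claim)


def cost_reduction_factor(months: float) -> float:
    """How many times cheaper tokens would be after `months` on this trend."""
    return DROP_FACTOR ** (months / PERIOD_MONTHS)


def projected_cost(initial_cost: float, months: float) -> float:
    """Cost after `months`, starting from `initial_cost`, on this trend."""
    return initial_cost / cost_reduction_factor(months)


two_year_factor = cost_reduction_factor(24)
print(f"after 24 months: {two_year_factor:,.0f}x cheaper")
```

      If the trend held for two full years, the same capability would cost four orders of magnitude less, which is why the rate itself is the part worth arguing about.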

    • > I am a bit skeptical of the supposed drop in inference costs. If AI labs saw a lot of potential there, they'd surely be bragging about it non-stop?

      Unless to the grandparent commenter’s point they’re using it to obscure their large prisoner’s dilemma (training) cost?

    • > If AI labs saw a lot of potential there, they'd surely be bragging about it non-stop?

      Google seems to pretty regularly post about how their TPU and algorithm advancements have been decreasing energy costs for both inference and training.

  • Small alternative potential future changes that alter this analysis:

    * At some point model capability reaches diminishing returns. Then inference >> training in the future but training >> inference now. It’s not a prisoner’s dilemma but a land grab to solidify market position and be one of the 2-3 firms left standing as dominant in the space. The model companies aren’t super sticky yet but they’re working on it.

    * Even if training remains >> inference, it’s possible to have multiple price points like they do today. If you need the most capable model, you’ll be paying exponentially more per token to supplement the training cost, even though the serving cost is marginal, because most people will be satisfied with cheaper / less capable models for most tasks.

    I buy that inference is a dropping line item while training is a growing one. There are all sorts of things on the horizon that’ll be orders-of-magnitude improvements, from startups burning models into ASICs to get orders of magnitude more performance to alternate architectures like diffusion transformers that have orders of magnitude structural optimizations. It’s inevitable that it’ll come down even further from where we are. It’s possible model training also will go down but I’ve not seen any compelling research suggesting major “easy” reductions here.

    • The issue is that most tasks do not require frontier-level intelligence, but companies like OAI can really only profit off of the frontier. Capabilities from a year or two ago are so outdated that even OpenAI gives them away for free, and there are many other models nipping at their heels. In other words they are spending huge amounts of money to cash in on a depreciating asset.

      So one possible future is that frontier-level training becomes so expensive and the use cases so sparse that it simply isn’t viable to keep going bigger.

  • We have GPU costs, power costs, and how many tokens/s models can generate on those GPUs. It’s possible to figure out the marginal cost based on this. The current estimate is about $0.40 per million tokens for a GPT-4-equivalent model. Sonnet 4 is $15 per million tokens, so they are charging high margins on inference. The issue is how large a margin is needed to recover their costs before the GPUs age out, and how high a margin can be charged before it’s not economically viable.

    https://www.gpunex.com/blog/ai-inference-economics-2026/
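
    Taking the two figures in this comment at face value, the implied margin can be sketched as follows. Note the assumption baked in: the $0.40 cost estimate is for a GPT-4-equivalent model while the $15 price is Sonnet 4's, so this treats them as comparable, which may not hold.

```python
# Implied margin from the comment's two numbers (USD per million tokens).
MARGINAL_COST = 0.40   # estimated serving cost (from the comment)
LIST_PRICE = 15.00     # Sonnet 4 price (from the comment)


def gross_margin(price: float, cost: float) -> float:
    """Fraction of the price left over after the marginal serving cost."""
    return (price - cost) / price


def markup(price: float, cost: float) -> float:
    """How many times the marginal cost is being charged."""
    return price / cost


margin = gross_margin(LIST_PRICE, MARGINAL_COST)
multiple = markup(LIST_PRICE, MARGINAL_COST)
print(f"gross margin on inference: {margin:.1%}")
print(f"price is {multiple:.1f}x marginal cost")
```

    That is a gross margin on serving only; it says nothing yet about whether it covers training runs or hardware depreciation.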

    • That seems way off to me.

      I skimmed the article, but couldn’t spot any details on their estimates. They mention 70b+ params as being large in several places. But we’ve had several 100b+ param models that trail Sonnet.

  • Why would power spikes from training runs imply training>>inference? The cost of a training run scales with energy, whereas power is energy per unit time. All that tells you is that they're speeding up their training run so it will take less time overall (probably chasing some first-mover advantage, where they're out with a given model before their closest competitors), whereas they obviously can't do that for inference (which is a steady flow of requests over time).

  • I don't see how it would be possible for inference costs to dominate training costs, even after amortization.

    Training involves multiple passes over the entire training dataset, ideally in large batches where you can perform inference on as many samples as possible simultaneously and then perform backpropagation to adjust the model weights (which is about as expensive as inference).

    Let's consider the size of the dataset we're dealing with here. The dataset likely consists of practically every piece of digitized text they can get their hands on (including that extracted from audio and video). We know Google has digitized a large portion of the books in existence as part of their "search book contents" feature and we have no reason to believe they're not using it alongside their cache of 90+% of the internet to train their models. We're talking about 100s of millions of books each with an average of 100,000s of tokens. The internet has 10s to 100s of billions of pages on it with who knows how many tokens on average. This is a huge dataset that we've got to go through hundreds of times.

    Second, let's consider the effect of batching and how it sets requirements for our hardware. We know that larger batch sizes converge faster, are more stable, and produce better models. So if you want a good model you need large batch sizes. This means that you need machines several orders of magnitude more powerful than you use for inference. From what I heard Google uses clusters of 100s of their TPUs all located in a single rack for training. These clusters are organized in a customized computing architecture to maximize memory locality between cores (really critical for efficient back-propagation). Further, you can't use reduced precision weights for training like you can for inference, so there are no shortcuts.

    Finally, the initial training stage is followed by reinforcement learning stages - this is a key development in how AI models have improved in the past year. This may mean going through a curated set of traces (either synthetic or captured from users) and adjusting the weights based on experienced outcome.

    Overall there's so many orders of magnitude more work and more hardware requirements for training that I find it improbable that inference dominates. The number of "inference" steps in training is freaking ridiculous and includes such factors as the "number of words ever written".

    • It's been a while since I saw a detailed paper on a high end training run, but extrapolating from what I remember, it seems those training runs are in the 10s of trillions of tokens. This already accounts for potentially sampling tokens multiple times during the training run.

      That seems like a large number, until you realize that OpenAI claims to have almost a billion weekly users. And OpenRouter shows many models at over a trillion tokens per week.

      So in pure token terms, I'd say it is in fact extremely plausible that inference dominates, at least for the popular models.

    • Not saying you're wrong, but I'll note why inference might dominate despite everything you mentioned.

      A given model is trained once but applied N times. A large enough N will dominate training, no matter how complex and costly it was.

      But how long is a model useful for? How often will labs need to train new models? Time will tell.
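
      How large N has to be can be estimated with a common rule of thumb for dense transformers: roughly 6 FLOPs per parameter per training token (forward plus backward) and roughly 2 FLOPs per parameter per generated token. The sketch below uses that approximation; the 15-trillion-token training run is a hypothetical stand-in for "10s of trillions", and the weekly volume is the trillion-tokens-per-week figure mentioned upthread.

```python
# Break-even point where cumulative inference compute equals training compute.
# Rule-of-thumb constants for dense transformers (an approximation, not exact).
TRAIN_FLOPS_PER_PARAM_TOKEN = 6.0   # forward + backward pass
INFER_FLOPS_PER_PARAM_TOKEN = 2.0   # forward pass only


def breakeven_inference_tokens(training_tokens: float) -> float:
    """Inference tokens needed to match training compute.

    The parameter count appears on both sides and cancels out.
    """
    ratio = TRAIN_FLOPS_PER_PARAM_TOKEN / INFER_FLOPS_PER_PARAM_TOKEN
    return training_tokens * ratio


def weeks_to_breakeven(training_tokens: float, tokens_per_week: float) -> float:
    """Weeks of serving at `tokens_per_week` to reach the break-even point."""
    return breakeven_inference_tokens(training_tokens) / tokens_per_week


TRAINING_TOKENS = 15e12   # hypothetical: "10s of trillions" of tokens
WEEKLY_TOKENS = 1e12      # "over a trillion tokens per week" (from the thread)

weeks = weeks_to_breakeven(TRAINING_TOKENS, WEEKLY_TOKENS)
print(f"inference compute matches training after ~{weeks:.0f} weeks")
```

      This counts raw compute only. It ignores failed or experimental training runs, reinforcement learning stages, and differences in hardware utilization between training and serving, so it is a lower bound on how long a model must stay in service.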

  • > If we don't even know the ratio between amortized capital expenses and operational costs, outside investor analysis is impossible.

    And yet we surely need this data for the IPO? Or are they relying on rule changes on the indexes to force ETFs to buy shares?

  • Yes the huge discrete stepwise training spend is critical.

    Maybe investors will realise that "the only winning move is not to play".

    And so we are left with (as was) frontier models getting more and more out of date as whoever their post-bankruptcy custodians are try to eke out pennies on the dollar for inference on their decaying property. Perhaps along with local and/or highly specialized models still feeding on the after-glow of the huge amount of training that was (and is no longer) done.

    The next AI winter is going to be deep, savage, and long.

    • > frontier models getting more and more out of date

      Why are they getting out of date? Is it because we have new content from the internet that the older models did not have? Or are we simply trying to increase the size of the training data? In other words, is it less about being more up-to-date in terms of when the content was created and more about wanting to use bigger training-input-sets?

I work for a tiny little company ($150MM annual rev with 9% net) and we are already looking at dropping $100k on hardware to run local models because, for us, they're "good enough."

Our estimated spend for AIaaS would exceed that cost in less than a year.

In a few years, there will be hardware capable of running frontier models good enough for most things at accessible prices for even tiny companies.
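
The "exceeds that cost in less than a year" comparison is a simple payback calculation. Here is a minimal sketch: the $100k hardware price is from the comment, while the monthly API spend and the monthly running cost are hypothetical placeholders, not the commenter's numbers.

```python
# Payback period for buying inference hardware instead of paying per token.
import math


def months_to_breakeven(hardware_cost: float,
                        monthly_api_spend: float,
                        monthly_running_cost: float) -> float:
    """Months until avoided API spend has paid for the hardware.

    Returns infinity if running the hardware costs as much as the API.
    """
    monthly_saving = monthly_api_spend - monthly_running_cost
    if monthly_saving <= 0:
        return math.inf
    return hardware_cost / monthly_saving


HARDWARE_COST = 100_000.0        # from the comment
MONTHLY_API_SPEND = 10_000.0     # hypothetical AIaaS bill
MONTHLY_RUNNING_COST = 1_000.0   # hypothetical power, hosting, admin time

payback = months_to_breakeven(HARDWARE_COST, MONTHLY_API_SPEND, MONTHLY_RUNNING_COST)
print(f"hardware pays for itself in about {payback:.1f} months")
```

The result is very sensitive to the running-cost term, which is where staff time and hardware refresh cycles tend to hide.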

  • Yeah, that's the part that just seems to be wildly under-discussed to me.

    If open source models are ~3-6 months behind SOTA, and ~opus4.6 capabilities are good-enough for product market fit, do the frontier labs have half a decade to catch up on their prior burn?

    AI cost ballooning faster than companies can afford is becoming a very common topic in my circles right now. The era of "I'll pay infinitely more for marginal gains" is over from what I can tell.

    • > If open source models are ~3-6 months behind SOTA, and ~opus4.6 capabilities are good-enough for product market fit, do the frontier labs have half a decade to catch up on their prior burn?

      They know they do not and that’s why they’re all trying to IPO right now, so they can pass the bag to consumer investors

    • Open source models that you can run locally are much more than 3 to 6 months behind. 6 months was the November inflection for Claude. No open source model is as good as Claude Opus 4.6.

    • There's still a lot of room for the best models to get better at coding.

      Your argument rests on the "for marginal gains" part but it's really not clear that the gains are marginal in the foreseeable future.

    • Open source models, especially qwen, are pretty dang good. But it's not opus 4.6; the evals don't tell the full story. I question the assumption that open source models are 3-6 months out.

    • You have to think about why open models are behind. Exfiltration is a big part of it. So you could change the Nash equilibrium by increasing your security, or other multilateral approaches.

  • > ...we are already looking at dropping $100k on hardware to run local models...

    Just think how much further that $100K would have gone if the hardware market wasn't so screwed-up.

    Anecdote: I priced-out adding 1TB of RAM to a four node cluster a couple months ago. The cluster was purchased in fall of 2024 w/ 4 nodes, each with 256GB RAM. The nodes cost just over $14K apiece back in 2024 (entire box, not just the RAM).

    Dell wanted >$90K a couple months ago to add 256GB to each node.

    • > Dell wanted >$90K a couple months ago to add 256GB to each node.

      RAM is expensive, but not THAT expensive. I just bought 128GB for about $5k for our build cluster (it's not even for AI, sigh). Even if you need larger-sized DIMM sticks, it's still going to be in the vicinity of $15k tops.

  • I get the impression the hive mind hasn't come to terms with the point that a model is optimised for certain tasks. It's like having someone ask you "is that a good hammer?". Good for what? There are claw hammers, sledgehammers, ball-peen hammers, club hammers, mallets, .... Yes, in a pinch, they can all bang in nails, but you wouldn't choose a dead blow hammer for that if you had a choice.

    Gemini Flash is very good at searches. Just about any low end model can toss out a poem. All the higher end models (open source and otherwise) seem to be able to churn out code that passes tests. The smaller, "less capable" ones are much faster at it, which means that in the hands of a skilled practitioner they are the best choice for that task. But they rapidly fall apart where there isn't a hard source of truth (like a good test suite) to grind against. Because of that you have to use a bigger model for bug finding. In that task the open source models tend to fail on larger code bases, where something like Opus still shines. I gather Mythos is an absolute monster, and unparalleled, and unavailable. I'm sure one of the reasons for that is it's so expensive to run.

    Or to put it another way - you don't use a 100 tonne crane to pick up the shopping. And ... the smaller models will happily run on in-house hardware. You may not do it today because of the current DRAM price and integrated NPUs have just started shipping, but in 5 years time models will be running on your phone.

    • Yes exactly, we will have specialized models soon. These will be trained with plugin architecture with a core reasoning model asking plugin models to do stuff on its behalf. I don't need chinese or russian knowledge in my workflow.

  • Yes 100% this. A lot of people keep talking about how OpenAI and Anthropic will need to raise their prices. What is less discussed is how they CAN'T raise their prices because competition exists, and sure it's not SOTA, but it's literally an order of magnitude cheaper in many cases and the drive to figure out how to make it work well enough is going on right now (and will only intensify when the SOTA models raise their price).

    It's a given that the SOTA models need to raise their prices. It's also a given that they can't. The more they raise the more customers will move to their competition.

    So what happens next? Well I think it will suck horribly if you can't move off of SOTA sooner or later, because the Big Two are going to lose customers, and therefore have to raise prices on the locked in customers even more than these projections suggest.

    Beyond that if you're looking to start a business, figure out how to use cheap models in new scenarios. Build software which does that and license it. This is kind of contrary to the idea that you shouldn't over optimize for deficiencies in the models that will likely go away in the next generation - for instance a lot of problems were solved when context windows got way bigger. So it's a thin line to walk but I think it's there because a lot of orgs are using Claude today for pretty basic tasks.

    The dev who's addicted to SOTA models honestly is going to have to settle for less or get totally screwed. Most applications within business from what I see aside from complex research do not require SOTA. They summarize, they classify, they transform, and doing that accurately has been cheap for a while.

  • On prem AI makes sense for more than just the cost. More control, IP, model improvements you can keep, data privacy to name a few. People will realize that AI is not like compute the moment they get their own knowledge sold back at a premium.

    • > People will realize that AI is not like compute the moment they get their own knowledge sold back at a premium.

      But what if your competitors sell their knowledge to AI companies?

      Then you're still screwed.

    • What are the advantages to on-prem for a company that's already in the cloud and trusts it with their IP? That company can just rent GPU instances from the cloud if they want to train/fine-tune their own models and keep avoiding CapEx.

  • Agree. You have these tipping points when a model is good enough to do some task. Yes, a better model will further improve your capabilities but the unlock is at a certain intelligence level. We see this also with humans. People with very low intelligence can't learn to read. Once you cross a certain threshold of intelligence you can learn to read. More intelligence doesn't really help you in the task of reading. A person with an IQ of 160 is not substantially better in reading than someone with an IQ of 85. If your IQ is 50, you might not be able to learn to read at all.

  • I don't quite understand, what would 100K buy you?

    AFAIK you would get about ~5 concurrent users, with a max context window of ~128K tokens on the larger models.

    This wouldn't be good enough for coding -- are you guys thinking of using it for something else?

    • Gigabyte 4x AMD Instinct MI300A rack server (512GB GPU RAM total)

      Roughly equivalent to 4x H200's for less than half the price.

      Vaguely around 60k tokens per second...

    • By my calculations 100k could get you 18 5090s + compute to host them, or 18 96GB Mac minis. You can get a lot of context window and users out of that setup.

  • Do you think this will be a trend for larger companies as well?

    The decadal move to all-cloud-all-the-time killed off in-house hardware teams while the C-suite chased their OpEx dreams.

    It would be interesting if we come full circle on this.

    • I doubt it. Companies that have moved to the cloud are already trusting the cloud with their IP. You can rent time on a high end Nvidia system from various clouds. OpEx means there's no write down in three/five years as that system goes out of date so it would only make sense if the performance/$ is there, or the company is highly protective of their IP and doesn't trust the cloud, at which point they're not on the cloud anyway.

  • I configured a dual DGX Spark cluster, and it's certainly "good enough" for my agentic and coding needs.

  • My much larger company has got people already using various models through Bedrock because the Claude and OpenAI limits are too harsh and it's too expensive.

  • It might be possible that in a few years someone will be able to engineer a reasonably priced machine to run today's frontier models (hint, your price is an order of magnitude off). However, they won't be able to run the frontier models that will exist in a few years.

  • I’m curious: are you spending on beefy developer machines, or some kind of shared local inference server? Would be interested to know more if it’s the latter.

    • I am aware of at least a handful of companies doing the latter. I don’t work for them and cannot speak to their setup.

  • > In a few years, there will be hardware capable of running frontier models good enough for most things at accessible prices for even tiny companies.

    What makes you so confident about this prediction? Hardware costs haven't exactly been cratering recently.

    • > Hardware costs haven't exactly been cratering recently.

      No, but local models have been booming in performance/quality improvements. The RAM shortage won't last forever (more supply will come online if demand doesn't diminish), and then the math would be pretty easy.

  • same, but you need more than 100k of hw to run something like kimi k2.6 for a bigger team. on the other hand there is a ds4 flash that you can run on a macbook with 128gb ram. and that one is perfectly usable for a lot of tasks.

    https://github.com/antirez/ds4

  • What models? Last I tried different local models there was a pretty big difference from frontier.

  • Eh, one question. Where do you intend to buy the hardware if datacenters take over the market?

  • That’s exactly where the market is heading and it’s going to have to reckon with this fact

    My guess is there’s gonna be some legislation or something “you can’t share anything over this level of complexity” and I think that that’s what a lot of that mythos rattling was all about

  • > there will be hardware capable of running frontier models

    The current frontier? Sure. The frontier then? No - obviously that frontier is going to keep consuming available datacenter compute capacity, which will be better

  • You people are delusional. How many times a day am I going to read this fiction of "good enough in a few years for most things".

    There are physical limits to how much you can compress data and how much is needed for a capable model. If by hardware capable of running SOTA you mean a 7 figure investment for a company, then sure. But how come these companies didn't do the same thing for cloud? There's been this option for self hosting infrastructure for a decade but companies don't use it, they pay AWS.

  • > In a few years, there will be hardware capable of running frontier models good enough for most things at accessible prices for even tiny companies.

    I was going to say - the models are just going to keep growing at a pace exceeding the pace of hardware pricing/availability

    But then I realised that, far more likely, there will be a plateau reached (again) where nobody is seeing gain, and at that point hardware will catch up

> That's a _huge_ shift. Most people I know cite +20%-40% velocity with these tools, against the actual work their company cares about doing. +20% speed for +20% spend isn't going to motivate a trillion dollars a year in spending.

20-40% sounds about right for me, today. Maybe 40-60% on a good day. But a lot of the reason it's not higher comes from harness gaps and org processes that haven't caught up.

All of that will get fixed with time.

I was in college in the late 1990s/early 2000s and I distinctly remember an econometrics professor state the following:

"As cable TV and Pay Per View came out, there were studies done about how many movies people would watch if given unlimited access to films. The results were bandied about as proof that we should build out all this infrastructure to support this line of business. When the data was further analyzed by statisticians etc, it turned out that people claimed they were going to watch films 10-12 hours a day, every day of the week. Impossible."

I feel like we are in a similar boat here where some people are assuming:

- EVERYONE is going to be using max tokens

- tokens will NEVER get cheaper due to improvements in hardware, software, design, market forces etc etc

  • > I feel like we are in a similar boat here where some people are assuming:
    >
    > - EVERYONE is going to be using max tokens
    >
    > - tokens will NEVER get cheaper due to improvements in hardware, software, design, market forces etc etc

    I feel like the reverse assumption is being made, that the current model looks like IBM doubling down on Mainframes soon to become cheap enough to deploy everywhere, when the real action is that the costs coming down represents cheaper hardware or more efficient software, and that a big chunk of "cheaper" AI will be eaten by smaller products deployed by individuals. Whatever the Personal Computer of AI looks like is going to be more disruptive than just an API endpoint you can fling tokens at.

    We already see this with things like chrome auto installing an LLM.

    You can't tell me with complete certainty that there's a moat here for the people spending 1 trillion + on this infra.

    >When the data was further analyzed by statisticians etc, it turned out that people claimed they were going to watch films 10-12 hours a day, every day of the week. Impossible.

    I also think this applies to people suggesting that companies will sack workers for AI, when the cost of replacing everything someone does in a day is higher in terms of tokens (likely even at a reduced price) than just hiring a bloke.

  • > it turned out that people claimed they were going to watch films 10-12 hours a day, every day of the week. Impossible."

    I realized it long ago: one needs output to make meaning. Input can only be the cherry on a cake in one's life. That, actually, makes FIRE or Fat FIRE not so sustainable unless one has other hobbies.

  • > it turned out that people claimed they were going to watch films 10-12 hours a day, every day of the week. Impossible.

    And what happened? How many hours per day/week are people spending watching now?

    • What people: I'm sure some people are watching 10-12 hours per day - in places like nursing homes or hospitals. I know a reasonable number of people who watch a film nearly every day: 2-3 hours. Most people watch something every few days - often a tuesday night movie night for the family (or something like that). There are some who never watch anything. I don't know what the statistics are on this.

      My friends in day care tell me the kids hate "movie day" because movies are all they get at home and they are sick of them - they want to play all day. (but I'm not sure if this is representative of anything other than the types of people who put their kids in that particular daycare)

  • > they were going to watch films 10-12 hours a day, every day of the week. Impossible.

    A lot of these LLM demand scaling scenarios make broad "up and to the right" assumptions about things which in practice have finite limits. Only some percentage of knowledge work benefits from acceleration, optimization or other improvements, and even then the amount of economic gain is capped.

  • But isn't it wonderful that they did?

    • It's vaguely disturbing that people "watch" films 10-12 hours a day. Many of them are using it as a radio, for background noise, without really caring what the program is beyond vague genre, tuning in and out without particular regard to the plot… and yet we have all the cost of transmitting high-resolution video point-to-point.

      Surely we could just put better stuff on the radio, and accomplish most of the same goals for a far lower price?

  • > - EVERYONE is going to be using max tokens

    Anthropic already hunts down OpenClaw users for using too much on their plan.

    I'll give a different example: When LED lights started to be more popular, the power usage didn't drop by the amount of power saved

    >- tokens will NEVER get cheaper due to improvements in hardware, software, design, market forces etc etc

    Well, first, improvements in computing stalled or even rolled back purely because the price of everything compute shot up because of AI, and that will NOT be fixed for a while, ESPECIALLY if AI usage continues to increase

    Second, the price per token for a given model might go down over time, but better models have more expensive tokens, so we quickly get to a spot where:

    * the price increase per token might not be worth the marginal improvement the next, better model brings

    * more and more models are passing the "good enough for the task" threshold, so for fewer and fewer companies does it make any economic sense to pay for the "best" instead of paying deepseek or some other company to run "previous gen" models

The bottleneck has moved from producing a thing that works to knowing that the thing was the right thing to build. The more of the latter they can take on, the fewer knowledge workers are needed at all. So rather than 5% of every knowledge worker's salary going into tokens, 100% of the knowledge worker's total employment cost goes into tokens and you get a 20x productivity boost as a theoretical minimum across those tasks.

That's the game. There's a view you could take of this that this is just a growing of the pie: with those cost dynamics a lot more "small businesses" get a vast amount of leverage, so the overall economy grows without replacing the knowledge workers. I'm not sure I trust the MBA class to have that view.

  • >The bottleneck has moved from producing a thing that works to knowing that the thing was the right thing to build

    I would argue that that's been the case for quite some time before AI. As an example, what innovative amazing world-changing products have Google or Meta launched in the past decade with their very high numbers of very talented and highly-compensated engineers? The issue with most big tech companies is leadership, strategy, and product direction. I'm not saying that they don't make any profits, just that they probably aren't "building [the right thing]".

    AI for product development and management would be far more impactful than automating rote coding tasks / building React UIs that mirror API structures IMO.

    • > AI for product development and management would be far more impactful than automating rote coding tasks [...]

      Yeah, if this stuff actually worked that well already, OpenAI et al. would just run AI CEOs and engineers. Why get some other company to pay you at all when you can automate every other company out of existence and take all the money they make?

      The fact of the matter is that while the tech has some uses, it sure as hell isn't a full scale replacement and you almost always actually have to massage the input into LLMs to get anything decent back out in practice. Some CEOs and managers can learn to do this, of course, and some already are... but that quickly turns into a second full time job. A "programmer" is still needed. The job might change from mostly hand-writing C++/JS/Python to prompt engineering + some manual coding to fix all the stupid fuck-ups that the bots can't solve themselves, but you still need someone to actually prompt the bot.

      When that changes, it won't just be engineers losing work; there will be no reason to even have a human CEO any more.

    • I don't know, if you've ever tried to build something at companies of that scale you run into incredibly boring problems "what data table do I need for X" and "who is the right person to reach out to for Y" and "they aren't answering me I guess I'll have to escalate"

      I don't think there is any shortage of great ideas at these companies; they are just extremely bloated. And I don't think it's something like indecision or bad PMs, it's "we have a finite amount of time and resources so we need to be conservative but also not too conservative".

      If you have AI systems that can simply build out POCs in days, backtest on real data, show reliable results and numbers, you get a suite of product options you were never able to get before. If you have coding agents that can speed up implementation, you can build more stuff and choose the things that stick.

      It changes the cost/benefit calculus of the entire business. I think you are exactly right that PMs/leadership are by their nature orchestration machines. Other roles are as well, but I think PMs are at a particular advantage here: I would expect it to be quite a while before core product decisions and creativity can be delegated to an AI, but not long before virtually everything they're blocked on (legal approvals, POCs, wireframes, etc.) becomes less and less of a blocker.

    • Yes, that exists at the wider business level. No question. I think what needs to be asked is whether we're talking about a bottleneck within the business as a whole, or a bottleneck within the scope of the knowledge work in question. Within software delivery there's a very clear shift when it's suddenly trivial to drop a 100kLoC plausible-looking PR into code review within an afternoon. Producing working code with a whole bunch of tests which make a very clear assertion that it does, in fact, work has had (if you're going that way) all the human-scale thinking time taken out of it, down to a rounding error. It still needs to be checked by a human, which was previously assumed to be a quick task compared to producing the thing. At least, it does where I am, and I don't think that's a silly position today at all.

      If they can crack that latter review/spec-check/assurance step, checking that what was built was what was demanded of the problem such that we don't have humans in the loop at that step either, then the bottleneck moves again. Then I think it moves to requirements capture and to product development, but that might depend on the industry.

    • > As an example, what innovative amazing world-changing products have Google or Meta launched in the past decade

      Kubernetes was 11 years ago, and is huge enough to be included there. The Google Pixel was just under 10 years ago. So... not nothing haha

    • Google's internally developed and sometimes even launched plenty of innovative new products in the past decade. Stadia, Fuchsia, federated learning, and the whole transformer architecture that underlies this AI boom are good examples.

      The problem is they get killed by some other executive who is afraid of their department looking bad by comparison.

      I think this is fairly illustrative of the challenges in AI becoming as impactful as the Internet. The bottleneck is not making things. There are plenty of people who are really good at making things and can easily be 10x or 100x as productive as the average corporate worker. Y Combinator was founded on that premise: small teams of founders and early employees could be orders of magnitude more productive than the thousands of corporate employees at their competitors.

      The bottleneck is on bringing your product to market. If your innovative new product is built within a corporate environment, it'll get killed unless the executive you work under can get a promotion out of it, and you'll be denied all sorts of help with approvals, launch process, PR, marketing, branding, etc. If it's a startup, they'll try to shut you out with exclusive distribution deals, legal threats, lobbying efforts to change the legal environment, PR campaigns, FUD, etc.

      The Internet was revolutionary because it let millions of people bring products to market without asking permission. Instead of having to bid for retail shelf space among dozens of entrenched competitors that all had sweetheart deals with the retailer, you could just put up a website and sell it to anyone across the globe. Instead of following hundreds of regulations that governed existing commerce, you could just launch something and sort it out later. AI doesn't really have that property - if anything, it makes things more centralized, with more gatekeepers, and so seems more likely to destroy economic value than add to it.

    • If you really think this you simply have no theory of mind for this stuff. There are tons of immensely successful products in the ad space that both of those companies have launched. They don't need to innovate in the product or technology space (though doing so certainly makes a big difference in creating more placement for ad real estate), but to suggest there have been no real innovations (specifically engineering innovations) related to ad tech would be completely ridiculous. You don't need to change the world to get rich; just look at Wall Street, where major innovations have been made in the pricing models of fixed income securities.

      Second to this are countless other areas that have a major impact on the company's bottom line that are entirely engineering driven, especially at Google, given that they are a cloud provider, have meaningfully grown the Workspace business, and launched Waymo in this time.

    • >I would argue that that's been the case for quite some time before AI.

      I would agree but it's really minimized the building. More and more time is being spent on pre-coding work.

    • Google & Meta are illustrative of late-stage capitalism -- it's all about distribution, not innovation. Their job is (mostly) to just acquire the products that have passed the gauntlet, then scale up their monetization through their distribution-focused machine. The same dynamic plays out in virtually every industry (not just tech).

      You'll find that most internal "innovation" teams are just lip service. In most cases, the "mothership" will be incapable of reproducing true innovation -- from a statistical perspective, culture perspective (mega corps are anti-scrappy; internal politics), and motivation perspective (startups aren't 9-to-5). It's much easier to have big M&A budgets, a VC arm, and some hand-wavy internal innovation group.

      Every now and again, you'll get real innovations (Waymo, transistors, GUIs), but even those have a spotty track record of commercialization when created internally.

  • This is the same argument that has been historically made for outsourcing developers. Get 20 more devs for the cost of 1 dev in the US.

    I suspect that AI will fail to pan out to the same extent for the same reason why outsourcing hasn't fully panned out (even though every company tries it after getting big enough).

    The problems that will come up will be and always have been ongoing maintenance. AI is great at writing new code without a brain behind it, but once you get to the point where you need to refactor code, you start really needing someone with coding experience to guide the AI or veto its mistakes.

    I don't think that's really fixable even with a lot better AI. It's not something that ultimately comes out of the likes of github data.

    I'm not saying that AI isn't going to make things better, btw, I just don't think we'll see a 20x improvement. Probably more like 1.5 or 2x.

    • Outsourcing of knowledge workers didn't work out because at large enough scales, the geographic arbitrage disappeared. Companies almost always got what they paid for.

      The determinant of success was only whether the task needed American-tier labor or could make do with sub-American quality labor.

    • > I suspect that AI will fail to pan out to the same extent for the same reason why outsourcing hasn't fully panned out

      My mental model for that is that outsourcing fails where the work is being done organisationally far from the knowledge needed to do it. We know that's true of teams inside organisations, there's been a lot of research on how distance in the organisational tree negatively impacts productivity. Outsourcing is a pathological worst-case of that.

      The promise (promise! We're not there yet!) of AI is that I can have a cross-functional team on my laptop. Organisational distance is zero. Where previously the outsourced team has to wait for the time zones to roll round so I can answer their blocking question when I get to my email STRICTLY AFTER I have had my coffee, now it's a prompt in a chat window with a button I can click to make a choice in 5 seconds. Delay is gone, cost of delay is gone.

      > The problems that will come up will be and always have been ongoing maintenance. AI is great at writing new code without a brain behind it, but once you get to the point where you need to refactor code, you start really needing someone with coding experience to guide the AI or veto its mistakes.

      Oh, absolutely. That's a minefield. Today. It will be, right up until it isn't. There are ways to set up agents and projects right now that make a dramatic difference to how this part of the picture plays out, but those will sink into the harnesses as time goes on.

      But also the big problem with maintenance and outsourced teams tends to be the commercial structure around the contract. You get a Build team, who Build the Thing and then: no more features for you, anything you want to add past the original spec costs extra. They hand over to the Run And Maintain team, who get to fix all the bugs that the Build team left but without the knowledge gained from building the thing, but are scaled and located to be absolutely as cheap as the supplier can get away with so probably don't have the skill, inclination, motivation, or permission to take on any restructuring to make the bug fixing easier and they're on the wrong end of the globe so there's a 24-hour latency on any queries. It's a terrible way to set teams up, but it looks good on paper.

      Again, that's peculiar to outsourcing and completely goes away if I have the same team that built the thing own the thing long-term. That's true if it's humans or AI!

      > I don't think that's really fixable even with a lot better AI. It's not something that ultimately comes out of the likes of github data.

      No, it's a harness problem. You need to start from a maintainable point and keep standards in place. It'll take work to get the harnesses there and it's not ubiquitous. You might also need better models, but I've already personally seen big differences in outcomes between projects that took certain steps and others that didn't; it's nothing revolutionary, mostly stuff that works for humans also works for AIs but you need to know to ask for it.

      > I'm not saying that AI isn't going to make things better, btw, I just don't think we'll see a 20x improvement. Probably more like 1.5 or 2x.

      I think people radically underestimate the cost of delay. I don't know if 20x is realistic for the AI itself, but I think it's not impossible once the inefficiencies of having to go to other humans are factored in.

  • Who pays for that value, and from what, if all knowledge workers lose their jobs?

    It sounds like the economy would largely reduce to the small minority class of independently wealthy people.

    • The more time I spend using agent tools the less I worry about knowledge worker job loss.

      It takes a skilled knowledge worker to use these things.

    • See https://news.ycombinator.com/item?id=48300427 for an alternative take. I don't think either direction is inevitable, yet.

      To follow on from that comment, if the growth in breadth of capacity of AI leads to a decrease in the risk of running a smaller business, which I don't think is an unreasonable prediction, then it's not inevitable people do lose their jobs. Employers get smaller, higher-leverage, and more plentiful.

    • > Who pays for that value, and from what, if all knowledge workers lose their jobs?

      They do not care unless these companies can get a bailout.

      UBI only exists for companies that are too big to fail. Case in point, 2008 and SVB when there was too much money on the line.

      One of the AI companies attempted to guarantee themselves a way for the government to bail them out if they were close to defaulting on the debt from the data center build out.

  • Producing a thing has always been cheap since personal computers existed. From mail-order software companies' times to SaaS times, producing a sellable MVP was an initial cost that is relatively small compared to the later cost of expansion and maintenance. Marketing and selling was and still is the hardest part.

  • Why do you think of knowledge workers as a fungible commodity?

    What makes you think the people who used to build (or would have built) software will switch into the industry of "knowing that the thing was the right thing to build", as opposed to something cooler like surgery, city planning or experimental physics? The roles within a tech company are not the only jobs in the world.

    • > Why do you think of knowledge workers as a fungible commodity?

      I don't.

      > What makes you think the people who used to build (or would have built) software will switch into the industry of "knowing that the thing was the right thing to build", as opposed to something cooler like surgery, city planning or experimental physics?

      Because it's probably already part of the job. It's a change of emphasis, not a change of career. Your boss can already ask you to do it. If you're producing code, you're probably also reviewing code, checking it matches the acceptance criteria, testing it, sanity checking that it was the right code to have been written, today.

  • > The bottleneck has moved from producing a thing that works to knowing that the thing was the right thing to build

    “There’s more capital than good ideas to fund” has been a complaint from the likes of A16z & other VCs for a long time now. It’s why we ended up with stuff like NFTs getting funded.

I will also tell you, as someone who works at a company that's trying to remain profitable, that token spend has caught the eyes of finance and, much like cloud spend, they've already started applying pressure to control costs. This May my team is projected to use 30% fewer tokens than we did in April, and that is intentional. I suspect we'll drop more in June.

> They've got, ballpark, $5t to $10t to make back in the next 5 years

OpenAI's spending commitment is in the ~1T range for the next 5 years, and Anthropic is ~300B.

If they continue to show strong growth, they likely need to be at 100-300B in revenue/yr to support their yearly payments + financing, not 1T.
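
A rough back-of-the-envelope check on those figures, as a sketch only. It uses the ~1T and ~300B commitments and the 5-year window quoted above; spreading the payments evenly across the years and ignoring financing costs are simplifying assumptions of mine.

```python
# Back-of-the-envelope: annualize the quoted spending commitments.
# Totals are the approximate figures quoted above. The even 5-year
# spread is an assumption for illustration; financing costs are ignored.
COMMITMENTS_USD = {"OpenAI": 1e12, "Anthropic": 3e11}
YEARS = 5


def annual_payment(total_usd: float, years: int = YEARS) -> float:
    """Evenly spread a total commitment over the given number of years."""
    return total_usd / years


per_year = {name: annual_payment(total) for name, total in COMMITMENTS_USD.items()}
combined_per_year = sum(per_year.values())

for name, usd in per_year.items():
    print(f"{name}: ~${usd / 1e9:.0f}B/yr")
print(f"Combined: ~${combined_per_year / 1e9:.0f}B/yr")
```

Under those assumptions the combined figure comes to roughly $260B/yr, which sits inside the 100-300B range and well short of 1T.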

> They've got, ballpark, $5t to $10t

What are you basing this on? For reference, Anthropic raised ~$70 billion in total and OpenAI ~$190 billion. Why do they need to make 20-40x that?
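
For what it's worth, here is where the 20-40x comes from, as a quick sketch. It uses only the rough totals quoted in this thread (~$70B and ~$190B raised, $5T to $10T claimed) and assumes nothing else.

```python
# Compare the $5T-$10T claim with the total raised so far.
# All inputs are the rough figures quoted in this thread.
RAISED_USD = {"Anthropic": 70e9, "OpenAI": 190e9}
CLAIMED_RANGE_USD = (5e12, 10e12)

total_raised = sum(RAISED_USD.values())
multiples = tuple(claim / total_raised for claim in CLAIMED_RANGE_USD)

print(f"Total raised: ~${total_raised / 1e9:.0f}B")
print(f"Implied multiple: {multiples[0]:.1f}x to {multiples[1]:.1f}x")
```

The claimed range works out to roughly 19x to 38x the ~$260B raised.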

Yeah, claiming “product-market fit” on coding assistants for this multi-trillion dollar capital expenditure seems premature. Anthropic will post one and only one quarter of “operating profit” (aka losses after taxes and debt obligations) on the back of free-for-all spending by enterprise and engineer tokenmaxxing, neither of which will last. The investment was commensurate to a world-eating AGI, and if all that comes out of it is coding agents and slightly better enterprise software, I don’t think that makes up for the money spent.

  • The real question is: can you incentivize a non-tokenmaxxing Uber to spend the same amount on AI as they were when tokenmaxxing, just with fewer tokens and higher per-token costs? Even with plateauing improvement in frontier models? I think the answer may be yes.

    And part of my reasoning for this is: the only system capable of actually fixing bugs in vibe-created code is an LLM. If we humans couldn't write it without assistance, we certainly won't be able to debug it without assistance. So there's a real stickiness here.

    We're signing pacts with demons - we have to, if we want to outcompete the other warlocks - and those pacts are written in the very size of our codebases.

> That's a _huge_ shift. Most people I know cite +20%-40% velocity with these tools, against the actual work their company cares about doing.

We all have our own observations and mine don’t significantly diverge. But that’s bottom up. At this point shouldn’t we be seeing it top down?

If we are beyond potential and into significant productivity gains, why isn’t that showing up for the customers?

Why didn’t Delta Air Lines get significantly more operationally efficient in the last 3 months due to the introduction of better software?

This is a genuine question, I am seeing a disconnect.

  • Anecdotally, my take on this is that the biggest value lever is strategy and alignment, not implementation. The typical company is dozens of little vectors pointed in different directions, and they cancel each other out. Scaling up the magnitude of each is still net zero.

    I was recently consulting at an org where two separate engineering teams were all in on two different, incompatible deployment platforms and using AI to accelerate adoption of each.

    Management was mystified why their engineering leads kept telling them they couldn’t deploy a complete implementation of their solution.

  • > Why didn’t Delta Air Lines get significantly more operationally efficient in the last 3 months due to the introduction of better software?

    The coding agents got good in November. Most individual engineers didn't fully clock this until January/February. This means that companies didn't really figure it out until March/April.

    Assuming companies like Delta have adopted coding agents (which would be pretty fast) it still takes months from adopting a new tool to the code results of that tool rolling out to production.

    I expect (and would hope) Delta's software development culture is very conservative. Since nobody can confidently tell Delta "here are proven practices for using this tech to produce high quality, more secure code" yet it would be surprising if they were blasting full-steam ahead.

    I expect that even companies that got on board with coding agents in January will only just be starting to ship user-facing features that benefited from those new tools. Shipping software takes a long time, no matter how much faster the "typing the code in" bit gets!

    • >The coding agents got good in November.

      Maybe irrelevant to your point, but I'd argue they were really good already in May if one used the right workflow (planning etc.). They've become better, but they're not saving me significantly more time now than they did 12 months ago.

I thought Anthropic and OpenAI's combined CapEx has been <100B?

source: https://isaiprofitable.com/

  • That site needs Apple on the list. ;-)

    • Why? All their money is going to Apple Silicon and the five ecosystems. So far in Apple's entire history, the largest acquisition has only been $3 billion; OpenAI is currently getting nothing, and they gave Google a measly $1 billion refund per year for the use of Gemini.

      If John Ternus wants to spend some money, spend it on bringing memory in house. Apple has the money and the engineering talent to do so; have it fabbed onshore in partnership with TSMC.

      Do it, Apple, because you have to, not because you want to: the Chinese will probably be taking over the memory industry worldwide by taking advantage of the greed of the three memory companies and their AI overlords.

  • Maybe so far, but they've committed to well over a trillion in future capex.

    • And there's the indirect capex that their revenues will pay for, like in the case of Oracle.

I don't think the maths works like that. They have raised ~$200bn so far and need to make that back. Saying they need to make $5 to $10tn isn't really real. They might need that to meet some extravagant Altman projections but not to justify what they have actually spent.

1. Global IT spend is $6T per year

2. Where does this $5T number come from? If they make $4T in revenue over the next 5 years instead, what happens?

> Most people I know cite +20%-40% velocity

Seems roughly right, that does seem to be about the boost in the most well-suited cases where you essentially know exactly how to solve the problem, the problem won't change much, and it's truly a matter of just churning out the implementation.

In that case precisely prompting, doing the review & nudge loop, can be a pretty nice (nice, still not game changing) speed boost over literally typing out the code to match the design in your head.

The less optimistic view though is that most things you build aren't like that. Even if they seem like it first. These things get booked as a nice speed boost, but you'll only find out much later they weren't.

A confounding factor is that many people not in the detail of building software seem to think most or all things are like that, and did even before AI-assisted coding. Not much need to say more: see the entire history of the 'agile' movement for evidence of this.

And because most things aren't like that, I actually struggle to see fundamentally how more than 20-40% will ever be achieved (short of the ever-present deus ex machina of AGI argument), simply because the generation is already really good for these types of things. So since things like this aren't going to increase in overall proportion of things to be done, I don't see where the overall extra gains come from by models improving at this point.

> We're talking about a world where you need 5% of every knowledge workers salary to go into tokens.

They are assuming ~10% global GDP growth instead of ~3%. You probably don't need the same %s if the pie grows a ton.

I'm highly skeptical we get that growth, but if you aren't, it makes it easier to digest.
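
To put rough numbers on "the pie grows a ton", here is a minimal sketch that compounds the two growth rates mentioned above. The 5-year horizon is an assumption borrowed from the parent comment, and none of this is a forecast.

```python
# Compound two growth scenarios to see how much bigger the "pie" gets.
# Growth rates are the ones quoted above; the 5-year horizon is an
# assumption borrowed from the parent comment, not a forecast.
YEARS = 5


def growth_factor(annual_rate: float, years: int = YEARS) -> float:
    """Size of the economy after `years`, relative to today (1.0)."""
    return (1 + annual_rate) ** years


optimistic = growth_factor(0.10)  # ~10% per year
baseline = growth_factor(0.03)    # ~3% per year
relative_size = optimistic / baseline

print(f"10%/yr for {YEARS} years: {optimistic:.2f}x today's GDP")
print(f" 3%/yr for {YEARS} years: {baseline:.2f}x today's GDP")
print(f"Optimistic pie vs baseline pie: {relative_size:.2f}x")
```

Under these assumptions the optimistic pie ends up roughly 1.4x the size of the baseline one after five years, so a fixed dollar amount of token spend would be a correspondingly smaller share of it.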

  • I mean, this AI-productivity case backfires when we talk about GDP.

    The more AI causes productivity increases, the less and less number of workers will be needed. This will heat up the job market even more and bring salaries down.

    Net effect of this productivity increase: less consumption by the masses, even though you may be producing more good and much more efficiently.

    A third effect also comes into play that once all this starts to happen, common people, who are generally living paycheck to paycheck, will now start to hesitate towards making any long term investment, housing included. And that indirectly will end up impacting financial and banking sector, which will then impact existing savings, bonds yields and retirement funds, and the recession-like cycle starts.

    This productivity increase only makes sense if it is capped at a very small number, like 20% max. Beyond that, who will these companies even be selling to?

    Am I overthinking all this?

    • > The more AI causes productivity increases, the less and less number of workers will be needed.

      That only holds if companies have a fixed need for "productivity" which is met by their current employees, such that their employees becoming more productive means they need fewer of them.

      Every company I've ever worked for has wanted to achieve way more than they are able to get done with current resources.

      But generally yes, the biggest open question about all of this is how the impact will play out on the economy, job opportunities etc. I've not seen anyone come close to a confident prediction about how this will play out.

    • >The more AI causes productivity increases, the less and less number of workers will be needed. This will heat up the job market even more and bring salaries down.

      >Net effect of this productivity increase: less consumption by the masses, even though you may be producing more good and much more efficiently.

      Big tech companies can't even create login flows and account recovery flows that work for everyone yet. There are countless stories of folks losing access to business Instagram accounts that get hacked, Google support from a human to fix a problem that is outside of their help articles is non-existent, etc etc. There's still so much "low-hanging fruit" IMO that isn't particularly fun or exciting to fix, but ask your average non-tech friend or family member what they think of the Facebook + Instagram security settings pages / sites / desktop-only settings.

      Who is going to pay for all of these subscriptions that will power this GDP increase when average purchasing power of those outside of the top ~10% of earners is decreasing YoY? We're headed toward food and water shortages next to sprawling datacenters, not shared societal prosperity and a healthy middle class.

    • First of all, common people are not living paycheck to paycheck in the sense that they're at risk of not having money[0]. This is corporate content marketing that has entered the collective memory of people, not anything close to reality.

      Secondly, reducing the cost of making a thing doesn't always mean you get less of a thing. For me, certainly, what happened is that I write way more software than I originally did. When we built compilers, the amount of human engineering effort required to do things plunged, but the number of software engineering jobs didn't go down.

      This is as bad as models will ever be. That part is true. And it's entirely possible we go foom. But it's also possible we don't, and then it depends on where the asymptote lands.

      0: https://www.slowboring.com/p/this-economic-myth-needs-to-go-...

    • >Am I overthinking all this?

      Nope, if AI were to realise the hype, you have to take into account macroeconomics. Usually this isn't a problem for most businesses

      >The more AI causes productivity increases, the less and less number of workers will be needed. This will heat up the job market even more and bring salaries down.

      People also underestimate that the reason companies are so excited about AI isn't to increase productivity; it's to fire workers and crack down on workers' rights. They won't lay people off because AI means they don't need as many people to get the job done. They'll fire everyone while doing a much shittier job, because they hate having to abide by workers' rights and pay people.

    • > The more AI causes productivity increases, the less and less number of workers will be needed.

      Why does this have to be the case with AI but it didn't have to be (and wasn't) the case with the steam engine, electricity, the automobile, or the computer & internet?

      Certainly, AI could be different.

      It's curious to me why the vast majority of people on here think it must be different.

    • > The more AI causes productivity increases, the less and less number of workers will be needed

      This might not necessarily be true. Increased efficiency creates induced demand, to the point where more workers are needed, because the new capabilities unlock more value to extract and the economy rushes in to get it. The steam engine is a huge example of this.

      I don't know exactly what new value genAI will unlock, but I think it's more likely than not that it will.

  • And yet the job everyone loves to hate, the humble "burger flipper", continues to resist automation yet command minimum wage labor rates. This future of either being a CEO of a company consisting primarily of AI agents building some monthly subscription-based solution to some trivial digital chores OR manual labor that isn't [yet] fiscally viable to automate seems quite bleak. We'd also need a ton of robot technicians and manufacturing that the US has neither the educational and training institutions to support nor the will of the population to fill. Given the ongoing war on immigration, visas, and foreign-made hardware, if this continues, good luck.

Hey, I wrote this down one time. I estimated way higher yearly revenue required, to be adversarial. And you can keep the "cost per unit AI work" a parameter and play with the results.

But the point is that if people are willing to delegate part of their salary (e.g., buy consumer products), vs requiring employers to pay for the tokens, then it's quite possibly a net win. Something like "I pay a largeish fee every month to make my own job much easier", similarly to how we buy a car to make commuting easier.

https://jodavaho.io/posts/ai-jobpocolypse.html

You are making the assumption that the models are only used / paid for by 2.5% of the population (your knowledge workers value). There will be new value created by these models which people are happy to pay for which simply did not exist at all before. It is also naive to say that the hyperscalers are going to be expecting a return on this in 5 years, it will be entirely propped up by investments / IPOs as has been the case with any tech company for decades now to reach scale. The hyperscalers are currently spending ~650b combined annually, which they have the cash for and can sell in future compute instantly.

  • I'm sorry, what the feck does "value creation" mean here? I live in a place where people are so, insanely squeezed from every angle. Wages are stagnant, prices rocketing. Where is the money to pay for this value going to come from?

    No one I know feels richer than they did a decade back. I've not been able to meaningfully put up my prices for a decade. People are tired and stressed and scared, particularly scared of a technology everyone keeps telling them will make them redundant.

    There is no rising tide lifting all boats, just most of us drowning whilst a few whizz past in their yachts.

    I honestly hope these guys faceplant ASAP. Couldn't happen to a nicer bunch of people.

    • Feelings aren’t facts. A lot of data shows the doomerism is not reflected in the actual numbers and much of it has to do with rapid inflation and continued vibes.

      Consumption has risen, inflation adjusted wages have risen for blue collar and white collar alike. Most social mobility has been the middle class moving into the upper middle class, not moving to the lower class.

      The main thing holding people back is the housing crisis. This is orthogonal to the value creation of businesses.

      Value creation is growth. If it didn’t exist the S&P would still be $42.55.

    • Sounds like internet sentiment and not research data.

      It's kind of become socially taboo to not be suffering "in this economy", but on paper it's hard to see weakness in places where there isn't always weakness. As long as the 65-95% are doing well, there isn't going to be a collapse.

    • A literal example is that I can use AI to file my taxes instead of spending a weekend and hundreds of dollars to have an accountant do it for me. It costs me like $5. That $245 delta is the value of that output to me, as long as I am confident it is correct.

    • That's the thing; the "increase in productivity" isn't being felt by the general public, the end user. If your "increase in productivity" just means more money being shifted around at the corporate level then it is meaningless.

  • > There will be new value created by these models which people are happy to pay for which simply did not exist at all before.

    True, but I think the GP's point was that what consumers will pay won't be nearly as profitable as what enterprises will pay to increase the output of their developers and knowledge workers. ChatGPT is currently the overwhelming leader in consumer AI usage but only ~5% pay $20/mo.

    As a recently retired serial tech founder, I'm now one of those consumers. I use AI webchat daily for general search, Q&A and even to write little automation scripts for myself, yet I haven't paid anyone anything for AI yet. Even after being heavily restricted and performance nerfed to hell in recent months, free webchat AI is still fine for everything I do, and I'm not remotely price sensitive.

    Even as AI compute costs fall over time, I doubt serving ads against AI webchat to consumers will generate the kind of high-margin, sustainable growth VCs get excited about. It's so undifferentiated I bounce around between all four leading providers because there's virtually no moat locking casual consumers to any chatbot beyond a single question thread. I guess if it had a nearly infinite context window seamlessly integrated across all sessions, that might be somewhat sticky for some consumers but it could also get creepy for some others - and it would devour gobs of the scarcest resource in AI. Beyond Maslow's Hierarchy of Needs, the mobile phone is the largest revenue, long-term mass consumer product ever but I just got a new flagship phone from a top-tier provider for $30/mo over 3 yrs. IMHO, even an all-you-can-eat, infinite context window, next-gen Mythos couldn't reach and sustain mobile phone levels of global consumer adoption at ~$20/mo. Unlike professional developers and knowledge workers, consumers don't have any "job to be done" big enough for an LLM to command that much of their zero-sum discretionary spend.

    • 100%, a driving factor will likely be how good we can make models that are so small they use almost no compute. Until then it is a race for adoption and moat-building (or screwing people over?) once you have users

    • What are the non-tech people in your life using AI for? $20/month, next to Starbucks and avocado toast, is discretionary. Maybe the novelty will wear off and non-tech consumers will leave it in droves, but everyone declared they'd leave YouTube if it started playing ads, and YouTube doesn't seem to have noticed.

  • > There will be new value created by these models which people are happy to pay for which simply did not exist at all before

    What sort of new value, and why will people pay for it from someone else rather than prompting for it themselves?

  • But will they pay big actors running top-end models for that? You don't need the latest OpenAI or Anthropic model to go through your mail, get a summary of some products from the web, or do your to-do list.

    The AI might very well be used by a noticeable % of the population daily, but that doesn't mean they will be paying a trillion dollars to the leading US AI companies.

> They've got, ballpark, $5t to $10t to make back in the next 5 years, or the hardware buildouts will start getting written down.

Depreciation and write-offs are about accounting models. Hardware will still be running after five years and still be making money. They may not be as efficient as the new hardware, but they will still be making real money even though they are valued at $0 in the books.

  • GPUs are driven really hard, and they use up a ton of energy and water. They cost a ton to run.

Also, not all developers work on software products. The vast majority of developers work supporting software solutions as part of a much bigger business model, such as infrastructure, industry, healthcare and services. Many of these are complex organizations. So, unless you get to turn every employee into a 10x employee, the 10X coder alone won’t necessarily make a 10X productivity contribution. What’s likely going to happen is the 10X coder will start to slow down or add more (unnecessary) complexity to avoid having to sit and wait on overhead while other areas of the business, which are not easily automated away to AI, catch up. As a developer I can finish my project in June instead of December, but what if the customer is still not ready for integration until December? What do I do?

Those are rookie numbers. We are going to blow past $1t per year in spending in no time. As a developer for 29 years, I couldn't go back to coding by hand. For better or worse, AI will be woven into the fabric of life in no time.

If people figure out how to run agents on-prem (already becoming feasible for both agentic tasks and coding on consumer hardware like Mac Studio 128GB+ or DGX Spark with some models) these companies will be in deep trouble.

Privacy is also a huge issue.

At some point, if we reach stability on the models, we'll start getting silicon optimized for individual models. They are optimizing for time to market, not efficiency right now. I don't know how much it will move the needle on the cost math, but at this scale any improvement has a crazy multiplier.

  • But that means going back to "80% profit margin" Jensen and further digging your capex hole. The benefits would have to pay not only for current capex but also past capex.

    But by then, I will be able to go one line down in my dropdown menu to switch to a newer LLM provider who doesn't have to amortize those past capex.

    • Yes, the whole thing feels dot comish. I’m betting there will be a few (maybe only 1-2, like what happened to search) winners, and everyone else is in for a bad time. It’s the same dynamic too: the winners are going to win so big that everyone wants to get their money in for a chance at a piece of the pie.

Anthropic raised less than 100B up to now and as of March has 30B ARR. Why does it have to make back 2.5T to 5T?

I agree in principle with the math. But I believe that in reality if revenues don't show up quickly, then lenders will just restructure the debt and defer the payback period. Similar to SF commercial real-estate; many buildings should've come due during the depressed covid market, but lenders (banks) were willing to delay payment until the market picked up again.

The scale of these investments puts the lenders at substantial risk, so the lenders will do anything to make it work. If the current lenders will be damaged by extended payback periods, they can simply sell the debt to someone else who won't be.

One factor to consider: the base will not remain the same over the next 5 years.

Every generation of developer tooling that increases absolute code throughput creates a new class of developers (and users).

This has always been the case, from the first compilers through eras of frameworks to today, and the skill level needed to be a developer has kept dropping. In the mid/late 80s only Master's / Doctorate level Comp Sci professionals could write any applications. That dropped to undergrads and plain Information Technology engineers, with comp sci theory becoming mostly optional, then further to anyone college-educated with some training, and it has been trending below that with no/low-code tools like Retool pre-2022, before agent codegen services such as v0/Replit and so on.

The next generation of developers will not produce applications and architecture as previous generations did, just as most of us here don't produce the level of quality that pg did when building this platform[1], but as long as the user can find value it doesn't matter, as countless enterprise applications of middling quality already prove today.

All this to say: the thesis for these businesses is that the 200M/30M numbers will not remain the same. Will they change by enough, at a fast enough pace, to justify the capex? I don't think so either. However, the web 1.0 then 2.0, SaaS and mobile revolutions were pretty quick to bring in new classes of users and developers, so it is not completely unrealistic.

[1] While HN is a heavy outlier with its custom Lisp language implementation, there are any number of examples from previous eras that are more moderate in their choices but written with solid architecture, at skill levels that would be hard to find among today's generation of founders.

I don't think the unit economics are too terrible. Expensive, but not impossible.

200m knowledge workers in US and EU. Total salary around $15T/year.

$1T/year in token spending is about $5k/year per person. A big number, but not totally mad. That's the low end for office space per person for example. Probably close to the existing SaaS spend per person for a lot of roles.
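The per-person figure above is easy to check. Here is a minimal sketch in Python that uses only the numbers quoted in this comment (200m workers, ~$15T total salary, $1T/year of token spend); the function and variable names are made up for illustration. It gives $5,000 per worker per year, which is about 6.7% of the stated salary bill.

```python
# Back-of-the-envelope check of the per-worker token spend quoted above.
# All inputs are the commenter's ballpark figures, not measured data.
KNOWLEDGE_WORKERS = 200e6   # "200m knowledge workers in US and EU"
TOTAL_SALARY_USD = 15e12    # "Total salary around $15T/year"
TOKEN_SPEND_USD = 1e12      # "$1T/year in token spending"


def spend_per_worker(total_spend: float, workers: float) -> float:
    """Annual token spend per knowledge worker, in dollars."""
    return total_spend / workers


def share_of_salary(total_spend: float, total_salary: float) -> float:
    """Token spend as a fraction of the total salary bill."""
    return total_spend / total_salary


per_worker = spend_per_worker(TOKEN_SPEND_USD, KNOWLEDGE_WORKERS)
share = share_of_salary(TOKEN_SPEND_USD, TOTAL_SALARY_USD)

print(f"Token spend per worker: ${per_worker:,.0f}/year")
print(f"Share of total salary:  {share:.1%}")
```

Swapping in a different worker count or salary total shows how sensitive the "percent of salary" framing is to those two guesses.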

We are still early in the deployment cycle for these tools so I would expect them to get better and also cheaper too.

So you've got that market. Let's call it the demand BY knowledge workers to do the work. You've also got:

2. The companies themselves buying tokens for operations to make the work more efficient. e.g. Salesforce agent or Microsoft Office agent or random SaaS inventory agent. (and if you say those will go away (which I don't believe), it's even more bullish. The tokens just go to someone vibe coding XYZ, which is EVEN MORE than if you were to buy SaaS because it's SaaS product x Companies that built it instead of just one)

3. The companies SELLING tokens. This also includes new markets like schools and small businesses (e.g. the local gas station buying an inventory tool)

4. The consumers "buying" (I put it in quotes because it can be subsidised by the company) through ChatGPT, Strava, Instagram/Netflix recommendations, etc.

Local models still take compute, and while it may be cheaper, it is the same argument as on prem vs cloud. No one operates on prem unless you HAVE to for regulatory reasons. Margins will come down and you just spin up a GCP/OpenAI/Anthropic agent.

It may be "cheaper" but rationally it's better to pay someone to manage it. That's why Hetzner only had $367M in revenue (a lot but tiny compared to managed services)

> 5% of every knowledge workers salary to go into tokens. 20% if you're a developer.

When you break it down like that it seems reasonable. I'm spending about $5k/mo on tokens, seems more and more normal.

> We're talking about a world where you need 5% of every knowledge workers salary to go into tokens. 20% if you're a developer.

Just realized something: if one worries about losing jobs to AI, tokens' high unit cost is good news. At the very least, high cost would delay the displacement, if any, right?

In the meantime, someone shared the below on X. I guess the moral of the story is that "good enough" does not just displace software engineers, but also models.

   > I Went From $3,000/Month on Claude to $5/Week on DeepSeek

   > And honestly? 80% of my work is identical.

   > For the past two months, I was burning $3-5K monthly on Claude Code. Every idea from design to development to testing - full end-to-end automation, even simulating users to test my products and provide feedback.

   > Extremely token-intensive. But Claude's caching sucked, making it insanely expensive.

   > Then I discovered DeepSeek V4.

The most often cited figure for knowledge workers seems to be 1B, an order of magnitude difference to your assumption.

Also, according to https://isaiprofitable.com/ total industry spend is also an order of magnitude less than what your assumption is.

So in your model it's 0.2% of knowledge worker salaries instead of 5%, IF all the AI players win the investing gamble and do in fact make back their money.

Depreciation starts on day 1 and, IMHO, they most likely don't have 5 years. They dodged the DeepSeek bullet but who knows what is out there that will make all of this investment essentially worthless?

Plus, at some point fewer tokens will be sold, because local models are being optimised and can work with protected information. For enterprises that want an AI with a knowledge base of internal documents, this becomes more interesting by the day.

We're going to reach a point where these companies stop asking for money and start mandating it. They've got a vice grip around the nuts of many governments and loads of companies have gone all in on investing in these slop heaps.

At some point, companies are going to start removing basic features. Governments and essential services are going to make people go through chatbots to get basic service. They're going to require AI to validate stuff that's already automated and working fine. Google search? That'll be all AI (and I guess they're already rolling it out). Dentist appointment? Going to need to do it through some AI app that requires an account and tokens "for a better patient experience". Verifying your ID when buying alcohol? Going to need AI to scan it and take 90 seconds to determine whether it's real. And it'll say you're a 7-year-old farm worker in rural Botswana, so you can't get alcohol. And they're going to milk money at every level of this.

YEPPP... and I'm kind of shocked at how many people can't do simple math.

Let's put it in context. Google's annual revenue seems to be north of $400B. So if OpenAI suddenly had Google's revenue, it would still be insufficient to recover their investment.

And it's a ticking time bomb because $1T in servers, CPUs, GPUs and memory is going to be worth $200B in 5 years. You can say they can keep using what they've got. Sure. But they're also not going to stop spending on new hardware. And the competitor that comes along in 5 years and spends $1T doing the exact same thing is going to have a huge advantage.
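To put a rate on that decay, here is a rough sketch in Python that takes the two figures from this comment ($1T today, $200B in 5 years) at face value and works out what they imply per year. It shows both a straight-line write-down and a constant-percentage decline; the choice of those two schedules is an assumption made for illustration, not something the comment specifies.

```python
# What "$1T of hardware is worth $200B in 5 years" implies per year.
# Both dollar figures are the commenter's ballpark, not audited numbers.
INITIAL_VALUE_USD = 1e12     # "$1T in servers, CPUs, GPUs and memory"
RESIDUAL_VALUE_USD = 200e9   # "worth $200B in 5 years"
YEARS = 5

# Straight-line schedule: the same dollar amount is lost every year.
straight_line_per_year = (INITIAL_VALUE_USD - RESIDUAL_VALUE_USD) / YEARS

# Constant-rate schedule: the same percentage is lost every year.
annual_decline = 1 - (RESIDUAL_VALUE_USD / INITIAL_VALUE_USD) ** (1 / YEARS)

print(f"Straight-line write-down: ${straight_line_per_year / 1e9:,.0f}B per year")
print(f"Equivalent constant decline: {annual_decline:.1%} per year")

# Remaining value at the end of each year under the constant-rate schedule.
for year in range(YEARS + 1):
    value = INITIAL_VALUE_USD * (1 - annual_decline) ** year
    print(f"  year {year}: ${value / 1e9:,.0f}B")
```

Either way the implied loss is on the order of $160B of value per year on average, before any new hardware is bought.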

OpenAI at this point reminds me very much of the Russ Hanneman pre-money hype cycle.

  • It's actually worse than that. It's not just financial depreciation or that the existing hardware becomes obsolete due to being less powerful than new hardware but also that hardware being run all the time at high load actually has a limited lifetime of a few years so it will physically break...

    • I agree but it's even worse than that.

      Data centers come down to performance-per-Watt. Electricity accounts for 20-30% of a data center's operating cost [1]. I don't know the exact breakdown but the GPU part of that is probably the majority given how power hungry GPUs are. The B200 is upwards of 1200 Watts [2]. The B200 is rated at ~4.5PFLOPS of dense FP8. So you're getting 3.75TFLOPS/W. We don't know what the next generation will look like. The H200 (Hopper architecture card that preceded the B200) had ~4PFLOPS apparently but also lower power consumption. Obviously this changes depending on whether you're looking at dense or sparse and FP8 vs INT8 vs INT4 vs FP4, etc so we're just using FP8 as a yardstick.

      Imagine a fictional B200 successor, the T200, that has 8PFLOPS of dense FP8 at 1000 Watts (8TFLOPS/W). In a DC built on that, where the T200 will likely cost about what the B200 does now, you'll get roughly double the PPW, so a DC of the same size and electricity load does the work of about 2 of your old DCs for the same operating cost. That's a big deal when you've laid out a trillion dollars.

      [1]: https://iaeimagazine.org/electrical-fundamentals/how-much-el...

      [2]: https://www.trgdatacenters.com/resource/h200-power-consumpti...
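The performance-per-watt comparison in the comment above can be sketched in a few lines of Python. The B200 figures are the commenter's rough numbers and the "T200" is the commenter's fictional successor, so treat every input as an assumption. Note the unit: 4.5 PFLOPS over 1200 W works out to 3.75 TFLOPS per watt.

```python
# Performance-per-watt comparison using the rough figures from the comment.
# The "T200" is a fictional part invented in the comment, not a real product.

def tflops_per_watt(pflops: float, watts: float) -> float:
    """Convert a PFLOPS rating and a power draw into TFLOPS per watt."""
    return pflops * 1000 / watts


b200_ppw = tflops_per_watt(pflops=4.5, watts=1200)   # commenter's B200 numbers
t200_ppw = tflops_per_watt(pflops=8.0, watts=1000)   # fictional successor

# How much more work the same electrical load buys on the newer part.
ppw_ratio = t200_ppw / b200_ppw

print(f"B200: {b200_ppw:.2f} TFLOPS/W")
print(f"T200: {t200_ppw:.2f} TFLOPS/W (hypothetical)")
print(f"Same power budget delivers {ppw_ratio:.2f}x the FP8 throughput")
```

Under these assumed numbers the ratio is a little over 2x, which is where the "one new DC does the work of two old ones" framing comes from.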

  • Prices are not going to stay where they are.

    You have either never seen a tech cycle, or need to be reminded of that. The pressure to buy more expensive plans is already starting to form.

  • This should be the top comment. Also, I think it's not that many people, including our Simon here, are not good at math. It's more like some of them seem to be incentivised to not be, cough, cough, "good at math". How else will the hype sell?

    • At a certain point, I genuinely feel like the best way this hype is being sold is by making people genuinely believe in it.

      and in that sense, if Anthropic and OpenAI are able to create the projection that they can be profitable despite finances seeming bubbly at best, I think that what happens is that these companies spew out so much content that people like Simon get into it too.

      There is a deeper problem of people falling into AI psychosis too, in general. I am not sure if Simon has fallen into it or not.

      I think that the greatest point which can be made here is to not offload your thinking to others and to think about the situation yourself. Sounds familiar (looks like we are all off-loading our thinking itself to machines)

      Side-note: As humans, we have a tendency to quickly judge or make quick decisions which stems from our times foraging and scavenging in jungles.

      Another side-note: at a certain point, I am unsure how much to think about AI at all. Certainly the discussions about it that were happening 2 years ago aren't much help in the context the tools are used in now (a person who got into the weeds of AI 2 years ago is not in any way better off than a person just getting into it, say, 2-3 months ago).

      With the industry moving so fast (though that doesn't mean you can't catch up; I feel the word "fast" has made people think they are falling behind, which is wrong imo), I'm not sure there is any reason for FOMO if you aren't using AI already. I find that notion naive.

      People may be forming strong opinions (AI psychosis) and building skills around the tools available at the moment, the same as was done 2 years ago. These are still black boxes, so we don't really know how the tech will progress or which of these "AI skills" will survive. Heck, we aren't even sure whether these tools will survive, or whether they will be made orders of magnitude more expensive simply to break even, since they are currently offered to us at a fraction of the price.

      I don't know if I should form (strong) opinions yet, and there's also the question of whether it's worth so much thinking effort in the first place. I'm probably just going to do my own thing (the way I want to), which includes learning C at the moment, because learning is fun.

I could see such productivity gains being possible, if only because the current tooling around LLMs is terrible. The fact that we have 30 blog pieces per day making the front page of Hacker News about someone’s convoluted system to guide LLM output to something reasonable is absurd. There needs to be standardization in tooling, and it needs to be open source. Then, and only then IMO, will we see huge productivity gains.

But, at that point I think the big players’ moats will have dried up. Local models will probably be sufficient for 99% of daily office worker tasks.

So I disagree with TFA’s premise. I think this fear is probably shared amongst the LLM giants, and they’re still hoping that neural network transformers are somehow the path to AGI (probably not, imo).

Not to mention the competition: Chinese open-weight models and open-source harnesses. Qwen3.6-(27B and 35B) have proven to be worthy and capable of running locally. I am confident more SMEs will look into this as a solution given the ballooning costs of API usage. You get a decent setup with an RTX 6000 Pro.

> 5% of every knowledge workers salary to go into tokens

In general, I don't think you can reason from the existence of potentially stranded investments back to revenue projections.

And when you frame this as percentage of salaries, that's a sneaky implication that this is only about reducing salaries and headcount, and not about adding capability, or doing things you couldn't do before, or making fewer mistakes, or capturing more revenue, or expanding margins, or competing more effectively.

That said, 5% of knowledge worker comp actually seems very low to me, given the capabilities, and considering the percentage of "knowledge work" that is absolute bullshit.

Two weeks ago I received an email from my HOA saying I'd been billed for a service I never asked for. So I replied to the email saying they'd made a mistake. There are now more than 30 messages in the thread, involving at least 8 "knowledge workers" at the property management company all passing the buck, and the problem is no closer to resolution.

An agent could wipe out all 8 of those bullshit jobs and solve my simple problem in five minutes instead of two weeks. Think of how many hundreds of thousands of people are doing this nonsense just in the property management industry alone.

5% is nothing.

Is it possible that you are narrowly sizing the opportunity? While PMF does not always mean that early pioneers will be the leaders, I think the market itself goes beyond knowledge workers and developers. Agents, robots, drones etc will all use LLM or some world model.

I am rather more concerned about competition from CHINA. With how Huawei (2000 -> 2020) crushed every other telecom company and went from nobody to the most revered leader in 20 years, and with the depth of leadership in manufacturing and work culture, if China surpasses USA in AI, all US companies lose.

This is why 'agents' are the solution for these companies. Token spending goes through the roof. As long as a human is in the loop needing to read or review at human speed, that's a ceiling on how many tokens per user they can generate.

"5% of every knowledge workers salary to go into tokens. 20% if you're a developer"

Not unreasonable. I'm a hardware developer, and my employer spends ~10% of my salary on software tools. Add hardware tools and their maintenance and it's more like 30%.

What value do the big model makers provide other than having a head start on gathering up humanity’s IP to train their proprietary models?

What’s their moat? Is it hoping for regulatory capture where scraping is made illegal the day after they finally finish scraping all human language?

It’s like OpenAI dammed the Colorado, and Anthropic dammed the Hudson, and now they’re both trying to sell us bottled water subscriptions at $100 a month. I don’t know how well the dam part of the analogy holds up, but the water part feels strong. Compiling models based on humanity’s written output feels like something no corporation should own.

This assumes that we won't need new hardware in ~2 years. I find that unlikely. So they have to make back what they got up until now PLUS the running upgrade/development costs. So what will it be in 5 years? $20t? $30t? It's all getting a bit outlandish.

What I'm often hearing though is the equivalent of "gg ez" when I bring that up. I don't understand how this will at any point blitz scale to profitability. As far as I know they don't have positive cash flow, no one has a moat and I don't think they will push out engineers.

I hear this and I keep wondering what I’m missing. My productivity has shot through the roof over the last year as a result of having these tools. I’ve been able to unlock projects that I’ve wanted to do for years.

There's a lot more things that are going to be built that weren't built before as well.

To get that revenue and adoption they have to vastly increase their infrastructure spending. If they are currently losing money on even the $200/month plans, how is it sustainable?

> 200m knowledge workers in the world, 30m developers

Your scope is too narrow. The companies target more than white-collar jobs. And $1t is around 0.5% of the world economy.

> 20% if you're a developer. That's a _huge_ shift. Most people I know cite +20%-40% velocity with these tools, against the actual work their company cares about doing. +20% speed for +20% spend isn't going to motivate a trillion dollars a year in spending.

Of course it will. The value of an employee is a multiple of what they get paid.

If you pay an employee $500k and they make $2M for your company (like Meta), then of course a 20% increase for the salary is justified if the velocity is increased 20% as well.

  • The difference between what the employer makes per employee and what they spend in compensation doesn't matter. If the increase in productivity isn't greater than the increase in cost, there isn't a reason to pay for AI over hiring more developers.

    Imagine an employer with 10 employees paying $500k per employee and making $2M per employee in revenue (to use your numbers). They could hire two more employees and spend an extra $1M (+20%), but make an extra $4M in revenue (+20%). Alternatively, they could buy all ten employees a $100k AI subscription, for a total of $1M extra spending (+20%) but an extra $4M in revenue (+20%). You'll notice both scenarios are identical, so an employer optimizing for profit would have no reason to prefer one over the other.
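The two scenarios in the comment above can be laid side by side in a short Python sketch. It uses only the commenter's numbers (10 employees, $500k cost and $2M revenue each, a $100k per-seat AI subscription, a 20% uplift) and the commenter's simplifying assumptions that new hires are as productive as existing staff and add no overhead. The `Scenario` class and its field names are hypothetical.

```python
from dataclasses import dataclass

# Figures taken from the comment; integer dollars keep the arithmetic exact.
HEADCOUNT = 10
COST_PER_EMPLOYEE = 500_000
REVENUE_PER_EMPLOYEE = 2_000_000
AI_COST_PER_SEAT = 100_000   # "$100k AI subscription"
AI_UPLIFT_PCT = 20           # "+20%" productivity, assumed to flow to revenue
EXTRA_HIRES = 2


@dataclass
class Scenario:
    name: str
    cost: int
    revenue: int

    @property
    def profit(self) -> int:
        return self.revenue - self.cost


baseline = Scenario(
    "baseline",
    cost=HEADCOUNT * COST_PER_EMPLOYEE,
    revenue=HEADCOUNT * REVENUE_PER_EMPLOYEE,
)
hire_more = Scenario(
    "hire two more",
    cost=(HEADCOUNT + EXTRA_HIRES) * COST_PER_EMPLOYEE,
    revenue=(HEADCOUNT + EXTRA_HIRES) * REVENUE_PER_EMPLOYEE,
)
buy_ai = Scenario(
    "AI subscriptions",
    cost=HEADCOUNT * (COST_PER_EMPLOYEE + AI_COST_PER_SEAT),
    revenue=HEADCOUNT * REVENUE_PER_EMPLOYEE * (100 + AI_UPLIFT_PCT) // 100,
)

for s in (baseline, hire_more, buy_ai):
    print(f"{s.name:>16}: cost ${s.cost:,}  revenue ${s.revenue:,}  profit ${s.profit:,}")
```

With these inputs both options cost an extra $1M and bring in an extra $4M, so the profit is the same. Any preference between them has to come from factors the model leaves out, such as hiring overhead or a different uplift.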

    • There’s a lot of relationship and culture management overhead involved when adding 2 more people to a 10 person company. I think any business leader would take the productivity speed-up from buying a tool over hiring more people and integrating personalities/habits/viewpoints into an existing established culture any day of the week.

Also hardware will be obsolete or dead in 5 years, and warranties are 3 years from Nvidia. Ask crypto miners how this kind of hardware economics works. Numbers have to keep going up all around. It's a fundamentally broken business model unless prices increase 10x.

> 200m knowledge workers in the world, 30m developers. We're talking about a world where you need 5% of every knowledge workers salary to go into tokens. 20% if you're a developer.

This is where the napkin math is breaking down in a big way. There is absolutely no reason to assume this will only impact "knowledge workers". Farmers use computers. Farmers will use AI.

  • AI for what? None of the AI a farmer could or would use would be any more meaningful than light chatbot usage or already existing computer vision/GPS.

Also, with announcements of replacing developers with AI and consequent job losses, who is going to use the tokens? AI using its own tokens to produce code?

My hope is that hardware improvements (better node densities every 2-3 years, better designs, etc) will pick up the majority of the savings for these companies in the future, assuming LLM performance starts to taper off with diminishing returns.

> ... against the actual work their company cares about doing. [...] stuff that matters

This is a key point. Some engineers are having fun doing e.g. greenfield stuff with AI that they never would have had time for otherwise. Whether the company cares about that is another question.

It's related to Goodhart’s Law. If AI token usage is a target, then you're going to get a lot of token usage, but it's not likely to correlate well to improved business outcomes.

That’s on the order of 1% to 2% of global GDP per year just to pay for their hardware commitments.

> They've got, ballpark, $5t to $10t to make back in the next 5 years, or the hardware buildouts will start getting written down.

I find it disappointing that a completely wrong statement like this ends up the top comment on HN.

It is wrong in the math, the logic about public markets, and the understanding of accounting.

> $5t to $10t to make back in the next 5 years

I don't know where this number comes from, but it has gone unchallenged.

OpenAI and Anthropic combined have raised around $100B. This is an investment, so it isn't something they have to "pay back" from earnings - instead investors expect to make that back from the share price being higher than what they paid for it.

> or the hardware buildouts will start getting written down.

The hardware buildouts get written down anyway!! That is a good thing for investors because as the value gets written down they can book a tax loss. And it turns out that the generally agreed depreciation schedule for GPUs (it used to be 3 years, now 5 years at places like CoreWeave) is still too conservative, since GPU rental prices for 5-year-old chips are higher now than when they were new (!!)

All of this makes the rest of the math in the comment incorrect by at least an order of magnitude and under some scenarios possibly 2 orders of magnitude!

That's not a small error!

It's worth noting that if each developer is 20% more productive with AI (let's take that as a premise and not dispute it), then it makes sense to go even further and reduce human headcount by more than the raw arithmetic suggests, since the reduced communication overhead of a smaller team is in and of itself a force multiplier.

tldr; 10 developers with 20% more 'productivity' can be replaced by about 8.3 ideal developers (10 / 1.2), and more like 7 or 8 developers due to the benefits of simply requiring less organizational communication.

I still think the ideal team size is unchanged however and that's 7-10 people. Note that teams aren't necessarily the same as direct reports. A CEO for instance has a certain number of reports and a leadership 'team' but they're not a team in the traditional sense since they are more about making good decisions and collaborating on specific things but mostly about leading their own orgs that have vastly different skillsets from each other.

There is also the EV (expected value) of developing AGI. Even if you personally believe the probability is low within the lifetime of either of these companies, the value would still be extraordinarily high, enough to forgive a $5T or so miscalculation here or there.

  • I don't think AGI was ever a serious endeavour, just something the labs talked up to grab attention.

    I am willing to bet a Twix we'll look back on that stuff in 2 years with a lot of embarrassment

    • The high-risk side of that bet would need to win more like a lifetime supply of Twix. But in a post-scarcity nirvana, everyone already has that. So sure, you're on at even money. See you in two years.

  • Only if running AGI makes economic sense. We actually have no idea if that’s the case. We don’t even have a definition for AGI

> 200m knowledge workers in the world, 30m developers

1 in 6 knowledge workers is a developer! Surely that’s too high, though it explains the job market.

Somehow Uber and WeWork survived the same kind of grand projections that they never met.

  • Uber, sure... but how did WeWork survive? They are a smoldering husk of a failed company looted by its founder.

    • The company’s gone but the assets just got sold to other commercial real estate firms.

      Uber was basically only ever software to help people use their own cars, so very little of their valuation was physical stuff to upkeep; it was just deals and obligations they had.

      Not sure how it shakes out for Anthropic and OpenAI. There’s a lot of physical capacity that needs to be built out and can depreciate. But there’s also a lot of network effects and dependencies being built in with enterprise users.

      I don’t know how swappable the tooling is either. I think over the long term the UI, model training and documentation, and infrastructure are going to end up being run by different parties and I’m not sure which leg of that chain ends up in a position to skim most of the profit off. My guess is that Apple and Google end up raking in all the money since they control the OS and app stores while the rest of the stack gets driven down to being generic commodities. At least where mass market consumer adoption is concerned.

  • The difference is that they had room to charge their customers more and pay their workers less. The AI industry doesn't have both sides to play at this point. Training and inference are getting more expensive, and if you take on the high prices now you're just floating yourself further downstream from profitability long term (which does not look viable for any of them currently).

  • Funny you should mention Uber. What was it their COO said recently about the AI costs?

    • I quoted exactly what they said in my piece, under the heading "The AI-failure stories around this are pretty thin": https://simonwillison.net/2026/May/27/product-market-fit/#th...

      > But then you sometimes go and talk to your senior engineering leaders and you’re saying, OK, how many projects that were on the cutting room floor got moved above the line because of the productivity gains because 25% of our code commits were via Claude Code last quarter?

      > That link is not there yet, right? I think maybe implicitly there’s more that is getting shipped. But it’s very hard to draw a line between one of those stats and, OK, now we’re actually producing like 25% more useful consumer features, right? And that line is hard to draw.

      That's pretty weak sauce. I don't think that justifies the headlines that came out of it, personally.

  • somehow the invisible hand of the market is also blind af

    • Makes sense if you think about it: if all photons pass through you (invisible) then you can't capture them to get info (blind).

I just don’t understand how people are getting negative value out of AI, or even only a 20% productivity boost. I can only conclude that people don’t know how to use agents.

  • I mean, it doesn’t really matter whether it’s caused by people failing to use the agents well or not. You cannot assume everybody will use the technology in the best way possible.

  • Are you mostly creating new things or integrating with complex, undocumented, untestable systems?

    • Mostly brownfield systems in Java, Elixir and TS. I use OpenSpec in explore mode and point the agent to all the different repositories (when not working in a monorepo) to identify changes. Once done, I switch to propose mode and spend at least 15 minutes there iterating over the plan until I'm satisfied with the TDD approach (agents need tests to verify their work). Then apply and review. This also auto-generates docs etc.

lol, I’m spending max $50/month right now on a couple of light subscriptions and my velocity is insane (full stack mobile app development). I’m leaning into it hard while these cheap plans still exist, building out a big platform that I can easily generate new apps from. Hoping that by the time the rug pulls I can just go back to hand cobbling these apps together from the modules I’ve pumped out, and never even consider giving these companies a massive portion of my monthly income.

There are many paths towards ROI and ruin. But towards ROI:

+ LLM-powered robotics, autonomous, IoT, smart manufacturing

+ LLM-powered biotech, healthcare, genetic engineering, medicine

+ Recursive model improvement

+ Multiply the # of devs (software truly eats world)

+ Exponential increases in model performance / cost decrease (algorithms, power, infra, chips, architectures, etc.)

Author seems strangely unwilling to distinguish usage from profitable product market fit. And from his own numbers:

Anthropic Max: $100/month

OpenAI Pro: $100/month

Total paid: $200/month

API equivalent usage: $2,180.16 in 30 days

So he paid only 9.17% of the API-priced value, a 90.83% discount, or about $10.90 of API-priced usage for every $1 paid...

That proves heavy usage but not sustainable unit economics.
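
The discount figures above can be reproduced with a few lines of arithmetic. This is only a check of the numbers quoted in the comment; the variable names are made up for illustration.

```python
# Figures quoted in the comment (USD, over roughly 30 days).
subscriptions = {"Anthropic Max": 100.00, "OpenAI Pro": 100.00}
api_equivalent = 2180.16  # API-priced value of the same usage

paid = sum(subscriptions.values())
share_paid = paid / api_equivalent        # fraction of API-priced value actually paid
discount = 1.0 - share_paid               # effective discount versus API pricing
api_value_per_dollar = api_equivalent / paid

if __name__ == "__main__":
    print(f"Total paid:             ${paid:.2f}")
    print(f"Share of API value:     {share_paid:.2%}")
    print(f"Effective discount:     {discount:.2%}")
    print(f"API value per $1 paid:  ${api_value_per_dollar:.2f}")
```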

Anthropic's reported numbers point the same way:

Q2 revenue: $10.9B

Adjusted operating profit: $559M

Margin: 5.1%

SpaceX compute: $1.25B/month = $3.75B/quarter

So one compute supplier alone equals 34.4% of quarterly revenue and 6.7x quarterly adjusted operating profit.
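
The ratios in this comparison follow directly from the quoted figures; a minimal check is sketched below. The inputs are the commenter's numbers taken at face value, not independently verified, and the variable names are illustrative.

```python
# Figures as quoted in the comment (USD).
q2_revenue = 10.9e9
adjusted_operating_profit = 559e6
compute_per_month = 1.25e9

compute_per_quarter = compute_per_month * 3
operating_margin = adjusted_operating_profit / q2_revenue
compute_share_of_revenue = compute_per_quarter / q2_revenue
compute_vs_profit = compute_per_quarter / adjusted_operating_profit

if __name__ == "__main__":
    print(f"Compute per quarter:       ${compute_per_quarter / 1e9:.2f}B")
    print(f"Adjusted operating margin: {operating_margin:.1%}")
    print(f"Compute / revenue:         {compute_share_of_revenue:.1%}")
    print(f"Compute / operating profit: {compute_vs_profit:.1f}x")
```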

It's difficult for the blogger to understand something when his incentives depend on not understanding it...

  • My point with the $2,180.16 thing is that the price for consumers like myself is heavily discounted... but the price for enterprise companies is not discounted.

    My usage is therefore a useful indicator of quite how much those enterprise companies may be spending on tokens, given the new pricing scheme.

    If enterprise companies were still getting the same discounts that I get myself I would not have written this article.

    (I had to dig into your margin figure - looks like you calculated 5.1% as 559000000 / 10900000000 * 100, but that $559M "adjusted operating profit" figure includes training costs. Usually when we talk about margin on inference we're not including those, since those costs are fixed; margin calculations make more sense against the variable costs of serving a token.)

Now try to take LLMs back from developers and see what happens.

  • If, by some miracle, all LLMs ceased working right this second, any developer who would no longer be productive should not have been a developer in the first place.

    • You don’t need a miracle: if the Anthropic API is down due to technical issues, you don’t have software development anymore. It’s insane how much we are delegating to 3rd parties. It’s not like having Cloudflare down, where your users cannot access your services. The AI tools used to investigate prod issues stop working, developers stop working. The AI support system that allowed the company to get rid of their support team stops working. The sales team cannot work anymore. That's in addition to all the issues this causes for customer-facing products based on AI.

      It’s like the industry is willingly introducing a common external risk to everything

  • Limiting token quotas would be fine. Encourage developers to use efficient models, plan the work first, and to not burn thousands of GPU hours on waste.

    It's much like when developers would waste tons of money on AWS spinning up massive test VMs and leaving them running without care. Until the finance people cracked down on it.

We all know it is an impossible goal to meet. Surely AI will be even more useful in the future, but as long as China exists and continues to undercut the price, the goal will never be met.

> We're talking about a world where you need 5% of every knowledge workers salary to go into tokens. 20% if you're a developer.

With that much money, companies can easily buy their own hardware and host free public models; no need for those expensive subscriptions.

> unless we figure out how to make developers 2x, 5x, 10x as productive on stuff that matters, this isn't going to play out well.

Simple - you make them work 2x, 5x, or 10x more hours.

> +20% speed for +20% spend isn't going to motivate a trillion dollars a year in spending.

Except that if your company goes 20% faster than the other companies, you win market share. But then everyone will use the same tools and companies will be back at even speed, but the tool will stay.

Now...if the market is saturated, it's useless to try to do things faster. Cheaper yes, but not faster.

  • Pretty much all major tech companies today are horribly bloated and mostly metastasizing instead of innovating. I'm not sure how 20% increased productivity will help in any way with that. If anything, it might accelerate enshittification and turn potential customers off even more.

One thing I genuinely don't understand is these companies are constantly taking in incredibly large amounts of investments, so presumably they're giving up large chunks of equity or these are loans that need to be paid back or they're committing to spending obligations they're very unlikely to be able to meet.

So besides the insane hardware buildouts you're correctly mentioning, I don't understand how anyone that invests in these companies is supposed to make their money back in any sort of reasonable timeframe?

The cynical part of me is looking at what happened to the NASDAQ rules recently where essentially index funds are going to be forced to buy SpaceX shares much earlier than they previously would have (ie, before the price has a chance to reach its real valuation). Which, um, I'm guessing these stocks are going to drop pretty hard when people start looking at the financials of these companies.

My suspicion is that the point of these IPOs is essentially to dump the bill on the unwilling public by forcing various institutions to buy it (ie, your 401k or pension is buying this shit), and maybe their investors can squeeze some money out of this before the stocks reach an equilibrium that's probably like 1/10th of what they're "valued" at.

Let's skip to the part where they put the taxpayer on the hook for a bailout as an industry since they integrated everywhere with big promises

> make developers 2x, 5x, 10x as productive on stuff that matters

What does this even mean? Is this about speed of development? Is this about headcount? LoC? How are coding agents contributing to productivity in places like GitHub, Shopify or Meta? I mean companies that already have an established product. I really wanna understand this because I'm not seeing that GitHub's product suddenly became so much better than it was 2 years ago, so where's all that productivity going?

  • People have been arguing about how to measure developer productivity since time immemorial. But the bottom line is that if products and features are hitting the market faster than they used to, developers are doing more with less. It's what we're seeing in my workplace.

  • The productivity is going into perverse incentives[1], e.g. we have improved (by which I mean "increased") token use. More PRs every day. More lines of code. All things we knew were shit-brained metrics a decade ago (obviously except token use).

    We've also increased how much our coworkers need to read, or deal with. You can get an AI to make any point you want, so you can ignore the 5 humans raising alarms due to the 1 clanker you made say what you want to hear.

    All numbers going up.

    There are obviously people producing additional true value with it, probably, but that's almost certainly scarce.

    [1]: https://en.wikipedia.org/wiki/Perverse_incentive

  • Productivity is measured in the number of AI-generated Twitter posts developers can make about their AI-generated startups

...does anyone have a guess as to the total amount of money spent on software developer salaries each year? What percentage of that would the AI companies need to capture to be profitable?

(I'm not trying to imply that LLMs can replace software engineers, it's just an interesting comparison. If nothing else, I suspect that if the cost of development goes down, demand for custom software will go up.)

Bigger than that, they have to contend with open weight local inference. Open weight models right now haven't caught up to the frontier models of right now, but they're as good as the frontier models of not too long ago. If open weight models reach a certain point, then frontier model providers are going to struggle to make anything selling tokens, because eventually people will realize they don't need Mythos for everything.

I assume the bet is that as you swap humans for machines, this pays for itself. Swap entire devs and teams and frankly, managers, and you make up a lot of 5%’s fast.

If it works. And I’m not sure who is going to buy the stuff the machines produce, but shrug. Presumably some bots click ads for NFT’s that other bots generate.

I understand some startup deciding to take a punt on "this will all work out financially if our new product demonstrably boosts productivity of large sectors of the economy by a breathtaking factor that's incredibly rarely ever happened before in history: 2x." Sometimes a plucky group of people take a risk, and it pays off. If it doesn't work, the company fails.

What I do not understand is large sectors of the economy all simultaneously taking this punt, with the necessary productivity boost being, as you say, far more like 2x, 5x, 10x.

IMO if your developers aren't at least 2x as productive, then something is being done wrong on the employees' part and/or the organization's. CLI tools are ridiculously powerful provided you were an actual developer before using AI.

Maybe it's just me being (trigger warning from me providing an honest self assessment) very intelligent + a generalist, but I went from only full stack webdev and .NET to being able to implement an end-to-end LLM training pipeline (data curation, tokenizer, pretrain, SFT, DPO - using ~$100 in cloud compute to train a class-competitive 1B STEM model)... and a full economic financial modeling and quant analysis application that pulls up-to-date economic, news, and stock data from the entire world and uses Dagster to orchestrate technical indicators, fundamentals, and signals... and I did these things for learning and for fun. I built my own Sublime Text and Obsidian replacement. I built my own reddit/twitter/hackernews/substack/news aggregator. I built countless other useful tools and utilities for me personally, and for work I build more that empowers multiple departments.

I've built 2 browser games, one already released to great reviews and 100k+ hours played. I've built a tool on top of Claude Code that does ~60% of my job. I've run data analysis on company financials for forecasting; the models have been refined and are producing very accurate predictions. I've built competitive analysis tools and trackers.

All of this in 3 years. The projects are all clean, documented, with great code practices and modularity. A purist would surely consider some of the code slop. But it all works completely and fills real needs.

This is a huge shift. Anyone not realizing it yet is just simply behind the curve. I would not have accomplished 1/10 of this without AI coding. I went from copying code into and out of browser chats for 2 years before getting on the CLI train, and it is absolutely ridiculous the ROI you get from subscriptions to Claude or Codex.

They need to make 5t-10t back, but not necessarily through selling tokens. As we can see, the frontier labs are making vertically integrated products. Their revenue is no longer strictly tied to inference.

That's assuming that once they start squeezing, people won't just go to DeepSeek or other cheaper competition.

> That's a _huge_ shift. Most people I know cite +20%-40% velocity with these tools, against the actual work their company cares about doing. +20% speed for +20% spend isn't going to motivate a trillion dollars a year in spending.

And most research shows people far over-estimating their own gains. Once companies start counting the actual (and not just reported) gains, the AI budgets will be more limited as people realize it's a useful and versatile addition but not a replacement for most types of work.

> We're not there yet. This is still the upswing of the hype cycle, and unless we figure out how to make developers 2x, 5x, 10x as productive on stuff that matters, this isn't going to play out well.

Upswing of the hype cycle while growth of the tech itself is flattening, both because of the tech's innate issues (which might or might not be solved, but some papers claim they are unsolvable with the current approach) and just the fact that the spike in growth came at such a high economic cost that it put the brakes on itself.

  • There's a lot of workslop pumping the numbers. People can generate a 300 page PDF in a tiny fraction of the time it would have taken, but now the report is full of mistakes and fluff, and the stuff that would have been learned and caught in the process of making the report is now not happening.

    • And the recipient pulls that into an LLM and generates a summary.

      It's lossy compression for thoughts at this point

Here is a serious question.. Can we sell into the hype cycle and on the way down with this: https://safebots.ai/costs.html

  • I asked Claude to generate a frontend and it made the same template. Same sans-serif and serif fonts together. Same colors. Same typography. Same layout and animations even. It’s wild how similar it is. No, not similar: it’s the same damn thing.

    • I’ve seen the same dashboard for a dozen custom web applications now, including a couple I had it make for me.

      It really does have a particular lane for each chore, and it’s reproducible.

    • It produces the "most average" web design unless you really prompt your way out, doesn't it? If you don't care enough to prompt, Claude does not care to be individual.

  • I don’t think these numbers are accurate? It seems to ignore the fact that the models have cache for ongoing sessions, which means you (normally) aren’t actually sending all those tokens on every request… you only need to if you go too long between requests.
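
A rough sketch of why caching changes the totals so much is below. It assumes a chat-style session where each request carries the whole conversation so far, and where the previously seen prefix is billed at a discounted "cached" rate. The function name, the 10% cached rate, and the session shape are all hypothetical illustration values, not any provider's actual pricing; output tokens, cache-write surcharges, and cache expiry are ignored.

```python
def billed_input_tokens(turns: int, tokens_per_turn: int, cached_rate: float = 1.0) -> float:
    """Input tokens billed over a session, in full-price-token equivalents.

    Each request includes the entire conversation so far. The prefix that was
    already sent in earlier requests is billed at `cached_rate` times the
    normal price (1.0 means no caching); new tokens are billed at full price.
    """
    total = 0.0
    prefix = 0  # tokens already sent in earlier requests
    for _ in range(turns):
        total += prefix * cached_rate + tokens_per_turn
        prefix += tokens_per_turn
    return total


if __name__ == "__main__":
    turns, per_turn = 50, 2_000  # hypothetical session
    uncached = billed_input_tokens(turns, per_turn, cached_rate=1.0)
    cached = billed_input_tokens(turns, per_turn, cached_rate=0.1)  # assumed discount
    print(f"No caching:   {uncached:,.0f} token-equivalents")
    print(f"With caching: {cached:,.0f} token-equivalents")
    print(f"Ratio:        {uncached / cached:.1f}x")
```

Without caching, the billed total grows quadratically with the number of turns; with a cached prefix it grows far more slowly, so a cost estimate that ignores caching can overstate input spend by a large factor for long sessions.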

It's going to be a typical saturation curve. A lot of upfront tokens spent on things that have stockpiled over the years, and then the derivative of token spend trends to zero as the users run out of immediate things to try. Sure, there will be ongoing maintenance and experiments, but it won't be anywhere near the initial inrush.

Given what costs are and the availability of parts, that 5 year write down is not in practice going to be the case. Maybe tax-wise, but especially for big fancy expensive multi million dollar 100-500kW racks, these things are going to stick around for a while, I think.

Consider cloud spending vs on-prem before the great cloud migrations. People are spending a lot more for cloud services now.

I hear conflicting things about finances. Some have a different opinion: that it won't be written down so long as more funding comes in and revenue keeps increasing. It isn't like how you take out a mortgage or business loan; it isn't even a loan, it's an investment funded by loans. So long as the investment is still promising, what are they going to do? Destroy its value by calling in trillion dollar loans?

> $5t to $10t to make back in the next 5 years

Wait, what? They spent two orders of magnitude less on hardware.

  • From The Verge: https://archive.is/kU4Zg

    > Gartner forecasts that large AI companies would need to earn cumulatively close to $7 trillion in AI-driven revenue through 2029, which is close to $2 trillion per year by the end of the period. In order to achieve “historic returns,” the providers would need to earn nearly $8.2 trillion in the same period.

    • The numbers are made up political correctness anyway.

      Everyone's agency is 100% captured by belief in Wall Street. Too few <50 have any meaningful labor skills to blink.

      We'll continue to have consent manufactured via media platforms and in 3 years no one will bat an eye at these companies being worth $12 trillion as Altman and Musk climb two ladders holding a "mission accomplished" banner.

One quick question. Did taxpayer money fund these data centers? If so, how does that money translate to their profit and a return for the people whose work paid for the resources?

Or did we just get scammed?

Source on 200 million knowledge workers worldwide? My understanding is that it's just above 1 billion. I don't think a billion subscriptions at $1000/yr is out of the question, but it might take a decade to get rolling.

  • You're suggesting that 1 in 8 people worldwide, including everyone from infants to the elderly, are knowledge workers. Are you sure that's what you mean?

    I'm not even sure that 1 in 8 people I know would qualify as a knowledge worker, let alone a knowledge worker that might profoundly benefit from on-the-horizon AI. And I'm in a highly skewed population.

  • A billion subs at 1k a year????

    I see a lot of out of touch takes here but this might take the cake

  • A billion? Really? At 200M you’re already including a lot of people that stretch the definition of knowledge worker.

    • > At 200M you’re already including a lot of people that stretch the definition of knowledge worker.

      How do you know this? I'm certainly open to recalibrating my numbers, which is why I asked for the source.

    • A lot of those ‘edge cases’ in the definition of “knowledge worker” are probably the stuff that’s most likely to have significant parts of the work augmented or replaced by AI agents. Like, call-centers are almost certainly going to get turned over in a big way. It’s not like the median tier-1 support operator just reading off a script is much better than an LLM anyway.

> +20% speed for +20% spend isn't going to motivate a trillion dollars a year in spending.

I'm increasingly realizing this math is wrong, because LLM use is really sticky.

If Anthropic 100x'd prices tomorrow for their best model, so some companies offered 50% salary to keep 100% of your AI usage:

a) There are programmers who would take this deal. They've gotten to the point of doing what feels like even less than 50% of the work, developers were already pretty well paid, so they'll take it.

b) There are companies that'd offer this deal. Even if the only people who are taking this deal are not the best engineers, and the AI output is not the greatest, I think the last 6 or so years have seen a lot of companies realize capitalism is not as competitive as it seems.

They're not worried about putting out a worse product because... frankly, what else are you going to do? Say CF lays a bunch of people off and support gets awful: well, you're probably not building a new Cloudflare in the next few years.

In the meantime the AI will get incrementally better, their market share will grow, and you won't be able to compete without taking the same faustian bargain.

-

Maybe I was just naive but it's making me realize how much we take for granted in the world. Both the quality and relative value of things don't have to go up over time. Quality can go down while prices go up, and nothing will really stop it. Competition should stop it, but competition is really slow and can be interfered with. And as prices go up competition gets really hard.

> We're not there yet.

And that's not considering that capitalism is going to do what it does best: if they really found a way to be profitable, competitors are going to fight them on pricing. The margins of Anthropic, OpenAI, Google, etcetera are their competitors' opportunities.

It's not as if there weren't Chinese models that are nearly SOTA. Don't know where the French (Mistral) are, but they may try to get in the game if there's a way to be profitable (not that France, or the EU for that matter, is relevant in anything tech or has any tech company besides ASML and SAP in the Top 100, but who knows).

You're severely underestimating the idea that people are just not going to use developers for certain things in the future.

For example, I don’t anticipate somebody making a living off of making websites ever again.

Somebody with absolutely no technical experience who needs a website for their business can now make one with almost no money whatsoever.

That’s good enough for their business, and the code can be totally shit and it does not matter, because it’s meeting their business objectives. I am seeing this in the wild: I’m paying money to companies that have these types of websites, and it doesn’t matter. I don’t need the website to work perfectly on all my devices; all I need to be able to do is pay them through the website, which is what they need me to do, and our transaction is done.

Don’t forget that ultimately the people who pay technologists right now are primarily advertisers.

Work on hard problems is going to continue to be some tiny fraction of the software engineering discipline.

Just expect a total bloodbath, because the goal isn’t developer productivity; the goal is “I don’t need to pay somebody $200,000 a year to build a website authoring tool like WordPress.”

  • Why would a small business use a coding agent to build a custom website when they could use something like Squarespace or Shopify with prebuilt templates that mean they have to know even less than if they were to use some kind of chat UI?

    • Because it’s still easier and cheaper, apparently.

      This is the most recent example I found last week for a local barber:

      https://manus.im/

      And my other assumption is that it immediately integrates with IG/Facebook which is where they do a lot of their marketing

      I see no reason that trend is going to slow, especially if you can go to meta to manage your entire business marketing.

      Regular people running businesses just want fast, cheap, and good enough.

This is never going to materialise. It’s dead in under 2 years.

The market is shrinking and saturated already and it’s not because of AI gains but geopolitical instability and supply chain issues, some of which are caused by AI spending and stupid ass PE firms refocusing on AI supply chains.

Only our pensions and futures burning.