Comment by geeky4qwerty

17 hours ago

I'm afraid the music may be slowly fading at this party, and the lights will soon be turned on. We may very well look back on the last couple years as the golden era of subsidized GenAI compute.

For those not in the Google Gemini/Antigravity sphere: over the last month or so, that community has been met with nothing short of contempt from Google when attempting to address an apparent bait-and-switch on quota expectations for its Pro and Ultra customers (myself included). [1]

While I continue to pay for my Google Pro subscription, probably out of some Stockholm Syndrome-level loyalty and false hope that it is just a bug and not Google being Google and self-immolating a good product, I have since moved to Kiro for my IDE and Codex for my CLI, and I am as happy as a clam with this new setup.

[1] https://github.com/google-gemini/gemini-cli/issues/24937

For what it's worth, it was pretty obvious from the get-go that this wasn't a realistic long-term deal. I've spent the past 1-2 years building all the libraries I wished existed, so I'd have something neat to work with whenever the free compute era ends. I feel that's the approach that makes sense: take the free tokens and build everything you would want to exist if you lost access to the service. If it goes away, you're back to enjoying writing code by hand, but with all the building blocks you dreamt of. If it never goes away, nothing is wasted; you still have cool libs.

Maybe I missed the party, but it feels like it's just starting.

I have only been running local models and we are finally at the point with gemma4 and Qwen3.5 where they can start doing coding work.

And the quota can't change.

  • The only viable future-proof solution to this hellscape is what you mention, local models and/or corporate models for work.

So, Antigravity will definitely eat up your Pro quota quickly. You can run out of it in an hour (at least on the $20/mo plan), and then you'll be waiting five days for it to refresh.

However, I've found that the Flash quota is much more generous. I have been building a trio drive FOC system for the STM32G474, basically prompting my way through the process, and I have yet to completely run out of Flash quota in a given five-hour window. It is definitely completing the work a lot faster than I could myself, mainly due to its patience in trying different things to get to the bottom of problems. It's not perfect, but it's pretty good. You do often have to pop back in and clean up debris left from debugging or attempts that went nowhere, or prompt the AI to do so, but that's a lot easier than figuring things out in the first place, as long as you keep up with it.

I say this as someone who was really skeptical of AI coding until fairly recently. A friend gave me a tutorial last weekend, basically pointing out that you need to instruct the AI to test everything. Getting hardware-in-the-loop unit tests up and running was a big turning point for productivity on this project. I also wired a bunch of the peripherals on my dev board back to each other so that the unit tests could pretend to be connected to real external devices.

I think it helps a lot that I've been programming for the last twenty years, so I can jump in when it looks like the AI is spinning its wheels. But anyway, that's my experience. I'm just using Flash and plan mode for everything, not running out of the $20/mo quota, and probably getting things done 3x as fast as I could writing everything myself.

> I'm afraid the music may be slowly fading at this party, and the lights will soon be turned on. We may very well look back on the last couple years as the golden era of subsidized GenAI compute.

Indeed. Anthropic is leading the pack, switching to juicy corporate users who are happy to pay thousands per month per dev, and leaving the fans behind. And now OpenAI is following suit: they significantly lowered the limits on the $20 Plus plan and answered concerns with vague, confusing tweets about promotions.

All this is driven by rapidly rising demand (Codex usage growing +50% monthly) colliding with serious bottlenecks in building data centers and sourcing parts (permits, energy, memory, flash storage, etc.).

Users on Reddit and Discord are trying to switch to open models or Chinese alternatives, but there's no real replacement.

  • I don't know about users on Reddit and Discord, but the open models are essentially at SotA with a 3-4 month delay. That puts a hard backstop on what OpenAI and Anthropic can do before I personally can cut them off entirely without losing too much.

    Granted, the experience can be worse, especially if you're using it very hands-off rather than like a junior assistant who's extremely fast but doesn't know what he's doing at the architecture and strategy level. But even for that, I'm relatively confident the Chinese labs will be competitive pretty soon, and they won't be too expensive. We know this because we can see their current models, and we know what it takes to run them.

    Currently, my Strix Halo machine, which cost me under £3k, can do a lot of LLM work that is perfectly useful. In some ways it's better than "cloud" models: I have models that essentially don't say "no", and I have a relatively predictable setup. If you want to get fancy, you can right now rent compute to run extremely capable models, like the latest from Kimi, GLM, Qwen, and Minimax at full size, from providers that are not operating at a loss, and it won't be too expensive. You can pool resources to do the same locally. You can also do things that cloud providers are unlikely to market, like distillation and abliteration to serve your specific needs.

    I'm very optimistic about open weights models just the way they are right now.

    But I agree with you that OpenAI will likely play similar games to Anthropic and it could be soon.

Lights on = ads in your output. EOY at the latest; they can't keep kicking the massive costs down the road.

  • Where is your evidence of this "massive cost"? Inference is massively profitable for both Anthropic and OpenAI. Training is not.

    • The evidence is that quotas exist, as seen here, and are low enough that people are hitting them regularly. When was the last time you hit your quota of Google searches? When was the last time you hit your quota of StackOverflow questions? When was the last time you hit your quota of YouTube videos? Any service will rate limit abuse, but if abuse is indistinguishable from regular use from the provider's perspective, that's not a good sign.

    • You're assuming they can just stop training. For the entirety of these companies' existence, they have been training; it is baked into their cost structure. They must keep pushing out better and better models. That's like saying Nvidia can just stop making new GPUs because they're obviously making so much money with their current models.

    • The majority of accounts are free - are these profitable?

      IMO they need as many users as possible before their IPO - then the changes will really begin.

    • Inference for API or subscriptions? There is a massive price difference between the two.

  • Ads do not pay enough to cover AI usage. People see the big numbers Google and Facebook make in ads and forget to divide by the number of people they serve ads to, let alone the number of ads served to reach that per-user figure. You can't pay for 3 cents of inference with 0.07 cents of revenue.

    You also can't put ads in code-completion AIs, because the instant you do, their utility to me at work drops to negative. Guess how much money companies are going to pay for negative-value AIs? Let's just say it won't exactly pay for the AI bubble. The moment a code agent puts an ad for, well, anything into code that gets served out to a customer, someone's going to sue. The merits of the case won't matter, nor will the fact that the customer "should have caught it in review"; the lawsuit and the reputational hit (how many people here are reading this and salivating at the thought of posting an angrygram about AIs being nothing but ad machines?) would still cost the AI companies creating the agents far too much to risk.

    • Agreed, and how they start making a profit is pretty obvious; the answer is in this thread: cranking the price up immensely once the duopoly leaders establish agreements to do so in tandem, while buying up any competition that seeks to challenge them.

      I'm thinking 20x the current price is where they'll land. It'll be a massive line item for software dev shops.

IMO we are currently in the ENIAC era of LLMs. Perhaps there will be a brief moment where things get worse, but long term the cost of these things will go way down.

  • Cost will probably go down, but nobody knows when or how. It might take 10 years for all we know, as training costs have only been rising.

    A huge difference is that early computers were not subsidized. It took decades until most people could afford a computer at home.

  • Or we are in the early Netflix era where profit wasn’t as important as customer growth.

Ultimately we'll find more efficient techniques and hardware, AI companies will end up owning nuclear power stations, and they'll continue providing models capable of 10x what they do now.

Valuations have already reached the point where these companies can run their own nuclear power stations, fund the development of new hardware and techniques, and boost the capabilities of their models by 10x.

  • There's not enough nuclear to go around, and the approval/permitting process for new nuclear power plants is nothing to sneeze at, both in terms of time and cost.

    That's also ignoring that nuclear power plants also consume quite a bit of water, which may be a more difficult bottleneck in and of itself even without trying to add nuclear into the mix.

  • Too bad the models collapse because of the lack of new good training data.

    How many companies will generate a profit in the end, and what will happen with all those power stations and data centers?

Fellow annoyed Google AI Pro subscriber here!

Can confirm. I initially enjoyed the 5-hour limits on Gemini CLI and Antigravity so much that I paid for a full year, thinking it was a great decision.

In the following months, they significantly cut the 5-hour limits (not sure they even exist anymore), introduced an unrealistically bad weekly limit that I can fully consume in 1-2 hours, introduced the monthly AI credits system, and added ads to upgrade to Ultra everywhere.

At the very least, the Gemini mobile app / web app is still kind of useful for project planning and day-to-day use, I guess. They also bumped the storage from 2TB to 5TB, but I don't even use that.

  • It should be illegal to change the terms of the subscription mid-period. If you paid for the full year, you should get that plan for the whole year. I don't understand how it's ok for corporations to just change the terms mid-way, and we just have to accept it.

    • > It should be illegal to change the terms of the subscription mid-period

      Unfortunately, at least for those of us in the US, there isn't legally much that can be done. It's simply not possible to make a contract that would obligate a company to fulfill its promises on this type of sale.

  • It's the exact same thing they did with Google BigQuery, which initially was an absolutely amazing piece of technology before they smothered it with more and more limits and restrictions. It's like they're putting SREs first, customers second.

  • Don't bother upgrading to Ultra. It's also now easy to burn all your credits, whereas in January it was almost impossible.

> We may very well look back on the last couple years as the golden era of subsidized GenAI compute.

Looks like enshittification on steroids, honestly.

  • Getting $5000 worth of product essentially free and then being told to pay is not enshittification.

    • Another take: perhaps they shouldn't have priced it that way if they weren't capable of actually delivering.

    • The cost for AI companies might be $5000 but the "essentially free" could be close to the limit of what people are willing to spend. If that's the case then enshittification will continue and/or many AI companies will never be profitable.

    • We have seen this before: companies using VC money to take over the market and then raising prices. In the end, we're worse off because of these scumbags, but some will still insist that since we got free service, it's not enshittification.