
Comment by SilverSlash

3 days ago

The "heavy" model is $300/month. These prices seem to keep increasing while we were promised they'll keep decreasing. It feels like a lot of these companies do not have enough GPUs which is a problem Google likely does not have.

I can already use Gemini 2.5 Pro for free in AI studio. Crazier still, I can even set the thinking budget to a whopping 32k and still not pay a dime. Maybe Gemini 3.0 will be available for free as well.

Who promised that there would be no advanced models with high costs?

Prices for the same number of tokens at a given level of capability are falling. But Moore’s law most certainly did NOT say that chips would get no more complex than the 1103 1Kb DRAM; it said that a given amount of circuitry would shrink from 10mm^2 to a speck far too small to see. The same dynamic applies here.

> These prices seem to keep increasing while we were promised they'll keep decreasing.

A Ferrari is more expensive than the Model T.

The most expensive computer is a lot more expensive than the first PC.

The price that usually falls is:

* The entry level.
* The same performance over time.

But the _price range_ gets wider. That's fine. That's a sign of maturity.

The only difference this time is that the entry level was artificially 0 (or very low) because of VC funding.

  • But where is the value?

    If it could write like George Will or Thomas Sowell or Fred Hayek or even William Loeb that would be one thing. But it hears dog whistles and barks which makes it a dog. Except a real dog is soft and has a warm breath, knows your scent, is genuinely happy when you come home and will take a chomp out of the leg of anyone who invades your home at night.

    We are also getting this kind of discussion

    https://news.ycombinator.com/item?id=44502981

    where Grok exhibited the kind of behavior that puts "degenerate" in "degenerate behavior". Why do people expect anything more? Ten years ago you could be a conservative with a conscience -- now if you are you start The Bulwark.

    • > If it could write like George Will or Thomas Sowell or Fred Hayek or even William Loeb

      Having only barely heard of these authors even in the collective, I bet most models could do a better job of mimicking their style than I could. Perhaps not well enough to be of interest to you, and I will absolutely agree that LLMs are "low intelligence" in the sense that they need far more examples than any organic life does, but many of them will have had those examples and I definitely have not.

      > We are also getting this kind of discussion

      > https://news.ycombinator.com/item?id=44502981

      Even just a few years ago, people were acting as if a "smart" AI automatically meant a "moral AI".

      Unfortunately, these things can be both capable* and unpleasant.

      * which doesn't require them to be "properly intelligent"

      20 replies →

  • > The most expensive computer is a lot more expensive than the first PC.

    Not if you're only looking at modern PCs (and adjusting for inflation). It seems unfair to compare a computer built for a data center with tens of thousands in GPUs to a PC from back then as opposed to a mainframe.

Good point; the proper comparison might be between something like ENIAC, which reportedly cost $487K to build in 1946, being about $7M now, and a typical Google data center, reportedly costing about $500M.
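The inflation adjustment in that comparison is just a multiplier; a toy sketch of the arithmetic (the ~14.4x CPI factor from 1946 to today is an approximation, not an official figure):

```python
# Rough inflation adjustment: 1946 dollars -> today's dollars.
# The ~14.4x CPI multiplier is approximate, chosen to match the
# $487K -> ~$7M figure quoted above.
CPI_FACTOR_1946 = 14.4

def adjust_1946(dollars: float) -> float:
    """Convert a 1946 dollar amount to (approximate) present-day dollars."""
    return dollars * CPI_FACTOR_1946

eniac_then = 487_000                  # reported ENIAC build cost, 1946
eniac_now = adjust_1946(eniac_then)   # ~7.0 million
print(f"${eniac_now / 1e6:.1f}M")     # -> $7.0M
```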

      1 reply →

  • > The most expensive computer is a lot more expensive than the first PC.

    Depends on your definition of "computer". If you mean the most expensive modern PC I think you're way off. From https://en.wikipedia.org/wiki/Xerox_Alto: "The Xerox Alto [...] is considered one of the first workstations or personal computers", "Introductory price US$32,000 (equivalent to $139,000 in 2024)".

  • The base model Apple II cost ~$1300USD when it was released; that's ~$7000USD today inflation adjusted.

    In other words, Apple sells one base-model computer today that is more expensive than the Apple II; the Mac Pro. They sell a dozen other computers that are significantly cheaper.

We're trying to compare to the 80's, when tech was getting cheaper, instead of 2010, when tech was nearly given away and then squeezed out of us.

We're already at Mac Mini prices. It's a matter of whether the eventual baseline will be a MacBook Air or a fully kitted-out Mac Pro. There will be "cheap" options, but they won't be from this metaphorical Apple.

That was the most predictable outcome. It's like we learned nothing from Netflix, nor from the general enshittification of tech by the end of the 2010s. We'll have the billionaire AI tech capture markets and charge enterprise prices to pay back investors. Then maybe we'll have a few free/cheap models fighting over the scraps.

    Those small creators hoping to leverage AI to bring their visions to life for less than their grocery bill will have a rude awakening. That's why I never liked the argument of "but it saves me money on hiring real people".

I heard some small Chinese shops for mobile games were already having this problem in recent years and had to re-hire their human labor when costs started rising.

It's important to note that pricing for Gemini has been increasing too.

https://news.ycombinator.com/item?id=44457371

I'm honestly impressed that the Sutro team could write a whole post complaining about Flash and not once mention that Flash was actually 2 different models, and even go further to compare the price of Flash non-thinking to Flash Thinking. The team is either scarily incompetent or purposely misleading.

    Google replaced flash non-thinking with Flash-lite. It rebalanced the cost of flash thinking.

  • Also important to note that Gemini has gotten a lot slower, just over the past few weeks.

    • I find Gemini basically unusable for coding for that reason.

      Claude never fails me

It’s the inference-time scaling - this is going to create a whole new level of haves vs. have-nots split.

The vast majority of the world can’t afford 100s of dollars a month.

Why is the number of GPUs the problem rather than the cost of GPU usage? I don't think buying GPUs is the problem; running tons of GPUs is what gets very expensive. I presume that's the reason it's so expensive, especially with LLMs.

also their api pricing is a little misleading - it matches Sonnet 4 pricing ($3/$15) only "for requests under 128k" (whatever that means), but above that it's 2x more.

That 128k is a reference to the context window — how many tokens you put in at the start. Presumably Grok 4 with a 128k context window is running on less hardware (it needs much less RAM than 256k) and they route requests accordingly internally.
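As a sketch of how that tiered pricing works out in practice (assuming the $3/$15 per-million-token input/output rates and the 2x multiplier above 128k context quoted in this thread; these figures come from the comments, not an official price sheet):

```python
# Hypothetical tiered API pricing, per the figures quoted in this thread:
# $3 / $15 per million input/output tokens when the prompt fits in 128k
# context, with both rates doubled above that threshold.

def request_cost(input_tokens: int, output_tokens: int) -> float:
    """Estimate the cost in dollars of a single request."""
    base_in, base_out = 3.00, 15.00   # $ per million tokens
    multiplier = 2 if input_tokens > 128_000 else 1
    return multiplier * (input_tokens * base_in + output_tokens * base_out) / 1_000_000

# A 100k-token prompt with a 2k-token reply stays in the cheap tier...
print(request_cost(100_000, 2_000))   # -> 0.33
# ...while a 200k-token prompt pays double rates on everything.
print(request_cost(200_000, 2_000))   # -> 1.26
```

Note the cliff at the threshold: crossing 128k doubles the rate on the entire request, not just the tokens past the boundary, which is why the jump feels so steep.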

> These prices seem to keep increasing while we were promised they'll keep decreasing.

I don't remember anyone promising that, but whoever promised you that, over some period of time including our current present, frontier public model pricing would be monotonically decreasing was either lying or badly misguided. While there will be short-term deviations, the overall arc for that price will continue to be upward.

OTOH, the models available at any given price point will also radically improve, to the point where you can follow a curve of both increasing quality and decreasing price, so long as you don't want a model at the quality frontier.

> These prices seem to keep increasing while we were promised they'll keep decreasing.

Aren't they all still losing money, regardless?

O3 was just reduced in price by 80%. Grok 4 is a pretty good deal for having just been released and being so much better. The token price is the same as Grok 3 for the non-heavy model. Google is losing money to try and gain relevance. I guess I’m not sure what your point is?

It's because a lot of the advancements are in post-training; the models themselves have stagnated. Look at the heavy "model"...

You have to have a high RRP to negotiate any volume deals down from.

Like the other AI companies, they will want to sign up companies.

> These prices seem to keep increasing

Well, valuations keep increasing, they have to make the calculations work somehow.

> Gemini 2.5 Pro for free ...

It is Google. So, I'd pay attention to data collection feeding back in to training or evaluation.

https://news.ycombinator.com/item?id=44379036

  • While Google is so explicit about that, I have a good reason to believe that this actually happens in most if not all massive LLM services. I think Google's free offerings are more about vendor lock-in, a common Google tactic.

What makes you say Google is explicit about the fact that they have humans and AIs reading everything? It’s got a confusing multi-layer hierarchy of different privacy policies which hides what’s happening to folks’ conversations behind vague language. They promote it as being free but don’t even link to the privacy policies when they launch stuff, effectively trying to bait noobs into pasting in confidential information.

      1 reply →