Comment by vessenes

15 hours ago

We are nearly infinitely far away from saturating compute demand for inference.

Case in point: I'd like something that assesses all the sensors and API endpoints of the stuff in my home in real time and, as needed, bubbles up summaries, diaries, and emergency alerts. Right now that probably takes a single H200, which is well out of my "value range". The number of people in the world who do this now at scale is almost certainly fewer than 50k.

If that inference cost went to 1%, then a) I'd be willing to pay it, and b) there'd be enough of a market that a company could make money integrating a bunch of tech into a simple deployable stack, and therefore c) a lot more people would want it, likely enough to drive more than 50k H200s worth of inference demand.

Do you really need an H200 for this? Seems like something a consumer GPU could do. Smaller models might be ideal [0], as they don't require extensive world knowledge and are much cheaper and faster to run.

Why can't you build this today?

[0]: https://arxiv.org/pdf/2506.02153 Small Language Models are the Future of Agentic AI (Nvidia)

Is all of that not achievable today with things like Google Home?

It doesn’t sound like you need to run an H200 to bridge the gap between what currently exists and the outcome you want.

Sure, but if that inference cost went to 1%, then Oracle's and Nvidia's business models would be bust. So you agree with me?

absolutely nobody wants or needs a fucking thermostat diary lmao, and the few ppl that do will have zero noticeable impact on the world's compute demands, i'm begging ppl on hn to touch grass or speak to an average person every now and then lol

  • It's pretty easy to dispute and dismiss a single use case for indiscriminate/excessive use of inference to achieve some goal, as you have done here, but it's hard to dispute every possible use case.

  • You wouldn't even know that it existed, or how it worked. It would just work. Everybody wants hands off control that they don't have to think or learn about.

    edit: this reminds me of a state agency I once worked for that fired its only IT guy after moving offices, because the servers were running just fine without him. It was a Kafkaesque trauma for him for a moment, but it turned into a massive raise a week later when they were negotiating for him to come back.