Comment by narrator

2 days ago

I think what most people don't realize is that running an agent 24/7 fully automated is burning a huge hole in their profitability. Who even knows how big it is. It could be getting into the 8 or 9 figures a day for all we know.

There's this pervasive idea left over from the pre-LLM days that compute is free. If you want to rent your own H200x8 to run your Claude model, that's literally going to cost $24/hour. People are just not thinking like that. They think: I have my home PC, it does this stuff, and I can run it 24/7 for free.

16 comments

starkgoose 2 days ago

There are usage limits preventing you from running it 24/7 on all subscription tiers.

butlike 2 days ago

This bolsters OP's point, to an extent.

Xunjin 2 days ago

I understand you mean free in the sense that you don't pay a third party to use it; however, let's not forget that you still use the power grid, and that's not free. It's also worth noting that energy prices have increased worldwide.

dspillett 2 days ago

Depending on utilisation and good use of low-power or sleep (or full off) states when things aren't actively processing, it can still be a _lot_ cheaper to run things at home than on a rented service. Power costs have increased a lot in recent years, but so have compute-per-watt ratios, and you are not paying that indirect compute price when the processors are asleep or off, whereas with subscription access to LLMs you are paying at least the base subscription each month even if you don't use it at all in that period. It's much the same as the choice between self-hosting an open-source project or paying for a hosted instance, and in both cases people don't tend to consider the admin cost (for some of us the admin is “play time”!), so the self-hosted option does practically feel free.
regenschutz 2 days ago
Maintaining hardware also isn't free. Time is money.
- darkwater 2 days ago
  
  Time is money if you have another good use of that time. If you like spending that time doing something, then it's literally free.

mickeyp 2 days ago

Sure it's $24/hour, but it'll crank through tens of thousands of tokens per second; those beefy GPUs are meant for large amounts of parallel work. You'll never _get_ that many tokens for a single request. That's why the mathematics work when you get dozens or hundreds of people using it.

No. The sauce is in KV caching: when to evict, when to keep, how to pre-empt an active agent loop vs someone who is showing signs of inactivity at their PC, etc.
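
To put rough numbers on the $24/hour figure, here is a minimal back-of-envelope sketch. Only the hourly rate comes from this thread; the throughput values and the function names are assumptions picked for illustration, not measurements of any real deployment.

```python
# Back-of-envelope cost of a rented GPU node at a sustained token throughput.
# HOURLY_RATE_USD is the figure quoted in the thread; every throughput value
# below is an illustrative assumption, not a benchmark.

HOURLY_RATE_USD = 24.0


def daily_cost(hourly_rate_usd: float) -> float:
    """Cost of keeping the node rented around the clock for one day."""
    return hourly_rate_usd * 24


def cost_per_million_tokens(hourly_rate_usd: float, tokens_per_second: float) -> float:
    """Dollar cost per one million tokens at a sustained throughput."""
    tokens_per_hour = tokens_per_second * 3600
    return hourly_rate_usd / tokens_per_hour * 1_000_000


if __name__ == "__main__":
    print(f"24/7 rental: ${daily_cost(HOURLY_RATE_USD):.2f} per day")
    # One lone request stream vs. a heavily batched, shared node (assumed values).
    for tps in (50, 1_000, 20_000):
        usd = cost_per_million_tokens(HOURLY_RATE_USD, tps)
        print(f"{tps:>6} tokens/s -> ${usd:.2f} per million tokens")
```

Under these assumptions the same node is hundreds of times cheaper per token when it is kept busy by many users than when it serves a single slow stream, which is the batching argument made above.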

admx8 2 days ago

Coders doing the coding should use a subscription, and now they ban the choice of your preferred IDE for agentic coding. The API is for automation, not coding. I'm going to cancel their subscription today; I already use Codex with OpenCode.

notpushkin 2 days ago

This is honestly the key difference here. I’m morally okay with using Claude Max Whatever with something like OpenCode because it’s literally the same thing from the usage pattern perspective. Plugging Nanoclaw into it is a whole other thing.

ethbr1 2 days ago
It probably doesn't help that the creator of OpenClaw just got hired by Anthropic's competitor.
This sounds like engineering, finance, and legal got together and decided they were in an untenable position if OpenAI started nudging OpenClaw to burn even more tokens on Anthropic (or just never optimize) + continually updated workarounds to using subscription auth. But I'm sure OpenAI would never do something like that...
At the end of the day, it's the same 'fixed price plan for variable use on a constrained resource' cellular problem: profitability becomes directly linked to actual average usage.
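
The fixed-price point can be sketched as simple arithmetic. Every number below (plan price, serving cost per million tokens, usage levels) is a hypothetical placeholder, not a figure from any provider, and the function names are made up for this sketch.

```python
# Flat-rate plan economics: margin per subscriber depends on average usage.
# All inputs are hypothetical placeholders for illustration only.


def monthly_margin(plan_price_usd: float, avg_tokens_per_user: float,
                   cost_per_million_usd: float) -> float:
    """Plan price minus the serving cost of one subscriber's monthly tokens."""
    serving_cost = avg_tokens_per_user / 1_000_000 * cost_per_million_usd
    return plan_price_usd - serving_cost


def break_even_tokens(plan_price_usd: float, cost_per_million_usd: float) -> float:
    """Monthly tokens per subscriber at which the margin hits zero."""
    return plan_price_usd / cost_per_million_usd * 1_000_000


if __name__ == "__main__":
    price, cost = 100.0, 2.0  # assumed: $100/month plan, $2 per million tokens
    print(f"break-even: {break_even_tokens(price, cost):,.0f} tokens/month")
    # Assumed usage levels: interactive use vs. an agent looping around the clock.
    for tokens in (10_000_000, 200_000_000):
        margin = monthly_margin(price, tokens, cost)
        print(f"{tokens:>12,} tokens -> margin ${margin:.2f}")
```

With these placeholder numbers, an interactive user stays well under the break-even point while an always-on agent blows past it, so the plan only works if average usage stays low.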
- skeledrew 2 days ago
  
  > OpenAI started nudging OpenClaw to burn even more tokens on Anthropic
  Not possible: OpenClaw is run by a foundation, and is open source, which means OpenAI has no leverage to do such a thing.
  