Comment by miroljub

9 hours ago

Just to clarify to people focusing on the $180/month price tag.

OpenClaw is not a CC-only product. You can configure it to use any API endpoint.

Paying $180/month to Anthropic is a personal choice, not a requirement to use OpenClaw.

So that leads to a question: Is there a physical box I could buy and amortize over 5-7 years that would come in at half the API cost?

In other words, assuming no price increases, 7 years at that pricing is about $15k. Is there hardware I could buy for $7k or less that could replace those API calls or alternative subs entirely?

I've personally been trying to determine if I should buy a new GPU for my aging desktop(s), since their graphics cards can't really handle LLMs.
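For anyone checking the math on the "half the API cost" target, it works out like this (assuming the $180/month holds flat for the whole period, with no electricity or maintenance counted):

```python
# Break-even sketch: buying hardware vs. a flat $180/month subscription.
# Assumes no price changes over the whole period.
MONTHLY_SUB = 180                           # $/month
YEARS = 7

total_api_cost = MONTHLY_SUB * 12 * YEARS   # total paid to the API vendor
half_cost_budget = total_api_cost / 2       # the "half the API cost" target

print(total_api_cost)     # 15120  (~$15k over 7 years)
print(half_cost_budget)   # 7560.0 (~$7.5k hardware budget)
```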

  • You can't realistically replace a frontier coding model on any local hardware that costs less than a nice house, and even then it's not going to be quite as good.

    But if you don't need frontier coding abilities, there are several nice models that you can run on a video card with 24GB to 32GB of VRAM. (So a 5090 or a used 3090.) Try Gemma4 and Qwen3.5 with 4-bit quantization from Unsloth, and look at models in the 20B to 35B range. You can try before you buy if you drop $20 on OpenRouter. I have a setup like this that I built for $2500 last year, before things got expensive, and it's a nice little "home lab."

    If you want to go bigger than this, you're looking at an RTX 6000 card, or a Mac Studio with 128GB to 512GB of RAM. These are outside your budget. Or you could look at Mac Minis, a DGX Spark, or Strix Halo. These mostly let you run bigger models, much more slowly.
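    As a rough sanity check on why 20B-35B models fit in 24GB to 32GB of VRAM at 4-bit quantization (back-of-envelope only; actual usage varies with context length, KV cache, and runtime overhead, and the 20% overhead figure below is an assumption):

```python
# Rough VRAM estimate: a 4-bit quant stores ~0.5 bytes per parameter,
# plus some slack for KV cache and runtime buffers (assumed 20% here).
def approx_vram_gb(params_billions, bits=4, overhead_frac=0.2):
    weight_gb = params_billions * bits / 8   # GB for the weights alone
    return weight_gb * (1 + overhead_frac)

for size_b in (20, 35):
    print(size_b, round(approx_vram_gb(size_b), 1))
# 20B -> ~12 GB, 35B -> ~21 GB: both under a 24GB card's ceiling
```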

    • > or a Mac Studio with 128GB to 512GB of RAM. These are outside your budget.

      An M3 Ultra with 80 GPU cores and 256GB of RAM is $7,500 - that's right at the edge of the budget, but it fits. If you can get an edu discount through a kid or a friend, you're even better off!

    • Thanks. That is what I suspected. The 3090s in my area seem pretty expensive for a several-year-old second-hand card - they're the same price as a new 5080.

      A 5090 is pretty expensive (~$4,000) to justify over a $10-50 sub. I guess the nice thing is that the API side becomes "included" if I ever want to go that route. But comparing a $40 GHCP sub to a $4,000 GC, on hardware alone the payoff is at 8+ years. If I add in electricity, the payoff is probably never.

      Sure, the sub can go up in price, but the value proposition for self-hosting doesn't seem to make sense - especially if I can't at least match Sonnet on GHCP or something like that.

      I hope to self-host some not-useless LLMs/agents at some point, but I think this market needs to stabilize first. I just don't like waiting.
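      The payoff math above can be sketched out, electricity included. The power-draw, usage, and electricity-price figures below are illustrative assumptions, not measurements:

```python
# Payoff horizon for a GPU purchase vs. a monthly sub.
# Electricity inputs are assumptions for illustration only.
GPU_COST = 4000       # $, the ~5090 price discussed above
SUB = 40              # $/month, the GHCP sub discussed above
WATTS = 400           # assumed average draw while inferencing
HOURS_PER_DAY = 4     # assumed daily usage
KWH_PRICE = 0.15      # $/kWh, assumed

monthly_power = WATTS / 1000 * HOURS_PER_DAY * 30 * KWH_PRICE  # ~$7.20/mo
monthly_saving = SUB - monthly_power                           # ~$32.80/mo

print(GPU_COST / SUB / 12)             # ~8.3 years on hardware alone
print(GPU_COST / monthly_saving / 12)  # ~10.2 years once power is counted
```

      And that assumes modest power draw; heavier usage pushes the break-even out further, which is why "probably never" isn't far off.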

  • You can buy a roughly $40k GPU (the H100), which will cost about $100/month in electricity on top of that, to get about 30-80% of the performance of OpenAI or Anthropic frontier models, depending on what you're doing.

    Over 5 years, that works out to ~$46k vs ~$10k. During that time, it's possible that better open models will become available, making the GPU a better deal, but it's far more likely that the VC-fueled companies advance more quickly (since that's been the trend so far).

    In other words, the local economics do not work out well at a personal scale at all unless you're _really_ maxing out the GPU at close to 50% literally 24/7, and you're okay accepting worse results.

    As long as proprietary models keep advancing this quickly, I think it makes no sense to try to run them locally. You could buy an H100, and then a new state-of-the-art model could come out that's too large to run on it - the resale value plummets, and the card is useless compared to calling the new model via APIs or buying a new $90k GPU with twice the memory, or whatever.
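    Spelling out the 5-year comparison above:

```python
# 5-year cost of an H100 vs. the ~$10k API figure quoted above.
H100_COST = 40_000         # $, purchase price
POWER_PER_MONTH = 100      # $/month electricity, as stated above
MONTHS = 5 * 12

local_total = H100_COST + POWER_PER_MONTH * MONTHS  # $46,000
api_total = 10_000                                  # the ~$10k API estimate

print(local_total, api_total, local_total / api_total)
# 46000 10000 4.6 -> roughly 4-5x the API cost, for worse results
```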

  • For something like OpenClaw you realistically only need rather slow inference, so use SSD offload as described by adrian_b here: https://news.ycombinator.com/item?id=47832249 Though I'm not sure that the support in the main inference frameworks (and even in the GGUF format itself, at least arguably) is up to the task just yet.

  • You can get quite good models running on a Mac Studio, but these will not rival a frontier model.

    $3,699.00

    M4 Max 16c/40c, 128GB of RAM, 1TB SSD.

    LM Studio is free and can act as an LLM server or as a chat interface, and it provides GUI management of your models. It's a nice, easy, and cheap setup.
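    LM Studio's local server speaks an OpenAI-compatible chat completions API, by default on http://localhost:1234/v1 (the port is configurable, and the model name below is a placeholder - LM Studio serves whatever model you've loaded). A minimal stdlib-only sketch:

```python
import json
import urllib.request

# Build an OpenAI-style chat request for a local LM Studio server.
# "local-model" is a placeholder name, not a real model identifier.
def build_request(prompt, model="local-model"):
    return {
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
        "temperature": 0.2,
    }

def ask(prompt, url="http://localhost:1234/v1/chat/completions"):
    body = json.dumps(build_request(prompt)).encode()
    req = urllib.request.Request(
        url, data=body, headers={"Content-Type": "application/json"}
    )
    with urllib.request.urlopen(req) as resp:
        return json.load(resp)["choices"][0]["message"]["content"]

# Only call this with the LM Studio server actually running:
# print(ask("Say hi in five words."))
```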

  • You can use models several times cheaper than Claude as well; it's not like you need anything big to handle all the use cases listed above.

    • Yeah, something like MiniMax m2.7 should be perfectly capable for this sort of thing, and is 10-20x cheaper

  • For something the size of Claude, probably not. But for smaller models, maybe (though they also are much cheaper to buy tokens for)