Comment by arjie
9 hours ago
Not “local” and not interactive coding but sharing since it might be helpful. I have 2x RTX Pro 6000 Blackwell running DeepSeek V4 Flash. I get 160 tok/s raw but it’s a reasoning model. For my use case, I have it auto-write code and another system auto-review the code.
I occasionally use it with pi to write some code and it’s blazing fast but it’s mostly habit that keeps me with CC and Codex.
> I have 2x RTX Pro 6000 Blackwell
Where did you find/order these? All the sites I can find are either out of stock, only sell to businesses, or are otherwise sketchy...
Microcenter is the easiest place, but almost any vendor will sell to you after you email them, especially if you have an LLC.
I run a small business (https://technologybrother.com) that runs a few small SaaS so I ordered the GPUs through corporate sales. If the barrier is getting an LLC, those are relatively cheap. The nice thing is that if you've got a legitimate business with use for GPUs you can get into the Nvidia Inception Program which has a pretty solid discount.
Have you measured your electricity consumption for this rig? I have to wonder how much it would cost you per month.
Not nearly as much as you might think. 1.2 kW where I live translates to about $0.12/hr (roughly $0.10/kWh), and that's when running full clip. If you have a decent solar hookup, it's a small fraction of that on a sunny day.
The expensive part is the upfront hardware cost and the electrical system upgrade you'll need to give your house.
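For anyone who wants to plug in their own numbers, here is a rough sketch of the arithmetic behind that estimate. The $0.10/kWh rate is only what the 1.2 kW and $0.12/hr figures imply, and the 30-day month at constant full load is an assumption, so treat the monthly figure as an upper bound rather than a measurement.

```python
def electricity_cost(power_kw: float, rate_per_kwh: float, hours: float) -> float:
    """Dollar cost of drawing power_kw continuously for the given number of hours."""
    return power_kw * rate_per_kwh * hours

POWER_KW = 1.2              # full-load draw quoted in the comment above
RATE_PER_KWH = 0.10         # assumption: implied by 1.2 kW costing about $0.12/hr
HOURS_PER_MONTH = 24 * 30   # assumption: 30-day month, running flat out 24/7

hourly = electricity_cost(POWER_KW, RATE_PER_KWH, 1)
monthly = electricity_cost(POWER_KW, RATE_PER_KWH, HOURS_PER_MONTH)
print(f"${hourly:.2f}/hr, ${monthly:.2f}/month at full load")
```

Swap in your own rate and average draw; a rig that idles most of the day will come in well under the full-load number.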
I'm paying about $0.19/hr and using half that power just for a large spinning RAID, running some VMs and security cameras. And I'm reconsidering my digital extravagance because of the electric bill. You probably make way more money than I do.
Here's a DeepSeek-V4-Flash benchmark on 2X RTX Pro 6000:
I've asked it to calculate the following, considering a realistic blend of cached prompts and decode for an agentic dev scenario.
Electricity-only (@ USD $0.08/kWh)
Total cost of ownership over 3 years is electricity + USD $20K (pre-hike pricing). In a production scenario, how much would I have to charge my users to break even, aiming for 4 concurrent requests 24/7?
A) Breakeven API pricing (est. 2B IN + 1B OUT throughput/month):
B) Breakeven subscription (users active ~1.5h/day):
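The breakeven logic can be sketched roughly as follows. This is not the output of the benchmark above: the 1.2 kW draw is borrowed from the earlier comment, the single blended per-token price ignores any input/output or cache-hit price split, and the subscriber count assumes usage spreads perfectly across the day. All three are assumptions for illustration.

```python
HOURS_PER_YEAR = 24 * 365

# Figures stated in the comment above
HARDWARE_USD = 20_000        # upfront cost (pre-hike pricing)
RATE_PER_KWH = 0.08          # electricity price in USD
YEARS = 3                    # ownership period
TOKENS_PER_MONTH = 2e9 + 1e9 # estimated 2B input + 1B output tokens
CONCURRENT_SLOTS = 4         # target concurrency, 24/7
ACTIVE_HOURS_PER_USER = 1.5  # per user per day

# Assumption: full-load draw borrowed from the earlier 1.2 kW comment
POWER_KW = 1.2

# Total cost of ownership = hardware + electricity over the whole period
electricity_usd = POWER_KW * HOURS_PER_YEAR * YEARS * RATE_PER_KWH
tco_usd = HARDWARE_USD + electricity_usd
monthly_cost = tco_usd / (YEARS * 12)

# A) Blended breakeven price per million tokens (input and output priced alike)
price_per_million = monthly_cost / (TOKENS_PER_MONTH / 1e6)

# B) Idealized subscriber count: every slot-hour filled, no peak-hour clustering
max_users = CONCURRENT_SLOTS * 24 / ACTIVE_HOURS_PER_USER
price_per_user = monthly_cost / max_users

print(f"TCO over {YEARS} years: ${tco_usd:,.2f} (${monthly_cost:,.2f}/month)")
print(f"A) breakeven: ${price_per_million:.4f} per 1M tokens (blended)")
print(f"B) breakeven: ${price_per_user:.2f}/user/month at {max_users:.0f} users")
```

Real usage clusters around working hours, so the supportable user count is lower than the idealized figure and the per-user breakeven is correspondingly higher.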
Vouched your comment. Very cool. What are you running on to get 190 tok/s? I get 400 tok/s at c=4 but c=1 is slower than you.