Comment by p1esk

12 hours ago

Do you think a lot of people have “systems” to run a 1.6T model?

14 comments

p1esk

To me, the important thing isn't that I can run it, it's that I can pay someone else to run it. I'm finding that Opus 4.7 seems weirdly broken compared to 4.6: it just doesn't understand my code, and it breaks it whenever I ask it to do anything.

Now, at the moment, I can still use 4.6, but eventually Anthropic are going to remove it, and when it's gone it will be gone forever. I'm planning on trying DeepSeek v4 because, even if it's not quite as good, I know that it will be available forever: I'll always be able to find someone to run it.

muyuu 4 hours ago

Yep, it's wild how little emphasis there is on control and replicability in these posts.
These models are already useful for a myriad of use cases. It's really not that important if a model can 1-shot a particular problem or draw a cuter pelican on a bike. Past a certain degree of quality, process and reliability are so much more important for anything other than complete hands-off usage, which in business is not something you're really going to do.
The fact that my tool may be gone tomorrow, and this has actually happened before, with no guarantee of a proper substitute... that's a lot more of a concern than an extra point in some benchmark.

applfanboysbgon 11 hours ago

No, but businesses do. Being able to run quality LLMs without your business, or business's private information, being held at the mercy of another corp has a lot of value.

forrestthewoods 11 hours ago
What type of system is needed to self host this? How much would it cost?
- disiplus 11 hours ago
  
  Depends on how many users you have and what is "production grade" for you, but like 500k gets you an 8x B200 machine.
- p1esk 11 hours ago
  
  Depends on how fast you want it to be. I’m guessing a couple of $10k Mac Studio boxes could run it, but probably not fast enough to enjoy using it.
- fragmede 10 hours ago
  
  One GB200 NVL72 from Nvidia would do it. $2-3 million, or so. If you're a corporation, say Walmart or PayPal, that's not out of the question.
  If you want to go budget corporate, 7 x H200 is just barely going to run it, but all in, $300k ought to do it.
- CamperBob2 9 hours ago
  
  $20K worth of RTX 6000 Blackwell cards should let you run the Flash version of the model.
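
A rough way to sanity-check the hardware guesses above is to work out how much memory the weights alone would need. The sketch below takes the 1.6T parameter count from the top of the thread; everything else is an illustrative assumption rather than a number anyone here gave: the bits per parameter, the per-GPU memory (141 GB for an H200, 180 GB assumed for a B200), and the 10% overhead factor. It ignores KV cache and activations, so treat the result as a lower bound.

```python
def weight_memory_gb(params: float, bits_per_param: float) -> float:
    """Memory needed for the weights alone, in decimal GB."""
    return params * bits_per_param / 8 / 1e9


def fits(params: float, bits_per_param: float, gpu_count: int,
         gb_per_gpu: float, overhead_fraction: float = 0.1) -> bool:
    """True if the weights plus a flat overhead margin fit in aggregate GPU memory.

    overhead_fraction is a hypothetical fudge factor, not a measured value.
    """
    needed = weight_memory_gb(params, bits_per_param) * (1 + overhead_fraction)
    available = gpu_count * gb_per_gpu
    return needed <= available


if __name__ == "__main__":
    PARAMS = 1.6e12  # parameter count mentioned in the thread

    # Assumed per-GPU memory in GB; check the vendor spec for your exact SKU.
    machines = {"7 x H200": (7, 141.0), "8 x B200": (8, 180.0)}

    for bits in (4, 8, 16):
        need = weight_memory_gb(PARAMS, bits)
        print(f"{bits}-bit weights: about {need:.0f} GB")
        for name, (count, gb) in machines.items():
            verdict = "fits" if fits(PARAMS, bits, count, gb) else "does not fit"
            print(f"  {name} ({count * gb:.0f} GB total): {verdict}")
```

Under these assumptions, the answer depends almost entirely on the precision the weights are served at, which nobody in the thread states, so the dollar figures above are best read alongside that unknown.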
choldstare 11 hours ago
Not really - on-prem LLM hosting is extremely labor- and capital-intensive.
- applfanboysbgon 11 hours ago
  
  But can be, and is, done. I work for a bootstrapped startup that hosts a DeepSeek v3 retrain on our own GPUs. We are highly profitable. We're certainly not the only ones in the space, as I'm personally aware of several other startups hosting their own GLM or DeepSeek models.