Comment by Eridrus
3 hours ago
I disagree that on-prem is ideal for GPUs for most people.
If you're doing regular inference for a product with very flat throughput requirements (and you're doing on-prem already), on-prem GPUs can make a lot of sense.
But if you're doing a lot of training, you have very bursty requirements. And the H100s are specifically for training.
If your H100 fleet is less than ~38% utilized over time, you're losing money compared to renting.
If you have batch workloads you can run on the H100s when you're not training, you're probably closer to being able to justify on-prem.
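For context, a rough break-even sketch (the purchase price, amortization period, hosting cost, and on-demand rate below are illustrative assumptions on my part, not quoted prices):

```python
# Rough break-even utilization for owning vs. renting an H100.
# All dollar figures are illustrative assumptions, not real quotes.

def breakeven_utilization(purchase_price: float,
                          amortization_years: float,
                          hosting_per_hour: float,
                          cloud_per_hour: float) -> float:
    """Utilization above which owning the GPU beats renting it on demand."""
    hours = amortization_years * 365 * 24
    owned_per_hour = purchase_price / hours + hosting_per_hour
    return owned_per_hour / cloud_per_hour

# Assumed: ~$30k per H100, 4-year amortization,
# ~$0.50/hr power + hosting, ~$3.50/hr on-demand rental.
u = breakeven_utilization(30_000, 4, 0.50, 3.50)
print(f"Break-even utilization: {u:.0%}")  # ~39% with these assumptions
```

With assumptions in that ballpark you land in the high-30s to low-40s percent range, which is where the ~38% figure comes from; plug in your own prices and the threshold moves accordingly.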
But the other thing to keep in mind is that AWS is not the only provider. It is a particularly expensive one, and you can buy capacity from other neoclouds if you're cost-sensitive.