wongarsu, 23 days ago:
Which conveniently fits on one 8xH100 machine, with 100-200 GB left over for overhead, KV cache, etc.
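
To make the memory claim above concrete, here is a rough sketch of the arithmetic. It assumes 80 GB of HBM per H100, which is the common configuration but is not stated in the thread; the 100-200 GB of headroom is the figure from the comment. The function name is made up for illustration.

```python
# Assumption (not from the thread): 80 GB of HBM per H100.
GPU_COUNT = 8
HBM_PER_GPU_GB = 80


def weight_budget_gb(headroom_gb: float) -> float:
    """Memory left for model weights after reserving headroom
    for KV cache, activations, and other overhead."""
    return GPU_COUNT * HBM_PER_GPU_GB - headroom_gb


total_gb = GPU_COUNT * HBM_PER_GPU_GB

# The comment cites 100-200 GB left over, so the weights would
# occupy whatever remains of the total.
low = weight_budget_gb(200)
high = weight_budget_gb(100)

print(f"total HBM: {total_gb} GB")
print(f"implied weight footprint: {low:.0f}-{high:.0f} GB")
```

Under that assumption the machine has 640 GB in total, so the headroom quoted in the comment implies weights somewhere around 440-540 GB.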

storystarling, 23 days ago:
The unit economics seem pretty rough, though. You're locking up 8xH100s for the compute of ~32B active parameters. I guess memory is the bottleneck, but it's hard to see how the margins work on that.

kristianp, 22 days ago:
Yes, it only makes sense economically if you batch over many users.
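
The batching point can be sketched as a back-of-the-envelope cost model. Every number here is a hypothetical placeholder, not a figure from the thread: the hourly machine price, the per-user decode speed, and the batch sizes are all invented for illustration. The model also assumes per-user speed stays constant as the batch grows, which is only roughly true while decoding is memory-bound and will break down at large batch sizes.

```python
def cost_per_million_tokens(
    machine_usd_per_hour: float,
    tokens_per_sec_per_user: float,
    concurrent_users: int,
) -> float:
    """Serving cost per one million generated tokens.

    Simplifying assumption: each user keeps the same decode speed
    no matter how many requests share the batch.
    """
    tokens_per_hour = tokens_per_sec_per_user * concurrent_users * 3600
    return machine_usd_per_hour / tokens_per_hour * 1_000_000


# Hypothetical placeholders, chosen only to show the shape of the curve.
MACHINE_USD_PER_HOUR = 20.0
TOKENS_PER_SEC_PER_USER = 30.0

for users in (1, 8, 64):
    cost = cost_per_million_tokens(
        MACHINE_USD_PER_HOUR, TOKENS_PER_SEC_PER_USER, users
    )
    print(f"{users:>3} concurrent users: ${cost:,.2f} per 1M tokens")
```

In this simplified model the cost per token falls in direct proportion to the number of concurrent users, which is the sense in which the machine only pays for itself when it is shared across many requests.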