wongarsu, 1 month ago:
Which conveniently fits on one 8xH100 machine. With 100-200 GB left over for overhead, kv-cache, etc.

storystarling, 1 month ago:
The unit economics seem pretty rough though. You're locking up 8xH100s for the compute of ~32B active parameters. I guess memory is the bottleneck but hard to see how the margins work on that.

kristianp, 1 month ago:
Yes, it only makes sense economically if you have batching over many users.
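
The arithmetic behind the thread can be sketched roughly as follows. This is a back-of-the-envelope illustration, not a measurement: it assumes the 80 GB H100 variant, and the node price, per-user decode speed, and the idea that per-user speed stays flat as the batch grows are all hypothetical simplifications that nobody in the thread stated.

```python
# Rough sketch of the memory budget and batching economics discussed above.
# Assumption: each H100 has 80 GB of HBM (the 80 GB variant).
H100_HBM_GB = 80
NUM_GPUS = 8


def memory_budget(num_gpus=NUM_GPUS, hbm_gb=H100_HBM_GB, leftover_gb=(100, 200)):
    """Return total HBM and the implied (min, max) footprint of the weights.

    leftover_gb is the 100-200 GB of headroom mentioned in the thread.
    """
    total = num_gpus * hbm_gb
    low, high = leftover_gb
    return total, (total - high, total - low)


def cost_per_million_tokens(node_cost_per_hour, tokens_per_sec_per_user, concurrent_users):
    """Amortized cost per one million generated tokens.

    Hypothetical idealization: per-user decode speed does not drop as more
    users are batched together. Real throughput degrades at large batches.
    """
    tokens_per_hour = tokens_per_sec_per_user * concurrent_users * 3600
    return node_cost_per_hour / tokens_per_hour * 1_000_000


if __name__ == "__main__":
    total, weights = memory_budget()
    print(f"Total HBM: {total} GB, implied weights: {weights[0]}-{weights[1]} GB")

    # Hypothetical inputs: $20/hour for the node, 50 tokens/s per user.
    for users in (1, 10, 100):
        cost = cost_per_million_tokens(20.0, 50, users)
        print(f"{users:>3} concurrent users: ${cost:.2f} per 1M tokens")
```

Under these made-up inputs the cost per token falls in proportion to the number of users sharing the node, which is the point kristianp makes: the hardware is only affordable when its fixed cost is spread across a large batch.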