Comment by storystarling
12 days ago
The unit economics seem pretty rough though. You're locking up 8xH100s for the compute of ~32B active parameters. I guess memory is the bottleneck but hard to see how the margins work on that.
12 days ago
The unit economics seem pretty rough though. You're locking up 8xH100s for the compute of ~32B active parameters. I guess memory is the bottleneck but hard to see how the margins work on that.
Yes, it only makes sense economically if you have batching over many users.