
Comment by fluoridation

3 days ago

It's pretty stupid because, as others in this thread have pointed out, it's already not a flat plan. Even from their side it makes zero sense to bill things this way rather than based on usage. It's not like a VPS, where your VM shares hardware that consumes electricity more or less regardless of what you use the machine for.

Those yottabytes of VRAM are also consuming electricity constantly.

  • The difference being that an LLM request is not an operating system. Since requests are compartmentalized and ephemeral, you can very easily distribute them among your available hardware and switch off machines during periods of low activity.
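
    To make that concrete, here's a rough sketch of the idea, not how any provider actually schedules things. The function names and the capacity figure are made up for illustration, and it ignores real-world details like model load time, batching, and how long it takes to power a machine back on.

```python
import math

def machines_needed(requests_per_sec: float,
                    capacity_per_machine: float,
                    min_on: int = 1) -> int:
    """How many machines must stay powered to serve the current load.

    capacity_per_machine is a hypothetical requests/sec figure per machine.
    """
    if capacity_per_machine <= 0:
        raise ValueError("capacity_per_machine must be positive")
    return max(min_on, math.ceil(requests_per_sec / capacity_per_machine))

def assign(requests: list, n_machines: int) -> list:
    """Round-robin requests over the powered machines.

    This only works because each request is independent: no machine
    holds long-lived state that ties a customer to it.
    """
    buckets = [[] for _ in range(n_machines)]
    for i, req in enumerate(requests):
        buckets[i % n_machines].append(req)
    return buckets

# Peak vs. off-peak with an assumed capacity of 100 requests/sec per machine.
peak = machines_needed(950, 100)
off_peak = machines_needed(30, 100)
```

    With those assumed numbers you'd keep 10 machines on at peak and 1 off-peak. A VPS host can't do that, since every customer's VM has to stay up whether it's busy or not.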