← Back to context

Comment by realusername

1 year ago

It's cheaper because you are unlikely to run your local AI at top capacity 24/7 so you have unused capacity which you are paying for.

They are specifically referring to usage of APIs where you just pay by the token, not by compute. In this case, you aren’t paying for capacity at all, just usage.