Comment by hagen8

10 hours ago

Well, the question is what is contributing to the usage. As the context grows, the number of input tokens sent with each call grows too. A model call with 800K tokens of input costs about 8 times as much as a call with 100K tokens of input, assuming a flat per-token input price. If we resume a conversation and the cache misses, that gets very expensive at API pricing.
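To make the arithmetic concrete, here is a rough sketch in Python. The price of 3.00 USD per million input tokens and the 10% cached-token rate are made-up placeholders, not any provider's real pricing, and `input_cost` is just an illustrative helper.

```python
def input_cost(tokens, price_per_million, cached_tokens=0, cached_rate=0.1):
    """Input cost in USD for one model call.

    cached_tokens are billed at cached_rate times the normal price;
    the remaining tokens are billed at the full price.
    """
    uncached_tokens = tokens - cached_tokens
    billable = uncached_tokens + cached_tokens * cached_rate
    return billable * price_per_million / 1_000_000


# Hypothetical price: 3.00 USD per million input tokens.
PRICE = 3.0

small_call = input_cost(100_000, PRICE)                        # 100K context, no cache
large_call = input_cost(800_000, PRICE)                        # 800K context, cache miss
large_hit = input_cost(800_000, PRICE, cached_tokens=800_000)  # 800K context, full cache hit

ratio = large_call / small_call

print(f"100K input:             ${small_call:.2f}")
print(f"800K input, cache miss: ${large_call:.2f}")
print(f"800K input, cache hit:  ${large_hit:.2f}")
print(f"miss vs. 100K call:     {ratio:.1f}x")
```

Under these assumed numbers, the resumed 800K conversation with a cache miss is the expensive case, and the same call with a full cache hit costs less than the uncached 100K call.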