Comment by rvnx

13 hours ago

The other main thing about OpenRouter is that they take on 100% of the risk of overcharges from the model providers, so you get an actual hard cap.

The downside is that context caching works only moderately well at best, which wipes out nearly all of the savings it should give you.

I haven't noticed any problems with large context requests through OR to e.g. Opus (other than the rate at which my budget gets spent!). Is this a performance thing?

Is there any risk? Don't the model providers also bill by the token?

  • The accounting could be asynchronous, so you could overshoot your budget by a few requests before you're blocked.
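
To make the overshoot concrete, here is a toy simulation. It is only a sketch under made-up assumptions: `simulate_spend`, `settle_delay`, and the amounts are hypothetical and say nothing about how any real provider's billing works. It assumes a charge becomes visible to the budget check only after `settle_delay` later requests have already been admitted.

```python
from collections import deque


def simulate_spend(budget_cents, request_cost_cents, num_requests, settle_delay):
    """Return (requests_served, total_charged_cents) for a hypothetical gateway.

    settle_delay is the number of later requests admitted before a request's
    cost shows up in the balance used for the budget check (0 = synchronous).
    """
    recorded = 0          # spend visible to the budget check
    pending = deque()     # costs incurred but not yet recorded
    served = 0
    for _ in range(num_requests):
        # Record charges that are old enough to have settled.
        while len(pending) > settle_delay:
            recorded += pending.popleft()
        # The check only sees recorded spend, not pending charges.
        if recorded >= budget_cents:
            break
        pending.append(request_cost_cents)
        served += 1
    # Pending charges still get billed eventually.
    return served, recorded + sum(pending)


sync_result = simulate_spend(1000, 100, 100, settle_delay=0)
async_result = simulate_spend(1000, 100, 100, settle_delay=3)
print(sync_result)   # synchronous accounting stops exactly at the cap
print(async_result)  # delayed accounting lets extra requests through
```

In this model the overshoot is bounded by `settle_delay` times the cost of one request, which is why it matters more for expensive large-context calls than for cheap ones.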