Comment by philipodonnell

8 months ago

Isn’t this an arbitrage opportunity? Offer to pay a fraction of the per-token cost, accept that your tokens will only be processed when a batch would otherwise go unfilled, then resell that capacity at a markup to people who need non-time-sensitive inference?

pama 8 months ago

You may have already noticed that many providers have separate, much lower prices for offline inference.