Comment by storystarling
3 hours ago
That 19 EUR figure is basically subscription arbitrage. If you ran that volume through the API with xhigh reasoning the cost would be significantly higher. It doesn't seem scalable for non-interactive agents unless you can stay on the flat-rate consumer plan.
Yeah, no way I'd do this if I paid per token. Next experiment will probably be local-only together with GPT-OSS-120b which according to my own benchmarks seems to still be the strongest local model I can run myself. It'll be even cheaper then (as long as we don't count the money it took to acquire the hardware).