Comment by KeplerBoy
7 hours ago
It's not that expensive unless you run millions of tokens through an agent. For use cases where you actually read all the input and output by yourself (i.e. an actual conversation), it is insanely cheap.
7 hours ago
Yeah, in my last job unsupervised dataset-scale transformations accounted for 97% of all spending. We were using Gemini 2.5 Flash in batch mode with context caching on Vertex, and always the latest/brightest model for ChatGPT-like conversations.
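For anyone curious, a batch job on Vertex looks roughly like the sketch below, using the vertexai Python SDK's batch prediction interface. The project, bucket, and model version strings are placeholders, and the exact request schema is worth double-checking against the current Vertex AI batch prediction docs:

    # Minimal sketch of a Vertex AI batch prediction job for Gemini.
    # Project/bucket/model strings are placeholders, not real resources.
    import time
    import vertexai
    from vertexai.batch_prediction import BatchPredictionJob

    vertexai.init(project="my-project", location="us-central1")

    # Input is a JSONL file in GCS, one request per line, e.g.:
    # {"request": {"contents": [{"role": "user", "parts": [{"text": "..."}]}]}}
    job = BatchPredictionJob.submit(
        source_model="gemini-2.5-flash",
        input_dataset="gs://my-bucket/requests.jsonl",
        output_uri_prefix="gs://my-bucket/output/",
    )

    # Batch jobs run asynchronously at a discounted rate; poll until done.
    while not job.has_ended:
        time.sleep(30)
        job.refresh()

    if job.has_succeeded:
        print("Results written to:", job.output_location)

The whole point is that you never sit and wait on a single response, so the async turnaround (minutes to hours) is a non-issue, and the per-token discount adds up fast at dataset scale.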