Comment by anupj
1 day ago
Batch Mode for the Gemini API feels like Google’s way of asking, “What if we made AI more affordable and slower, but at massive scale?” Now you can process 10,000 prompts like “Summarize each customer review in one line” for half the cost, provided you’re willing to wait until tomorrow for the results.
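For anyone curious what this looks like in practice, here's a minimal sketch of submitting an inline batch job, assuming the google-genai Python SDK's batches interface (the parameter names and request shape here may differ slightly from the current docs):

```python
# Minimal sketch: Gemini Batch Mode via the google-genai SDK (pip install google-genai).
# Assumes GEMINI_API_KEY is set in the environment.
from google import genai

client = genai.Client()

# Hypothetical reviews; in practice you'd load all 10,000 of them.
reviews = [
    "Great phone, battery easily lasts two days.",
    "Arrived with a cracked screen and support was unhelpful.",
]

# One request per review, in the inline-request shape the batch API expects.
inline_requests = [
    {"contents": [{"role": "user",
                   "parts": [{"text": f"Summarize this customer review in one line: {r}"}]}]}
    for r in reviews
]

# Submit the batch; results come back asynchronously (target turnaround ~24h).
job = client.batches.create(
    model="models/gemini-2.5-flash",
    src=inline_requests,
    config={"display_name": "review-summaries"},
)
print(job.name, job.state)  # poll later with client.batches.get(name=job.name)
```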
> Now you can process 10,000 prompts like “Summarize each customer review in one line” for half the cost, provided you’re willing to wait until tomorrow for the results.
Sounds like a great option to have available. Not every task I use LLMs for needs an immediate response, and if I weren't using local models for those things, a 50% discount in exchange for waiting a day sounds like a fine tradeoff.
This is an extremely common use case.
Reading your comment history: are you an LLM?
https://news.ycombinator.com/item?id=44531868
I don't understand the point you're making. This has been a common offering ever since cloud computing blew up.
https://aws.amazon.com/ec2/spot/
Most LLM providers offer a batch mode. Not sure why you're singling out Google.
I'll take it further: regular cloud compute has had batch workload capabilities at cheaper rates since forever, too.
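Roughly the same pattern with plain EC2: a hedged sketch using boto3's run_instances with spot market options (the AMI ID below is a placeholder):

```python
# Sketch: asking for discounted Spot capacity instead of on-demand.
import boto3

ec2 = boto3.client("ec2", region_name="us-east-1")
resp = ec2.run_instances(
    ImageId="ami-0abcdef1234567890",  # placeholder; use a real AMI for your region
    InstanceType="m5.large",
    MinCount=1,
    MaxCount=1,
    InstanceMarketOptions={
        "MarketType": "spot",
        "SpotOptions": {"SpotInstanceType": "one-time"},
    },
)
print(resp["Instances"][0]["InstanceId"])
```

Same tradeoff: a steep discount in exchange for letting the provider decide when (and whether) your work runs.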