Comment by hnfong

5 months ago

Obviously all the LLM API providers have a rate limit. Not a fan of GP's sarcastic tone, but I suppose many of us would like to know roughly what that limit would be for a small business using such APIs.

2 comments


jdietrich 5 months ago

The rate limits for Gemini 1.5 Flash are 2,000 requests per minute and 4 million tokens per minute. Higher limits are available on request.

https://ai.google.dev/pricing#1_5flash

4o-mini's rate limits scale based on your account history, from 500 RPM / 200,000 TPM to 30,000 RPM / 150,000,000 TPM.

https://platform.openai.com/docs/guides/rate-limits
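
For anyone sizing this up for a small business, here is a rough client-side sketch of staying under a requests-per-minute and tokens-per-minute pair like the ones quoted above. It is not any provider's SDK: `MinuteBudget` and `try_acquire` are names made up for illustration, and the 60-second sliding window is an assumption, since providers may count and reset differently.

```python
import time
from collections import deque


class MinuteBudget:
    """Client-side sliding-window check for requests and tokens per minute.

    Hypothetical helper, not part of any provider's API.
    """

    def __init__(self, rpm, tpm, clock=time.monotonic):
        self.rpm = rpm
        self.tpm = tpm
        self.clock = clock
        self.events = deque()  # (timestamp, tokens) for calls in the last 60 s
        self.tokens_in_window = 0

    def _expire(self, now):
        # Drop calls that have aged out of the 60-second window.
        while self.events and now - self.events[0][0] >= 60.0:
            _, tokens = self.events.popleft()
            self.tokens_in_window -= tokens

    def try_acquire(self, tokens):
        """Record the call and return True if it fits both budgets, else False."""
        now = self.clock()
        self._expire(now)
        if len(self.events) + 1 > self.rpm:
            return False
        if self.tokens_in_window + tokens > self.tpm:
            return False
        self.events.append((now, tokens))
        self.tokens_in_window += tokens
        return True


# Example using the Gemini 1.5 Flash figures quoted above.
budget = MinuteBudget(rpm=2000, tpm=4_000_000)
if budget.try_acquire(tokens=1500):
    pass  # safe to send the request; otherwise wait and retry
```

Back-of-envelope: at 2,000 RPM and 4 million TPM, the token limit only starts to bind once requests average more than 2,000 tokens each (4,000,000 / 2,000). For the 4o-mini figures, the same division gives 400 tokens per request at the lowest tier and 5,000 at the highest.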

simonw 5 months ago

Surprisingly, DeepSeek doesn't have a rate limit: https://api-docs.deepseek.com/quick_start/rate_limit

I've heard from people running 100+ prompts in parallel against it.
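
A minimal sketch of what "100+ prompts in parallel" could look like on the client side, assuming an async client. `run_bounded` and `fake_call` are hypothetical names; `fake_call` is only a stand-in for a real API request, and nothing here is DeepSeek's actual SDK.

```python
import asyncio


async def run_bounded(prompts, call, max_parallel=100):
    """Run call(prompt) for every prompt with at most max_parallel in flight."""
    gate = asyncio.Semaphore(max_parallel)

    async def one(prompt):
        async with gate:
            return await call(prompt)

    # gather returns results in the same order as the input prompts.
    return await asyncio.gather(*(one(p) for p in prompts))


async def fake_call(prompt):
    # Hypothetical stand-in for a real API request; swap in your client call.
    await asyncio.sleep(0.01)
    return prompt.upper()


if __name__ == "__main__":
    prompts = [f"prompt {i}" for i in range(250)]
    results = asyncio.run(run_bounded(prompts, fake_call, max_parallel=100))
    print(len(results))
```

Even without a documented limit, capping concurrency with a semaphore keeps a batch job from opening an unbounded number of connections at once.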