Comment by _micah_h
2 years ago
Check out the graphs over time on the model pages - https://artificialanalysis.ai/models/gpt-4-turbo-1106-previe....
OpenAI are doing a ton of load balancing, presumably constantly tweaking batch sizes to try to optimize across all their workloads.
You can test GPT-4 against GPT-4 Turbo in the Playground to intuitively confirm that the speeds are similar.
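For a more quantitative check than eyeballing the Playground, you can time streamed completions and compare output throughput. A minimal sketch, assuming the OpenAI v1 Python SDK (`client.chat.completions.create(stream=True)`), an `OPENAI_API_KEY` in the environment, and the model names shown (both are assumptions, not from the comment above):

```python
import time

def tokens_per_second(n_tokens: int, elapsed_s: float) -> float:
    """Throughput in tokens/sec; guards against a zero elapsed time."""
    return n_tokens / elapsed_s if elapsed_s > 0 else 0.0

def time_model(client, model: str, prompt: str) -> float:
    """Stream a completion and measure rough output throughput.

    Counts streamed content chunks as a proxy for tokens, since the
    API typically emits roughly one token per chunk when streaming.
    """
    start = time.monotonic()
    n_chunks = 0
    stream = client.chat.completions.create(
        model=model,
        messages=[{"role": "user", "content": prompt}],
        stream=True,
    )
    for chunk in stream:
        if chunk.choices and chunk.choices[0].delta.content:
            n_chunks += 1
    return tokens_per_second(n_chunks, time.monotonic() - start)

if __name__ == "__main__":
    from openai import OpenAI  # pip install openai
    client = OpenAI()  # reads OPENAI_API_KEY from the environment
    for model in ("gpt-4", "gpt-4-turbo"):  # model names are assumptions
        rate = time_model(client, model, "Count from 1 to 50.")
        print(f"{model}: {rate:.1f} tok/s")
```

Chunk counting is only an approximation of token throughput, but it is consistent across models, which is all a relative speed comparison needs.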