Comment by narrator, 2 days ago
I agree. I have found GPT-5 noticeably worse on medical queries: it feels like it skips important details and is a clear step down from o3, IMHO. I have heard good things about GPT-5 Pro, but that's not cheap.
I wonder if part of the degraded performance comes from cases where the model thinks you're heading into a dangerous area and gets more and more vague, like they demoed on launch day with the fireworks example. It gets very vague when talking about non-abusable prescription drugs, for example. I wonder if that sort of nerfing gradient is affecting medical queries.
After seeing some painfully bad results, I'm currently using Grok4 for medical queries with a lot of success.
Interesting; the anecdotal experience seems to agree with the benchmark results.
Afaik, there is currently no "GPT-5 Pro". Did you mean o3-pro or o1-pro (via API)?
Currently, GPT-5 sits at $10/1M output tokens, o3-pro at $80, and o1-pro at a whopping $600: https://platform.openai.com/docs/pricing
Of course, price alone says nothing about actual performance or quality per $ spent, but in my own testing their performance does seem to scale roughly in line with their cost.
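For a rough sense of scale, here's a quick back-of-the-envelope calculation at those listed output-token prices (the 2,000-token response length is just an illustrative assumption):

    # Approximate cost per response at the listed output-token prices.
    # The 2,000-token response length is an assumption for illustration, not a measurement.
    PRICE_PER_1M_OUTPUT = {"gpt-5": 10.0, "o3-pro": 80.0, "o1-pro": 600.0}
    output_tokens = 2_000
    for model, price in PRICE_PER_1M_OUTPUT.items():
        cost = price * output_tokens / 1_000_000
        print(f"{model}: ~${cost:.2f} per {output_tokens}-token response")
    # -> gpt-5: ~$0.02, o3-pro: ~$0.16, o1-pro: ~$1.20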
GPT-5 Pro is only available on ChatGPT with a ChatGPT Pro subscription.
Supposedly it fires off multiple parallel thinking chains and then essentially debates with itself to net a final answer.
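Nobody outside OpenAI has confirmed the mechanism, but the general idea (sample several chains independently, then have a judging pass critique them and synthesize one answer) is easy to sketch against the regular API. The model name, sample count, and prompts below are placeholders, not anything OpenAI has documented:

    # Sketch of the "parallel chains + self-judging" idea; NOT how GPT-5 Pro
    # actually works internally. Uses the standard openai Python client; the
    # "gpt-5" model name, 4 samples, and prompts are illustrative assumptions.
    from openai import OpenAI

    client = OpenAI()
    question = "your medical (or other) query here"

    # 1) Sample several independent answers to the same prompt.
    candidates = []
    for _ in range(4):
        resp = client.chat.completions.create(
            model="gpt-5",
            messages=[{"role": "user", "content": question}],
        )
        candidates.append(resp.choices[0].message.content)

    # 2) Have the model critique the candidates against each other and
    #    synthesize a single final answer.
    judge_prompt = (
        "Here are several candidate answers to the same question:\n\n"
        + "\n\n---\n\n".join(candidates)
        + "\n\nPoint out where they disagree and give one final, corrected answer."
    )
    final = client.chat.completions.create(
        model="gpt-5",
        messages=[{"role": "user", "content": judge_prompt}],
    )
    print(final.choices[0].message.content)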
GPT-5 Pro is available through the ChatGPT UI with a “Pro” plan. I understand that, like o3-pro, it is a high-compute, large-context invocation of the underlying models.
Thanks, I was not aware! I thought they offered all their models via their API.