Comment by AaronNewcomer
1 day ago
The thinking models (especially OpenAI's o3) still seem to do by far the best at this task: when they run into a confusing word, they look across the document to see how the writer formed certain letters in words that are clearer.
I built a whole product around this: https://DocumentTranscribe.com
But I imagine this will keep getting better and that excites me since this was largely built for my own research!
Your demo is very well done, love it!
I find Gemini 2.5 Pro (not Flash) way better than the ChatGPT models. I don't remember testing o3, though. Maybe it's o3 pro, one of the older, costly thinking models?