Comment by AaronNewcomer

1 day ago

The thinking models (especially OpenAI's o3) still seem to do by far the best at this task: when they run into a confusing word, they look across the document to see how the writer formed the same letters in words that are clearer.

I built a whole product around this: https://DocumentTranscribe.com

But I imagine this will keep getting better and that excites me since this was largely built for my own research!

I find Gemini 2.5 Pro (not Flash) way better than the ChatGPT models. I don't remember testing o3, though. Maybe it's o3-pro, one of the older, costly thinking models?