Comment by pants2

6 days ago

That leaderboard omits the current SOTA which is GPT-4o-transcribe (an LLM)

Do you have any comparisons in terms of WER? I doubt that GPT-4o-transcribe is better than the best models from that leaderboard (https://huggingface.co/spaces/hf-audio/open_asr_leaderboard). A quick search on this got me here: https://www.reddit.com/r/OpenAI/comments/1jvdqty/gpt4otransc... https://scribewave.com/blog/openai-launches-gpt-4o-transcrib...
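For anyone comparing numbers across those sources: WER is the word-level edit distance (substitutions + deletions + insertions) divided by the number of words in the reference. Below is a minimal sketch, assuming plain whitespace tokenization and no text normalization. Real evaluations usually normalize casing and punctuation first, so this will not reproduce leaderboard figures exactly. The function name `wer` is just illustrative.

```python
def wer(reference: str, hypothesis: str) -> float:
    """Word error rate: word-level Levenshtein distance / reference length."""
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] holds the edit distance between the reference words seen so far
    # and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        curr = [i]  # matching i reference words against an empty hypothesis
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            curr.append(min(
                prev[j] + 1,         # deletion (reference word missing)
                curr[j - 1] + 1,     # insertion (extra hypothesis word)
                prev[j - 1] + cost,  # match or substitution
            ))
        prev = curr
    return prev[-1] / len(ref)
```

Note that WER can exceed 1.0 when the hypothesis contains many extra words, and that two systems are only comparable if they were scored on the same test sets with the same normalization.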

It is stated that GPT-4o-transcribe is better than Whisper-large. That might be true, but which version of Whisper-large exactly? The leaderboard lists many Whisper variants. In any case, the best Whisper variant, CrisperWhisper, is currently only at rank 5. (I assume GPT-4o-transcribe was compared not to that but to some other Whisper model.)

It is also stated that Scribe v1 from ElevenLabs is better than GPT-4o-transcribe. On the leaderboard, Scribe v1 is likewise only at rank 6.