Comment by coder543

3 months ago

Terrible relative to everything else that exists today. I have a neutral American accent.

Maybe you just don’t know what you’re missing? Google’s default speech to text is still bad compared to Whisper and Parakeet, but even Google’s is markedly better than Apple’s.

I cannot think of a single speech to text system that I’ve run into in the past 5 years that is less accurate than the one Apple ships.

Sure, Apple’s speech to text is incredible compared to what was on the flip phone I had 20 years ago. Terrible is relative. Much better options exist today, and they’re under very permissive licenses. Apple’s refusal to offer a better, more accessible experience to their users is frustrating when they wouldn’t even have to pay a licensing fee to ship something better. Whisper was released under a permissive license nearly 4 years ago.

Apple also restricts third party keyboards to an absurdly tiny amount of memory, so it isn’t even possible to ship a third party keyboard that provides more accurate on-device speech to text without janky workarounds (requiring the user to open the keyboard’s own app first each time).

As someone who tried every TTS in existence a few years ago for some product work, Apple’s is so consistently better that we wound up getting a bunch of Apple stuff just for the TTS.

  • “A few years ago” sounds like it could be before the modern era of STT, as defined by when Whisper was released.

    Your comment says TTS, which is different from what I’m discussing, though, so there might be some confusion.

> I have a neutral American accent

This is tangential but is _any_ accent objectively neutral?

  • Neutral here means not strongly identifiable as any particular regional American accent. Some people have very strong regional accents, some don’t. It is still clearly an American accent, not British or anything else.

    • TIL (assuming you mean American as in the United States of America) that this is known as a General American accent.