Comment by yvdriess
10 days ago
A good opportunity to point people to the paper with my favorite title of all time:
"How to wreck a nice beach you sing calm incense"
For folks like me puzzling over what the correct transcription of the title should be, I think it's "How to recognize speech using common sense"
Thank you! "Calm incense" makes very little sense when said in an accent where calm isn't pronounced like com.
How is calm pronounced in those accents?
This is the correct parsing of it. (I can't take credit for coming up with the title, but I worked on the project.)
I only got the "How to recognize" part. Also I think "using" should sound more like "you zinc" than "you sing".
Thanks. Now I know that I'm not that stupid and this actually makes no sense.
It actually does make sense. Not saying you're stupid, but in standard English, if you say it quickly, the two sentences are nearly identical.
Thank you very much!
The paper: https://sci-hub.st/https://dl.acm.org/doi/10.1145/1040830.10...
(Agree that the title is awesome, by the way!)
Direct PDF download link:
https://web.media.mit.edu/~lieber/Publications/Wreck-a-Nice-...
Fun fact, I just could not work out what this was supposed to be, so I just used Whisper (indirectly, via the FUTO Voice Input app on my phone) and repeated the sentence into it, and it came out with the 'correct' transcription of "How to recognize speech using common sense." first time.
Of course, this is nothing like what I actually said, so... make your own mind up whether that is actually a correct transcription or not!
I have a British accent, for the record.
My favorite is:
"Threesomes, with and without blame"
https://dl.acm.org/doi/10.1145/1570506.1570511
(From a professor I worked with a bit in grad school)
Also relevant: The Two Ronnies - "Four Candles"
https://www.youtube.com/watch?v=gi_6SaqVQSw
Does AI voice recognition still use Markov models for this?
Whisper uses an encoder-decoder transformer.