Comment by barrell

4 hours ago

I’m building a language learning app [https://phrasing.app] and this is really good advice. I’ve not had any interest in STT for the application, and have no plans to integrate it. In my experience, I’ve never seen it be truly beneficial in the language learning process.

What has been extremely beneficial is having the text and audio force-aligned and highlighted, karaoke-style, every time I hear the audio. It has improved my phoneme recognition remarkably well with remarkably little content. Several users report the same thing: even native speech feels a lot more like separate words than a slew of sounds. I attribute this in large part to the karaoke-style highlighting. It works better for phonetic scripts, so I would recommend using it with pinyin/jyutping/furigana for character-based languages.
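
To make the karaoke-style highlighting concrete, here is a minimal sketch in Python. It assumes a forced aligner has already produced per-word start and end times; the `ALIGNMENT` data, the timings, and the function names are all made up for illustration and are not how Phrasing actually does it. Given the current playback time, it looks up which word should be highlighted.

```python
from bisect import bisect_right

# Hypothetical forced-aligner output: (start_seconds, end_seconds, word).
# These timings are invented for illustration only.
ALIGNMENT = [
    (0.00, 0.35, "ik"),
    (0.35, 0.80, "spreek"),
    (0.95, 1.20, "een"),
    (1.20, 1.75, "beetje"),
    (1.75, 2.40, "Nederlands"),
]

# Word start times, sorted, so the active word can be found by binary search.
STARTS = [start for start, _end, _word in ALIGNMENT]


def active_word_index(t):
    """Return the index of the word spoken at time t, or None during a gap."""
    i = bisect_right(STARTS, t) - 1  # last word that started at or before t
    if i < 0:
        return None  # playback has not reached the first word yet
    _start, end, _word = ALIGNMENT[i]
    return i if t < end else None  # None if t falls in the silence after word i


def render(t):
    """Render the sentence with the active word wrapped in brackets."""
    active = active_word_index(t)
    return " ".join(
        f"[{word}]" if i == active else word
        for i, (_s, _e, word) in enumerate(ALIGNMENT)
    )


print(render(0.5))
```

In a real player you would call something like `render` on every playback tick and apply a highlight style instead of brackets.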

For production, when I was at Regina Coeli (a world-class language institute), their main exercise was simply: 1. you hear a short passage in Dutch, 10-40 words; 2. you record yourself reading the same passage; and 3. you play back the two audio tracks on top of one another and listen for the difference. Optional step 4: re-record and replay until it’s close enough.

There was no grading, no teacher checking recordings, no right or wrong; just hundreds of random sentences and a simple app to layer them. You needed to learn to hear the differences yourself and experiment until you no longer could. (fwiw this is not present in phrasing, I just found it relevant. One day soon I hope to add it!)
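
The layering step itself is simple. Here is a rough sketch in Python, assuming both recordings are already decoded to mono float samples at the same sample rate; the `overlay` function is hypothetical and is not the institute's actual app. It mixes the two tracks sample by sample so they play on top of one another.

```python
def overlay(reference, attempt):
    """Mix two mono tracks (lists of floats in [-1.0, 1.0]) sample by sample.

    Assumes both tracks share the same sample rate. The shorter track is
    padded with silence so nothing is cut off.
    """
    length = max(len(reference), len(attempt))
    mixed = []
    for i in range(length):
        a = reference[i] if i < len(reference) else 0.0
        b = attempt[i] if i < len(attempt) else 0.0
        mixed.append((a + b) / 2)  # averaging keeps the mix within [-1.0, 1.0]
    return mixed


# Tiny made-up example: a 3-sample reference and a 2-sample attempt.
print(overlay([0.5, -0.5, 1.0], [0.5, 0.5]))
```

Averaging rather than summing avoids clipping, at the cost of making each track a bit quieter in the mix.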