Comment by hiAndrewQuinn, 6 days ago

Definitely not. I took this same basic idea of feeding videos into Whisper to get SRT subtitles a step further and built automatic Anki flashcards for listening practice in foreign languages [1]. I literally feel like I'm living in the future every time one of those cards from whatever silly Finnish video I found on YouTube pops up in my queue.
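
A bare-bones sketch of that pipeline (not the actual audio2anki code; the file names and card layout here are just placeholders, and it assumes the openai-whisper package plus ffmpeg are installed) looks something like this:

    # Rough sketch: audio -> Whisper segments -> Anki-importable TSV with audio snippets.
    # "clip.mp3" and "cards.tsv" are placeholder names, not audio2anki's real layout.
    # (Whisper's CLI can also write SRT directly with --output_format srt.)
    import csv
    import subprocess
    import whisper

    model = whisper.load_model("small")                   # pick a size that fits your hardware
    result = model.transcribe("clip.mp3", language="fi")

    with open("cards.tsv", "w", newline="", encoding="utf-8") as f:
        writer = csv.writer(f, delimiter="\t")
        for seg in result["segments"]:
            start, end, text = seg["start"], seg["end"], seg["text"].strip()
            snippet = f"snippet_{seg['id']:04d}.mp3"
            # Cut the matching audio so the card front can be audio-only;
            # the snippet files then need to go into Anki's collection.media folder.
            subprocess.run(
                ["ffmpeg", "-y", "-i", "clip.mp3", "-ss", str(start), "-to", str(end), snippet],
                check=True,
            )
            writer.writerow([f"[sound:{snippet}]", text])  # front: audio, back: transcript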

These models have made it possible to robustly practice all four quadrants of language learning (reading, writing, listening, speaking) for most common languages using nothing but a computer, not just passive reading. Whisper is directly responsible for two of those quadrants, listening and speaking. LLMs are responsible for writing [2]. We absolutely live in the future.

[1]: https://github.com/hiandrewquinn/audio2anki

[2]: https://hiandrewquinn.github.io/til-site/posts/llm-tutored-w...

Hi Andrew, I've been trying to hack together a similar audio language-support app in a podcast player format (I started with Anytime Player), using some of the same principles as your project: transcript generation, chunking, and level- and obscurity-aware timestamped hints and translations.
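
Roughly, the chunking and obscurity-hint idea I mean is something like this (a toy sketch under my own assumptions: fixed 30-second chunks and a placeholder common-word list standing in for real level/frequency data, not the app's actual logic):

    # Toy sketch of "chunking + obscurity-aware hints". Segments are (start, end, text) tuples.
    CHUNK_SECONDS = 30
    COMMON_WORDS: set[str] = set()   # e.g. load the top few thousand lemmas for the target language

    def chunk_segments(segments, chunk_seconds=CHUNK_SECONDS):
        """Group timestamped segments into chunks no longer than chunk_seconds."""
        chunks, current, chunk_start = [], [], None
        for start, end, text in segments:
            if chunk_start is None:
                chunk_start = start
            if current and end - chunk_start > chunk_seconds:
                chunks.append(current)
                current, chunk_start = [], start
            current.append((start, end, text))
        if current:
            chunks.append(current)
        return chunks

    def hint_candidates(text, common_words=COMMON_WORDS):
        """Words not on the common list are candidates for a timestamped hint or translation."""
        words = (w.strip(".,!?\"'").lower() for w in text.split())
        return [w for w in words if w and w not in common_words]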

I really think support for native content is the ideal way for someone like me to learn, especially for listening.

Thanks for posting and good luck.