Comment by hiAndrewQuinn
6 days ago
Definitely not. I took this same basic idea of feeding videos into Whisper to get SRT subtitles a step further and used it to make automatic Anki flashcards for listening practice in foreign languages [1]. I literally feel like I'm living in the future every time one of those cards, from whatever silly Finnish video I found on YouTube, pops up in my queue.
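For anyone curious what the first step looks like, here is a minimal sketch of the video-to-SRT part using the openai-whisper package (this is not audio2anki itself; the model size, language code, and file names are placeholders):

    # pip install openai-whisper  (ffmpeg must be on PATH)
    import whisper

    def to_srt_timestamp(seconds: float) -> str:
        """Format seconds as an SRT timestamp, e.g. 00:01:23,456."""
        ms = int(round(seconds * 1000))
        h, ms = divmod(ms, 3_600_000)
        m, ms = divmod(ms, 60_000)
        s, ms = divmod(ms, 1_000)
        return f"{h:02d}:{m:02d}:{s:02d},{ms:03d}"

    model = whisper.load_model("small")  # bigger models are more accurate but slower
    result = model.transcribe("silly_finnish_video.mp4", language="fi")

    # Write one numbered SRT cue per Whisper segment.
    with open("silly_finnish_video.srt", "w", encoding="utf-8") as f:
        for i, seg in enumerate(result["segments"], start=1):
            f.write(f"{i}\n")
            f.write(f"{to_srt_timestamp(seg['start'])} --> {to_srt_timestamp(seg['end'])}\n")
            f.write(seg["text"].strip() + "\n\n")

From there, each timestamped segment can be turned into a flashcard with the audio slice on the front and the transcript on the back.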
These models have made it possible to robustly practice all 4 quadrants of language learning (reading, writing, listening, speaking) for most common languages using nothing but a computer, not just passive reading. Whisper is directly responsible for 2 of those quadrants, listening and speaking. LLMs are responsible for writing [2]. We absolutely live in the future.
[1]: https://github.com/hiandrewquinn/audio2anki
[2]: https://hiandrewquinn.github.io/til-site/posts/llm-tutored-w...
Hi Andrew, I've been trying to hack together a similar audio language-support app in a podcast player format (I started with Anytime Player), using some of the same principles as your project: transcript generation, chunking, and level- and obscurity-aware timestamped hints and translations.
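The chunking step, for instance, is roughly this (a sketch only, assuming an existing SRT file and audio file with placeholder names; uses the srt and pydub packages, and pydub needs ffmpeg installed):

    # pip install srt pydub
    import srt
    from pydub import AudioSegment

    with open("episode.srt", encoding="utf-8") as f:
        subs = list(srt.parse(f.read()))

    audio = AudioSegment.from_file("episode.mp3")

    # Cut one clip per subtitle cue so each clip can back a hint/translation.
    for i, sub in enumerate(subs):
        start_ms = int(sub.start.total_seconds() * 1000)
        end_ms = int(sub.end.total_seconds() * 1000)
        audio[start_ms:end_ms].export(f"clip_{i:04d}.mp3", format="mp3")
        print(f"clip_{i:04d}.mp3  {sub.content.strip()}")  # timestamped text paired with the clip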
I really think support for native content is the ideal way for someone like me to learn, especially for listening.
Thanks for posting and good luck.