If set, the transcription output will be sent to the specified file or URL
(use one of the FFmpeg AVIO protocols); otherwise, the output will be logged as info messages.
The output will also be set in the "lavfi.whisper.text" frame metadata.
If the destination is a file and it already exists, it will be overwritten.
@item format
The destination format string; it could be "text" (only the transcribed text will be sent to the destination), "srt" (subtitle format) or "json".
Default value: @code{"text"}
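A minimal sketch of how this might be used from the command line. The @code{destination} and @code{format} options are the ones described above; the filter name @code{whisper}, the @code{model} option and the @code{ggml-base.en.bin} model path are assumptions to be checked against your build's filter documentation.

@example
# Transcribe the audio track and write an SRT file (hypothetical model path)
ffmpeg -i input.mp4 -vn -af "whisper=model=ggml-base.en.bin:destination=out.srt:format=srt" -f null -

# Leave destination unset and dump the "lavfi.whisper.text" frame metadata
# to the log with the ametadata filter instead
ffmpeg -i input.mp4 -vn -af "whisper=model=ggml-base.en.bin,ametadata=mode=print:key=lavfi.whisper.text" -f null -
@end example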
I don't know if this can embed the subtitles, but it does support generating accompanying srt files.
Of course, you could already do that by just manually calling whisper on files, but now you don't need to export parts or transformed media files to feed into whisper.
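For example (a sketch, reusing the option names from the docs quoted above; ggml-base.en.bin is a placeholder model path), transcribing just a two-minute slice of a longer recording in one step might look like:

ffmpeg -ss 00:10:00 -t 120 -i input.mkv -vn -af "whisper=model=ggml-base.en.bin:destination=clip.srt:format=srt" -f null -

No intermediate audio export needed; -ss and -t just select the part of the input that gets run through the filter.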
In my experience, a small/tiny Whisper model has pretty okay English decoding speed on relatively modern hardware even without GPU support. There's a bunch of inherent latency in the process (Whisper transcribes audio in chunks rather than word by word, so live output always trails playback a bit), but the optimised C++ version (whisper.cpp) shouldn't pose too much of a problem unless you're running in power-saving mode. Battery life may be a problem on older laptops, though.
I've been waiting a while now for automatic translated subtitles in VLC. I thought it would be here by now. I'm probably underestimating the difficulty, but I'm surprised no video player has done it yet (as far as I know).
A lot of subtitles on commercial media use formats that are essentially bitmaps (DVD VobSub, Blu-ray PGS) which the video player just overlays on top of the video. There are tools to decode these using OCR, but it's not something I'd enable by default.
For text/srt subtitles, translation would probably be easier. There's a plugin for that already if you're okay with online translation services: https://github.com/nopium/vlc-trans-lua
If you have enough processing power. Without a GPU it's going to lag.
Whisper is pretty fast.
Finally? I think VLC demo'd this a while ago at some conference where they had a table, if I remember correctly.
VLC and ffmpeg are unrelated projects
I'm not very familiar with them, but I always assumed that there is a lot of overlap between the maintainers of both projects.