Comment by loveparade
17 hours ago
For the long videos I just relied on ffmpeg to remove silence. It has lots of options for it, but you may need to fiddle with the parameters to make it work. I ended up with something like:
```
stream = ffmpeg.filter(
    stream,
    'silenceremove',
    detection='rms',
    start_periods=1,
    start_duration=0,
    start_threshold='-40dB',
    stop_periods=-1,
    stop_duration=0.15,
    stop_threshold='-35dB',
    stop_silence=0.15
)
```
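For anyone not using the Python bindings, here is a rough sketch of what I believe is the equivalent plain ffmpeg command, built with only the standard library. The option values are the ones from the snippet above; the function names and file names are made up for illustration, and I'm assuming an `ffmpeg` binary is on your PATH if you actually run it.

```python
import subprocess

# Option values copied from the snippet above; the rest of this file is illustrative.
SILENCEREMOVE_OPTS = {
    "detection": "rms",
    "start_periods": 1,
    "start_duration": 0,
    "start_threshold": "-40dB",
    "stop_periods": -1,
    "stop_duration": 0.15,
    "stop_threshold": "-35dB",
    "stop_silence": 0.15,
}


def build_silenceremove_cmd(src, dst, opts=SILENCEREMOVE_OPTS):
    """Build (but do not run) an ffmpeg command that applies silenceremove."""
    # ffmpeg filter syntax: name=key1=value1:key2=value2:...
    filter_arg = "silenceremove=" + ":".join(f"{k}={v}" for k, v in opts.items())
    return ["ffmpeg", "-i", src, "-af", filter_arg, dst]


def remove_silence(src, dst):
    """Run the command; needs an ffmpeg binary on PATH (assumption)."""
    subprocess.run(build_silenceremove_cmd(src, dst), check=True)
```

Keeping the command builder separate from the runner makes it easy to print the command and tweak the thresholds before committing to a long batch.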
This is absolutely gold, thank you for sharing the exact script!
That ffmpeg silenceremove filter is exactly the kind of pre-processing step we were debating for handling those very long live stream files before they reach the LLM. Cutting the silence first should remove a major performance bottleneck.
We figured ffmpeg would be the way to go, but having your tested parameters (especially the start/stop thresholds) for effective silence removal saves us a lot of internal testing time. That's true open-source community value right there.
This confirms that our batch pipeline needs three distinct automated steps:
1. URL/ID Harvesting (as discussed)
2. Audio Pre-Processing (using solutions like your ffmpeg setup)
3. LLM Transcription (for Pro users)
We will aim to abstract and automate that audio cleaning step for our users. They won't have to fiddle with parameters; they'll just get a cleaned transcript ready for analysis.
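Roughly, the three steps could be chained like the sketch below. This is only an illustration of the shape, not our actual implementation: `run_batch`, `Transcript`, and the three stage callables are hypothetical names, and the demo stages are stand-ins that neither touch ffmpeg nor call an LLM.

```python
from dataclasses import dataclass
from typing import Callable, Iterable, List


@dataclass
class Transcript:
    source_id: str
    text: str


def run_batch(
    urls: Iterable[str],
    harvest_id: Callable[[str], str],
    clean_audio: Callable[[str], str],
    transcribe: Callable[[str], str],
) -> List[Transcript]:
    """Run the three steps in order for each URL."""
    results = []
    for url in urls:
        source_id = harvest_id(url)             # step 1: URL/ID harvesting
        cleaned_path = clean_audio(source_id)   # step 2: audio pre-processing
        text = transcribe(cleaned_path)         # step 3: LLM transcription
        results.append(Transcript(source_id, text))
    return results


# Demo with placeholder stages (all hypothetical).
demo = run_batch(
    ["https://example.com/watch/abc123"],
    harvest_id=lambda url: url.rsplit("/", 1)[-1],
    clean_audio=lambda sid: f"{sid}.cleaned.wav",
    transcribe=lambda path: f"(transcript of {path})",
)
```

Passing the stages in as callables keeps the silence-removal parameters hidden inside `clean_audio`, which is what lets users skip the tuning entirely.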
Thanks again for the technical deep dive! This is incredibly helpful for solidifying our architecture.