Comment by websiteapi
1 month ago
I wonder if it works for speaker diarization out of the box. I've found that open source speaker diarization that doesn't require a lot of tweaking is basically non-existent.
Yeah, I was frustrated by slow and hard-to-use OSS diarization too. I recently released a library to address that, check it out: https://github.com/narcotic-sh/senko
Also https://zanshin.sh, if you'd like speaker diarization when watching YouTube videos
Hey, thanks for this. Been trying it out and it's very fast but seems to hear more speakers than are in the audio. I didn't see a way to tweak speaker similarity settings or merge speakers in some way. Any advice?
Thanks for checking it out!
Yeah, unfortunately, since the diarization is based on acoustic features, it really does need high-fidelity voice recordings to get the best results. However, I just added another knob to the Diarizer class called mer_cos, which controls the speaker merging threshold. The default is 0.875, so perhaps try lowering it to 0.8. That should help.
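For intuition on why lowering a cosine merging threshold reduces the speaker count, here's a rough sketch. This is a generic illustration and not senko's actual code: `merge_speakers`, the greedy strategy, and the toy 2-D "embeddings" are all made up for the example. Only the 0.875 and 0.8 values come from the comment above.

```python
import math

def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

def merge_speakers(centroids, threshold):
    """Hypothetical greedy merge: keep merging the most similar pair of
    speaker centroids while their cosine similarity is >= threshold."""
    clusters = [(list(c), 1) for c in centroids]  # (centroid, weight)
    while len(clusters) > 1:
        # Find the most similar pair of current clusters.
        best = None
        for i in range(len(clusters)):
            for j in range(i + 1, len(clusters)):
                sim = cosine(clusters[i][0], clusters[j][0])
                if best is None or sim > best[0]:
                    best = (sim, i, j)
        sim, i, j = best
        if sim < threshold:
            break  # nothing left that is similar enough to merge
        (ci, wi), (cj, wj) = clusters[i], clusters[j]
        # Weighted average so bigger clusters pull the centroid more.
        merged = [(x * wi + y * wj) / (wi + wj) for x, y in zip(ci, cj)]
        clusters[i] = (merged, wi + wj)
        del clusters[j]
    return [c for c, _ in clusters]

# Toy centroids: the first two are "the same voice" with similarity ~0.85.
centroids = [[1.0, 0.0], [0.85, 0.53], [0.0, 1.0]]
strict = merge_speakers(centroids, 0.875)  # ~0.85 is below the threshold: no merge
loose = merge_speakers(centroids, 0.8)     # ~0.85 clears the threshold: pair merges
print(len(strict), len(loose))
```

With the stricter threshold the two near-duplicate centroids stay separate (one extra "speaker"), while the lower threshold folds them together.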
I'll also get around to adding an oracle/min/max speakers feature at some point, for cases where you know the exact number of speakers ahead of time, or wanna set upper/lower bounds. I've gotten busy with another project, so I haven't done it yet. PRs welcome though! haha
looks interesting. will check it out.