Comment by geofffox

1 year ago

I use Firefox... still.

Hi, I built the client UI for this and... yea, I really wanted to get Firefox working :(

We needed a way to measure voice-to-voice latency from the end user's perspective, and found Silero voice activity detection (https://github.com/snakers4/silero-vad) to be the most reliable at detecting when the user has stopped speaking, so we can start the timer (and stop it again when audio is received from the bot).

Silero runs via ONNX Runtime (with WebAssembly). Whilst it sort-of-kinda works in Firefox, the VAD seems to misfire more than it should, causing the latency numbers to be somewhat absurd. I really want to get it working though! I'm still trying.

The code for the UI VAD is here: https://github.com/pipecat-ai/web-client-ui/tree/main/src/va...
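
For anyone curious what the timing described above looks like in practice, here is a minimal sketch. It is not the code from the linked repository: the class name `VoiceToVoiceTimer` and its method names are hypothetical, and it assumes the VAD and the audio pipeline each expose some callback you can hook into. The timer starts when the VAD reports end of speech and stops when the first bot audio arrives.

```typescript
// Hypothetical helper for illustration only; not taken from web-client-ui.
class VoiceToVoiceTimer {
  private now: () => number;
  private speechEndedAt: number | null = null;
  // Completed latency measurements, in milliseconds.
  samples: number[] = [];

  // The clock is injectable so the timer can be tested without real audio.
  constructor(now: () => number = () => performance.now()) {
    this.now = now;
  }

  // Call when the VAD reports that the user has stopped speaking.
  onUserSpeechEnd(): void {
    this.speechEndedAt = this.now();
  }

  // Call when the user starts speaking again before the bot has replied,
  // so a stale start time is not carried into the next measurement.
  onUserSpeechStart(): void {
    this.speechEndedAt = null;
  }

  // Call when the first bot audio arrives. Returns the latency in ms,
  // or null if no end-of-speech event is pending.
  onBotAudio(): number | null {
    if (this.speechEndedAt === null) {
      return null;
    }
    const latency = this.now() - this.speechEndedAt;
    this.speechEndedAt = null;
    this.samples.push(latency);
    return latency;
  }
}
```

With this shape, a VAD misfire shows up directly in the numbers: a false end-of-speech event starts the clock at the wrong moment, so the reported latency is measured from the wrong point.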

  • Do you know why there's a difference in the performance of the algorithm in another browser? I would expect that all browsers run the code exactly the same way.

Do not go by the warning message. It does work just fine on the latest Firefox. Cool demo, btw!

I hate that everyone just develops for Chromium only.

It is working perfectly for me on Firefox (version 127).

  • Thanks for sharing. I did make some changes that seem to have improved things, although I do still see the occasional misfire. Perhaps good enough to remove that ugly red banner though!