Comment by yujonglee

6 months ago

If your use-case is meetings, https://github.com/fastrepl/hyprnote is for you. OWhisper is more like a headless version of it.

Can you describe how it picks out different voices? Does it need separate audio channels, or does it recognize different voices on the same audio input?

  • It separates mic and speaker audio into 2 channels, so you can reliably get "what you said" vs "what you heard" (see the sketch after this list).

    For splitting speakers within a channel, we need an AI model (speaker diarization). That is not implemented yet, but I think we'll be in good shape sometime in September.

    Also, we have a transcript editor where you can easily split segments and assign speakers.
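
Here is a rough sketch of what the two-channel model implies (hypothetical TypeScript, not the actual hyprnote/OWhisper code): each channel produces its own transcript segments, and the channel label alone distinguishes "what you said" from "what you heard"; splitting the speaker channel into individual participants is left to a future diarization step or to manual editing.

```typescript
// Hypothetical sketch: two independently transcribed audio channels merged
// into one chronological transcript. Names and types are illustrative only.

type Channel = "mic" | "speaker";

interface Segment {
  channel: Channel;   // which audio source this segment came from
  startMs: number;    // start time relative to session start
  endMs: number;
  text: string;
  speaker?: string;   // filled in later, e.g. via diarization or manual edit
}

// Merge per-channel transcripts into a single timeline. The channel already
// tells us "you" vs "others"; telling the other participants apart within
// the speaker channel would still require a diarization model.
function mergeTranscripts(mic: Segment[], speaker: Segment[]): Segment[] {
  return [...mic, ...speaker].sort((a, b) => a.startMs - b.startMs);
}

// Example with fake segments.
const micSegments: Segment[] = [
  { channel: "mic", startMs: 0, endMs: 2100, text: "Hi, can everyone hear me?" },
];
const speakerSegments: Segment[] = [
  { channel: "speaker", startMs: 2300, endMs: 4000, text: "Yes, loud and clear." },
];

for (const seg of mergeTranscripts(micSegments, speakerSegments)) {
  const who = seg.channel === "mic" ? "you" : "others";
  console.log(`[${who}] ${seg.text}`);
}
```

The point of the channel split is that it gives a reliable "me vs. them" separation without any model; an AI model only becomes necessary to distinguish the other participants from each other.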