← Back to context

Comment by fennecbutt

2 days ago

What's wrong with the wakeword stuff?

Great timing as I was looking into it yesterday as was thinking about writing my own set of agents to run house stuff. I don't want to spent loads of time on voice interaction so HA wakeword stuff would've been useful. If not I'll bypass HA for voice and really only use HA via mcp.

I can do fw dev for micros...but omg do I not want to spend the time looking thru a datasheet and getting something to run efficiently myself these days.

You can use the vendor supported wakewords, and they are generally pretty good.

However-> These are device specific. The devices I purchased for this purpose have very few vendor supported wakewords, but even more prominently, refuse to integrate with HA. Possible firmware issue, but I have reloaded the firmware 30 times. I dont necessarily want to purchase something else for this purpose. Which is where building a bespoke HA audio box becomes its own can of worms.

But if you want a custom wake word, or more like a wake phrase, you go down a rabbit hole of training/cost/memory etc that starts to get annoying fast.

I kind of know I am being unreasonable. I dont want a device that just ships off everything it hears to an LLM, even local, that would suck. I just want a third way.

Then theres other stuff. Like HA has a hard time with providing context to an LLM, because it sends the whole conversation thus far off to the LLM for context. It can get really weird really quickly. This caused me a lot of issues with lights for example. It would remember switching a light on, and if that was in the context, would refuse to switch it on a second time if it turned off due to a rule or manual intervention. But if you dont send the context, you cant have deeper conversations. You cant ask subsequent questions basically.

  • On my new AMD laptop, it took about 90 minutes to run 50k training rounds on OpenWakeWord.

    It's not really a big burden.

    A tiny AI running locally is the third option you want. That's the only reasonable way to do configurable wake word detection