Comment by SyneRyder

6 hours ago

I did the exact same voice memo thing too, except I had Claude make an Android app to record the audio and send it to Whisper. In the end I had the app just email the transcription and trigger Claude that way (i.e. receiving the email triggers my PC to wake up Claude), rather than sending Claude the audio file directly.

My reverse audio reply loop is convoluted: I have Claude generate its TTS files from Whisper/Mistral and upload them to a server with an RSS file it updates, so I can play them in my podcast app (AntennaPod), then send me a notification via Pushover that the reply is waiting. I ended up building an MCP tool for that workflow, so Claude really just calls the MCP tool with the text of what it wants to say; everything else is a deterministic program doing the work.
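
For anyone wanting to try something similar, here is a rough sketch of what the "update the RSS file" step could look like as a deterministic function. This is not the actual tool described above: the function name `add_episode`, the `audio/mpeg` type, and the example URL are all assumptions made for illustration, and the TTS, upload, and Pushover steps are left out.

```python
import xml.etree.ElementTree as ET
from datetime import datetime, timezone
from email.utils import format_datetime


def add_episode(feed_xml, title, audio_url, size_bytes, when):
    """Return feed_xml with a new <item> placed ahead of existing items.

    Hypothetical helper: the name and parameters are illustrative only.
    """
    root = ET.fromstring(feed_xml)
    channel = root.find("channel")
    if channel is None:
        raise ValueError("feed has no <channel> element")

    item = ET.Element("item")
    ET.SubElement(item, "title").text = title
    ET.SubElement(item, "guid").text = audio_url
    # RSS 2.0 expects RFC 822 style dates, which format_datetime produces.
    ET.SubElement(item, "pubDate").text = format_datetime(when)
    # Podcast apps find the audio through the <enclosure> element.
    ET.SubElement(
        item,
        "enclosure",
        url=audio_url,
        length=str(size_bytes),
        type="audio/mpeg",  # assumption: replies are stored as MP3
    )

    # Insert before the first existing <item>, or at the end if there is none.
    children = list(channel)
    position = next(
        (i for i, child in enumerate(children) if child.tag == "item"),
        len(children),
    )
    channel.insert(position, item)
    return ET.tostring(root, encoding="unicode")


if __name__ == "__main__":
    feed = '<rss version="2.0"><channel><title>Replies</title></channel></rss>'
    print(add_episode(
        feed,
        "Reply",
        "https://example.com/reply.mp3",  # placeholder URL
        1234,
        datetime.now(timezone.utc),
    ))
```

Keeping this as plain code behind the MCP tool means the model only supplies the text to speak, and the feed format never depends on what the model decides to write.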

Memory is really useful to have - it can just be a bucket of searchable Markdown files. It's also useful to have a "reminders to self" Markdown file that Claude reads each time, and that Claude can update. I don't continue the same context window, and that "reminders to self" plus the ability to read previous emails in the conversation seems to be enough to keep the context going for me.
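
As a minimal sketch of the "bucket of searchable Markdown files" idea, assuming a flat directory of `.md` files and a plain substring search: the names `search_memory` and `add_reminder` are made up for illustration, and a real setup might use grep or a proper index instead.

```python
from pathlib import Path


def search_memory(memory_dir, query):
    """Return (file name, line number, line) for each line containing query.

    Case-insensitive substring match over every .md file in memory_dir.
    """
    needle = query.lower()
    hits = []
    for path in sorted(Path(memory_dir).glob("*.md")):
        lines = path.read_text(encoding="utf-8").splitlines()
        for number, line in enumerate(lines, start=1):
            if needle in line.lower():
                hits.append((path.name, number, line.strip()))
    return hits


def add_reminder(reminders_file, text):
    """Append one bullet to the "reminders to self" file, creating it if needed."""
    with open(reminders_file, "a", encoding="utf-8") as handle:
        handle.write(f"- {text}\n")
```

Exposing just these two operations as tools would let the model look things up and leave itself notes without carrying the previous context window forward.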

You'll feel better if you know exactly how your Claw is locked down. Mine doesn't have the open email access others are granting, not at all. Claude gets a bit grumpy about that and keeps begging for more access :)
