Comment by landl0rd
3 days ago
Flip side is a solution where I can have a book without an audiobook auto-generated (or use an existing ebook rather than paying audible $30 for their version) and it's "good enough" is a legit improvement. AI generated isn't as good but it's better than nothing. Also, being able to interrupt and ask for more detail/context would be pretty nice. Like I'm reading some Pynchon and I have to stop sometimes and look up the name of a reference to some product nobody knows now, stuff like that.
If you're willing to forgo the interactive LLM bit, kokoro-tts (just a script using Kokoro-ONNX) takes epubs and outputs a series of wavs or mp3s that need to be stitched together into chapters or audiobook m4a with some ffmpeg fu. I've listened to several generated audiobooks, and found them pretty good. Some nice generic narration-like prosody. It uses espeak-ng to generate phonemes and passes those to the model to render voice, so it generally pronounces things quite well. It comes with a handful of nice voices and several can be blended, but no easy voice cloning, like chatterbox, that I'm aware of.
https://github.com/nazdridoy/kokoro-tts/blob/main/kokoro-tts
I've used this repo and its great. It was one many things that inspired me in building a similar tool. I built https://desktop.with.audio
It was important to me that it be 100% private and local and wanted it to be a one time payment solution. Because it locally process your data it can be a one time payment text to speech app.
If you are interested in creating audiobooks from epubs check this demo: https://www.youtube.com/watch?v=pOHzo6Oq0lQ If you are interested in listening while reading with text highlighting check these demos: - https://www.youtube.com/watch?v=8yJ-lsbzAuw - https://www.youtube.com/watch?v=y8wi4d8xmnw
audiblez[1] does exactly that and handles the ffmpeg fu part for you, and will output a m4b file which audio book players will support.
1. https://github.com/santinic/audiblez
I've been using epub2tts / epub2tts-edge and its been working well for me. Converts into m4b