Comment by Versipelle
3 days ago
This is really impressive; we're getting close to a dream of mine: the ability to generate proper audiobooks from EPUBs. Not just a robotic single voice for everything, but different, consistent voices for each protagonist, with the LLM analyzing the text to guess which voice to use and add an appropriate tone, much like a voice actor would do.
I've tried "EPUB to audiobook" tools, but they are really miles behind what a real narrator accomplishes and make the audiobook impossible to engage with.
Realistic voice acting for audio books, realistic images for each page, realistic videos for each page, oh wait I just created a movie, maybe I can change the plot? Oh wait I just created a video game
Now do it in VR and make it fully interactive.
Wouldn’t it be more desirable to hear an actual human on an audiobook? Ideally the author?
> Wouldn’t it be more desirable to hear an actual human on an audiobook? Ideally the author?
Of course, but it's not always available.
For example, I would love an audiobook for Stanisław Lem's "The Invincible," as I just finished its video game adaptation, yet it simply doesn't exist in my native language.
It's quite seldom that the author narrates the audiobooks I listen to, and sometimes the narrator does a horrible job, butchering the characters with exaggerated tones.
Why a human? There are many cases where I like a book but dislike the audiobook speaker, so I essentially can't listen to that book anymore. With a machine, I can tweak the voice to my heart's content.
And get a completely wrong/bland but custom read of the book. Reading is much more than simply transforming text to audio.
With 1M+ new books every year, that’s not possible for all but the few most popular.
It'd be nice if there were mainstream releases on GBC/GBA/PSP again too! But apparently if there's no money in something then people don't really wanna do it.
Honestly, I’d say that’s true only for the author. Anyone else is just going to be interpreting the words to understand how to best convey the character / emotion / situation / etc., just like an AI will have to do. If an AI can do that more effectively than a human, why not?
The author could be better, because they at least have other info beyond the text to rely on, they can go off-script or add little details, etc.
As somebody who has listened to hundreds of audiobooks, I can tell you authors are generally not the best choice to voice their own work. They may know every intent, but they are writers, not actors.
The most skilled readers will make you want to read books _just because they narrated them_. They add a unique quality to the story that you do not get from reading it yourself or from watching a video adaptation.
Currently I'm in The Age of Madness, read by Steven Pacey. He's fantastic. The late Roy Dotrice is worth a mention as well, for voicing Game of Thrones and claiming the Guinness world record for most distinct voices (224) in one series.
It will be awesome if we can create readings automatically, but it will be a while before TTS can compete with the best readers out there.
You really think people writing these papers actually have good speaking voices? LOL, there's a reason not everyone could be an audiobook maker or podcaster: a lot of people's voices suck for audiobooks.