Comment by echelon
8 months ago
This is close to SOTA emotional performance, at least for the female voices.
I trust the human scores in the paper. At least my ear aligns with that figure.
With stuff like this coming out in the open, I wonder if ElevenLabs will maintain its huge ARR lead in the field. I really don't see how they can stay ahead when their offering is getting trounced by open models.
Hmmmm… what is your opinion on the examples showcased here vs the ones on the Dia demo page?
https://yummy-fir-7a4.notion.site/dia
I am not sure why, but I find the pacing of the Parakeet-based models (like Dia) to be much more realistic.
11labs is facing a real competitor.