Comment by zhyder
3 days ago
V v cool: first time I've seen such expressiveness in TTS for laughs, coughs, yelling about a fire, etc!
What're the recommended GPU cloud providers for using such open-weights models?
Thank you!! We personally used Quickpod and Runpod the most. But you can try it now on HF Spaces without spinning up GPUs yourself!
https://huggingface.co/spaces/nari-labs/Dia-1.6B
> first time I've seen such expressiveness in TTS for laughs, coughs, yelling about a fire, etc!
The old Bark TTS is noisy and often unreliable, but pretty great at coughs, throat clears, and yelling. Even dialogs... sometimes. Same Dia prompt in Bark: https://vocaroo.com/12HsMlm1NGdv
Dia sounds much clearer and more reliable. Wild what 2 people can do in 3 months.