Comment by steinvakt2

1 day ago

This is not a new model. It also hallucinates a lot, is very heavy and slow at inference, and is bad at multilingual input.

Edit: I'm talking purely about speech to text (STT). Not sure about the other things this can do.

It has some perks and is a bit more expressive in some cases, but overall it is trained on really noisy data, uses more memory, and isn't that fast. I'm talking about the (7b?) version that they released and then quickly removed (vibevoice-community on GitHub). I still use Chatterbox Turbo and sometimes Qwen TTS.

Yeah, I don't get why it is suddenly getting so much attention today. It is all over Twitter too.

It is not good for text to speech (TTS) either. I have been trying it for a few days. First of all, the documentation for the 1.5B model is missing. The 0.5B realtime model is a shit model. I was converting text line by line, and it was randomly adding music and couldn't handle special characters like "…".

I am really disappointed with this model, to say the least.

  • The 7B parameter Vibevoice TTS model is still the most impressive local TTS model I've tried. It was pulled by Microsoft a few days after its release due to "abuse potential", but it can be found in various community-maintained Hugging Face repos.

  • Yep, it seems this was trained on a large amount of podcasts with ad jingles or phone call queues with elevator music. I was also pretty disappointed when I ran the TTS last week.

You saved us a lot of time here... I unstarred the repo.

moving on....

  • I don't really pay attention to stars. Do people use them as bookmarks? Why would you star a repo if you knew so little about it?

    • Stars for me are basically "this might be interesting but I don't have time to look at it now, hopefully I'll think about it later and give it a second look".

    • I exclusively use stars as bookmarks, which is why I always found it strange when people talked about lots of stars meaning high quality or trustworthy. I've learned since then that I'm probably in the minority (both in using stars as bookmarks and in not caring about how many stars a repo has).

    • Judging by how many people apparently are paying bots to give their lazily vibe-coded repos thousands of stars, it seems like people both simultaneously take stars seriously while not taking them seriously at all. It breaks my brain.

I'm shocked, shocked to find that Microsoft takes credit for a slow, unoriginal product that doesn't actually do what it advertises.

  • Imagine the balls it took to willingly attach the Microsoft label to the front of the product that is Teams.

    • I mean the same can be said about most versions of Windows as well. People act like Windows 11 is where it all went sour, but I've personally kind of hated it since Windows XP.

      I feel like a recurring pattern with Microsoft is to create something quickly, market it aggressively and push for everyone to use it immediately, and only once it is installed everywhere do people suddenly realize how terrible it is, but it's too late to change.
