Comment by 2dvisio

2 days ago

Still find the Copilot transcripts orders of magnitude worse than something like Wispr Flow; they tend to hallucinate constantly and do not adapt to a company's context (which Copilot has access to...). I am talking about acronyms of products/teams, names of people (even when they are in the call), etc.

Can anyone familiar with the technical details shed light on why this is so?

Is it because of a globally trained model (as opposed to one fine-tuned on context-specific data), or because of using different classes of models?

  • Neither Copilot nor Flow can natively handle audio, to my understanding, so there is already a transcription model converting it to text, which GPT then tries to summarise.

    It could be they simply use a mediocre transcription model. Wispr is amazing, but it would hurt their pride to use a competitor.

    But I feel it's more likely that the experience is: GPT didn't actually improve on the raw transcription, it just made it worse. Especially as any mis-transcribed words may trip it up and make it misunderstand while producing the summary.

    If I can choose between a potentially confused and misunderstood summary and a badly spellchecked (flipped-words) raw transcription, I would trust the latter.
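    To make the "adapt to a company's context" idea concrete: one cheap post-processing step (separate from whatever Copilot or Wispr actually do internally, which I don't know) is fuzzy-matching transcript words against a known vocabulary of acronyms and names. The vocabulary and function names below are hypothetical, just a minimal sketch using Python's stdlib:

    ```python
    import difflib

    # Hypothetical company vocabulary a generic speech model won't know:
    # product acronyms, team names, attendee names.
    CONTEXT_VOCAB = ["Wispr", "Copilot", "ACME-QX", "DataOps"]

    def correct_with_context(words, vocab=CONTEXT_VOCAB, cutoff=0.75):
        """Snap each word to the closest context term when it's similar enough."""
        corrected = []
        for word in words:
            match = difflib.get_close_matches(word, vocab, n=1, cutoff=cutoff)
            corrected.append(match[0] if match else word)
        return corrected

    # "ACME-QX" garbled by the transcription model into "ACME-QK":
    print(correct_with_context(["the", "ACME-QK", "team", "met", "today"]))
    ```

    This obviously can't recover words the model dropped entirely, but it shows why feeding context the system already has (attendee lists, product glossaries) should be low-hanging fruit.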

Yeah, I didn't even think about advanced meeting-summary bots. Just raw word-for-word transcription, please. Wispr is pretty great.