Comment by christiansafka

7 months ago

Appreciate the feedback. On the video side, we currently synchronize playback with the translated audio whenever possible, so the moment you started speaking lines up with the moment the translated audio starts. As mentioned in another comment, we're still working on audio mimicking (voice cloning, then inflection transfer). Our model does a lot that Google Translate doesn't, even just on the translation side, such as taking into account who you're talking to in the meeting and the conversation context. We also have to do it much faster, which means working on smaller audio chunks at a time!
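To make the video sync idea concrete, here is a rough sketch of the timing arithmetic only, not our actual pipeline. The names `Utterance`, `video_delay`, and `shifted_timestamp` are made up for illustration, and it assumes both timestamps are in seconds on a shared clock.

```python
from dataclasses import dataclass


@dataclass
class Utterance:
    # Hypothetical fields, both in seconds on the same clock.
    speech_start: float            # when the speaker began talking
    translated_audio_start: float  # when the translated audio begins playing


def video_delay(u: Utterance) -> float:
    """Delay to apply to the video so speech onset lines up with translated audio.

    Clamped at zero: video cannot be shown before it was captured.
    """
    return max(0.0, u.translated_audio_start - u.speech_start)


def shifted_timestamp(frame_ts: float, u: Utterance) -> float:
    """Presentation time for a video frame captured at frame_ts."""
    return frame_ts + video_delay(u)


if __name__ == "__main__":
    u = Utterance(speech_start=10.0, translated_audio_start=11.5)
    print(video_delay(u))
    print(shifted_timestamp(10.0, u))
```

The frame captured at the moment you started speaking gets presented at the moment the translated audio starts, and every later frame in that utterance is shifted by the same amount.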