Comment by golly_ned

2 years ago

If you've seen the video, it's very apparent it's a product video, not a tech demo. They cut out the latencies to make a compelling product video.

I wasn't at all under the impression they were showcasing TTS or low latencies as product features. I don't find the marketing misleading at all, and find these criticisms don't hit the mark.

https://www.youtube.com/watch?v=UIZAiXYceBI

It's not just cutting. The answers were obtained by feeding still photos into the model together with detailed text instructions that explained the context and the task, giving some examples first and using careful chain-of-thought-style prompting (see e.g. https://developers.googleblog.com/2023/12/how-its-made-gemin...). My guess is that the video was produced by a different team, entirely after the Gemini outputs were generated, rather than during or before.