
Comment by cornholio

7 months ago

I wonder when we will see generative AI codecs in production. The concept seems simple enough: the encoder knows the exact model the decoder will use to generate the final image starting from a handful of pixels, and optimizes for the lowest bitrate with minimum subjective quality loss. For example, it could let the decoder generate a random human face in the crowd, or give it more data in that area to steer it towards the face of the team mascot, as the case may be.

At the absolute compression limit, it's no longer video, but a machine description of the scene conceptually equivalent to a textual script.

There was that Nvidia video upsampling, or whatever it's called. It was putting age spots on every face when it was blurry, and it used too many resources, as far as I can remember.

And then that script gets processed on hundreds of GPUs in the cloud and the video gets streamed to the client. Wait.