Comment by dehrmann
20 hours ago
Not trolling, but I'd bet something that's augmented with generative AI. Not to the level of describing scenes with words, but context-aware interpolation.
I don't want my video decoder inventing details which aren't there. I would much rather have obvious compression artifacts than a codec where the "compression artifacts" look like perfectly realistic, high-quality hallucinated details.
In the case of many textures (grass, sand, hair, skin, etc.) it makes little difference whether the high-frequency details are reproduced exactly or hallucinated. E.g. it doesn't matter whether the 1262nd blade of grass from the left side is bending to the left or to the right.
And in the case of many others, it makes a very significant difference. And a codec doesn't have enough information to know.
Imagine a criminal investigation. A witness happened to take a video as the perpetrator committed the crime. In the video, you can clearly see a recognizable detail on the perpetrator's body in high quality; a birthmark perhaps. This rules out the main suspect -- but can we trust that the birthmark actually exists and isn't hallucinated? Would a non-AI codec have just shown a blob of pixels that is clearly a compression artifact and can't be determined one way or the other? Or would a non-AI codec have contained actual image data of the birthmark in sufficient detail?
Using AI to introduce realistic-looking details where there were none before (which is what your proposed AI codec inherently does) should never happen automatically.
https://blogs.nvidia.com/blog/rtx-video-super-resolution/
We already have some of the stepping stones for this. But honestly it works much better for upscaling poor-quality streams; on a better-quality stream it just gives things a weird feel.
AI embeddings can be seen as a very advanced form of lossy compression.
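To make the lossy-compression point concrete, here is a toy sketch, not a neural codec: a fixed block average stands in for a learned embedding, and every name in it (`encode`, `decode`, `mse`, the sample signals) is made up for illustration. It shows that two different inputs can collapse to the same compact code, so the decoder has no way of knowing which fine detail was originally there.

```python
# Toy illustration only: block averaging stands in for a learned embedding.
# Real neural codecs learn both the encoder and the decoder.

def encode(signal, block=4):
    """Lossy step: keep one average per block of samples."""
    chunks = [signal[i:i + block] for i in range(0, len(signal), block)]
    return [sum(chunk) / len(chunk) for chunk in chunks]

def decode(code, block=4, length=None):
    """Reconstruct by repeating each average; detail inside a block is gone."""
    out = [value for value in code for _ in range(block)]
    return out if length is None else out[:length]

def mse(a, b):
    """Mean squared error between two equal-length sequences."""
    return sum((x - y) ** 2 for x, y in zip(a, b)) / len(a)

signal_a = [0, 1, 0, 1, 10, 11, 10, 11]
signal_b = [1, 0, 1, 0, 11, 10, 11, 10]  # same coarse shape, different fine detail

code_a = encode(signal_a)
code_b = encode(signal_b)
restored = decode(code_a, length=len(signal_a))

print(code_a == code_b)          # both inputs share one code
print(mse(signal_a, restored))   # reconstruction error for signal_a
```

Whatever the decoder puts back inside each block, whether flat values as here or plausible-looking texture from a generative model, comes from the decoder and not from the encoded data.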
Neural codecs are indeed the future of audio and video compression. A lot of people / organizations are working on them and they are close to being practical. E.g. https://arxiv.org/abs/2502.20762
for sure. macroblock hinting seems like a good place for research.