Comment by anyfoo
4 years ago
That's essentially a variant of "bigger pixels". Just like that approach, your algorithm cannot guarantee that an unknown codec will still let the whole thing perform adequately.
Even if you train your model to work best for all existing codecs (I assume that's where the "ML" in your ML model comes in), the no-free-lunch theorem pretty much tells us that it can't always perform well on codecs it does not know about.
(And so does entropy. Taken to absurd levels: if your codec reduces everything to a single pixel, and the only color that pixel can have is blue, then the only place left to encode information is the length of the video itself.)
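To put a rough number on that degenerate case (with illustrative parameters of my own choosing, not anything from the thread): if length is the only remaining degree of freedom, the whole video carries at most log2 of the number of distinguishable lengths.

    import math

    # Toy capacity bound for the degenerate one-blue-pixel codec:
    # the frame count is the only degree of freedom left.
    lengths = 3600 * 30            # assume a one-hour cap at 30 fps
    print(math.log2(lengths))      # ~16.7 bits for the entire video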
True, it's not guaranteed to perform well with unknown or new codecs. But the implicit assumption is that YouTube will use codecs that preserve what videos look like, not arbitrary codecs. If that assumption holds, the image recognition model will keep working even with new codecs.
That's the thing, though: "looks like" breaks down pretty quickly with things that aren't real images. It even breaks down with real but not-so-common images: https://www.youtube.com/watch?v=r6Rp-uo6HmI
So one question would be: Does your image generation approach preserve a higher information density than big enough pixels?
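For scale, with illustrative numbers of my own rather than anything measured: a coarse 16x16 grid of binary "bigger pixels" carries one bit per block, while picking one of 256 recognizable image categories carries one byte per frame.

    import math

    # Illustrative comparison, not a measurement: bits per frame per scheme.
    big_pixel_bits = 16 * 16           # one bit per black/white block
    category_bits = math.log2(256)     # one image category per frame
    print(big_pixel_bits, "vs", category_bits)  # 256 vs 8.0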
Why would you assume that the images in my algorithm aren't real images? For example, you could use 256 categories from ImageNet as your keys: an image of a dog is 00000000, a tree is 00000001, a car is 00000010, and so on.
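A minimal sketch of that scheme, assuming a pretrained classifier (torchvision's ResNet-50 here) and a hypothetical exemplars/ directory holding one canonical image per class index, neither of which is specified above:

    import torch
    from PIL import Image
    from torchvision import models

    # Pretrained ImageNet classifier used as the decoder.
    weights = models.ResNet50_Weights.IMAGENET1K_V2
    model = models.resnet50(weights=weights).eval()
    preprocess = weights.transforms()  # standard resize/crop/normalize

    def encode_byte(value: int) -> Image.Image:
        # One byte -> one frame: pick the canonical image for class `value`.
        # "exemplars/<value>.jpg" is a hypothetical layout, not from the thread.
        return Image.open(f"exemplars/{value}.jpg").convert("RGB")

    def decode_frame(frame: Image.Image) -> int:
        # Classify the (possibly recompressed) frame, restricted to the
        # 256 classes that make up the byte alphabet.
        with torch.no_grad():
            logits = model(preprocess(frame).unsqueeze(0))[0]
        return int(torch.argmax(logits[:256]))

The idea being: as long as a new codec keeps a dog recognizable as a dog, the decoder survives recompression, which is exactly the "preserves what videos look like" assumption.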