
Comment by ericpauley

3 years ago

I would assume that any image reformatting or exif stripping by online platforms would protect against this. Yet another good reason to include this when developing apps.

This isn't an exif issue.

This isn't a metadata issue.

An underlying IO library changed its behavior so that instead of truncating a file when opened with the "w" mode (as fopen and similar have always done, and this API did originally), it left the old data there. If the edited image is smaller than the original file, then the tail of the original image is left in the file. There is enough information to just directly decompress that data and so recover the pixel data from the end of the image.

You're not necessarily recovering the edited image data, just whatever happens to be at the end of the image. If you are unlucky (or lucky depending on PoV) the trailing data still contains the pixel data from the original image - in principle the risk is proportional to how close to the bottom right of the image the edits were (depending on image format).
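To make the failure mode concrete, here is a minimal sketch of overwriting a file with and without truncation. It is not the affected library's code: the `overwrite` and `demo` helpers, the file name, and the byte strings are all made up for illustration, and the only thing being shown is the effect of omitting `O_TRUNC` when opening an existing file for writing.

```python
import os
import tempfile


def overwrite(path: str, data: bytes, truncate: bool) -> None:
    """Write data at the start of path, optionally truncating first.

    Hypothetical helper: truncate=False mimics a "w" mode that no longer
    discards the old contents.
    """
    flags = os.O_WRONLY | os.O_CREAT
    if truncate:
        flags |= os.O_TRUNC
    fd = os.open(path, flags, 0o600)
    try:
        os.write(fd, data)
    finally:
        os.close(fd)


def demo():
    """Return (file contents without truncation, file contents with truncation)."""
    original = b"ORIGINAL-IMAGE-DATA"  # stands in for the larger, unedited image
    edited = b"edited"                 # stands in for the smaller, edited image
    with tempfile.TemporaryDirectory() as tmp:
        path = os.path.join(tmp, "image.bin")

        # Buggy behavior: the old file is not truncated before writing.
        overwrite(path, original, truncate=True)
        overwrite(path, edited, truncate=False)
        with open(path, "rb") as f:
            leaked = f.read()

        # Expected behavior: the old file is truncated before writing.
        overwrite(path, original, truncate=True)
        overwrite(path, edited, truncate=True)
        with open(path, "rb") as f:
            clean = f.read()
    return leaked, clean


if __name__ == "__main__":
    leaked, clean = demo()
    print(leaked)  # new bytes followed by the tail of the old file
    print(clean)   # only the new bytes
```

Nothing in the in-memory data differs between the two cases; only the open flags do, which is why the leak depends purely on the new file being shorter than the old one.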

  • Not saying it is. Sensible exif stripping (re-serialization) also has the upside of removing trailing data, which would prevent this.

    • No, the whole point of this bug is that more filtering or stripping would not fix or prevent it. The problem is not some kind of "trailing data in memory" issue.

      The bug is that you say "write to this file", which is meant to erase the existing file if one exists, but the underlying library either had a serious regression or intentionally broke API compatibility, and changed the behavior to not erase the existing data. Your exif stripping + reserialization would write the new data down and the trailing data from the original file would still be present, i.e. exactly what is happening in this bug.

      No amount of processing in memory, no amount of reserialization, no amount of data filtering prevents this bug. The bug occurs at the point of IO, because the IO is meant to have erased the original file, and it did not, so if you write fewer bytes to the destination file than were present in the file being overwritten, the tail of the overwritten file remains and is leaked.

      To make it very clear that this is not an error in processing the image: if you open "image1.png" (or whatever format), edit it, save it over a different file that already exists, say "image2.png", and then send image2.png to someone, this bug will allow the recipient to extract the trailing data of the original image2.png. It would not show any information about the original image1.png.


EXIF stripping won't necessarily catch it (but probably would in most instances - depends on how you do it), but reformatting or reencoding will.

  • I’m guessing most exif stripping would deserialize the image and write a new file, so unless that has the same bug as this (overwriting the existing file without truncation), it ought to work?

    • Discord strips EXIF but the author was still able to unredact the images they'd posted there.

      Some implementations of EXIF stripping might help, but it's not guaranteed.


    • A naive approach to stripping EXIF from a PNG would be to parse up to the start of the first eXIf chunk, discard the contents of that chunk, and then include the rest of the file verbatim without actually parsing anything.

      But yes, a more sensibly coded EXIF stripper would deserialise and reserialise. Unfortunately I am no longer able to assume that programmers will behave sensibly.

      Edit: Also, the PNGs generated by Markup don't contain EXIF in the first place, so an EXIF stripper could reasonably decide that no changes are necessary at all.
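The difference between the two kinds of stripper discussed above can be sketched roughly as follows. This is an illustration only, not any platform's actual implementation: all function names are hypothetical, the "rewrite" variant copies chunks rather than decoding and re-encoding pixels, and CRCs are not validated. It relies only on the standard PNG layout (8-byte signature, then chunks of 4-byte length, 4-byte type, data, 4-byte CRC, ending with `IEND`).

```python
import struct
import zlib

PNG_SIGNATURE = b"\x89PNG\r\n\x1a\n"


def make_chunk(chunk_type: bytes, data: bytes = b"") -> bytes:
    """Build one PNG chunk (helper for constructing sample data)."""
    crc = zlib.crc32(chunk_type + data) & 0xFFFFFFFF
    return struct.pack(">I", len(data)) + chunk_type + data + struct.pack(">I", crc)


def iter_chunks(blob: bytes):
    """Yield (type, start, end) for each chunk, stopping after IEND."""
    if not blob.startswith(PNG_SIGNATURE):
        raise ValueError("not a PNG")
    pos = len(PNG_SIGNATURE)
    while pos + 12 <= len(blob):
        (length,) = struct.unpack(">I", blob[pos:pos + 4])
        chunk_type = blob[pos + 4:pos + 8]
        end = pos + 12 + length  # length field + type + data + CRC
        if end > len(blob):
            raise ValueError("truncated chunk")
        yield chunk_type, pos, end
        if chunk_type == b"IEND":
            return
        pos = end
    raise ValueError("no IEND chunk found")


def strip_exif_naive(blob: bytes) -> bytes:
    """Cut out the first eXIf chunk and copy everything else verbatim.

    Anything after IEND, including leftover bytes from an older file,
    is copied along. With no eXIf chunk the input is returned unchanged.
    """
    for chunk_type, start, end in iter_chunks(blob):
        if chunk_type == b"eXIf":
            return blob[:start] + blob[end:]
    return blob


def strip_exif_rewrite(blob: bytes) -> bytes:
    """Re-emit the chunks up to and including IEND, minus eXIf chunks.

    Bytes after IEND are never copied, so trailing data is dropped.
    """
    out = bytearray(PNG_SIGNATURE)
    for chunk_type, start, end in iter_chunks(blob):
        if chunk_type != b"eXIf":
            out += blob[start:end]
    return bytes(out)
```

Given a file consisting of a valid PNG followed by leftover bytes, `strip_exif_naive` returns output that still ends with those leftover bytes, while `strip_exif_rewrite` returns output that ends at `IEND`. Either function only helps if its own output is then written to a properly truncated file.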
