
Comment by olliej

3 years ago

This isn't an exif issue.

This isn't a metadata issue.

An underlying IO library changed its behavior so that instead of truncating a file when opened with the "w" mode (as fopen and similar have always done, and this API did originally), it left the old data there. If the edited image is smaller than the original file, then the tail of the original image is left in the file. There is enough information to just directly decompress that data and so recover the pixel data from the end of the image.

You're not necessarily recovering the edited image data, just whatever happens to be at the end of the image. If you are unlucky (or lucky depending on PoV) the trailing data still contains the pixel data from the original image - in principle the risk is proportional to how close to the bottom right of the image the edits were (depending on image format).
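The failure mode described above can be reproduced in a few lines. This is only a sketch, not the affected library's code: the helper name `overwrite` and the byte strings are made up for illustration, and the missing truncation is simulated by leaving `O_TRUNC` out of the flags passed to `os.open`.

```python
import os
import tempfile


def overwrite(path, data, truncate):
    """Write data at the start of path, optionally truncating the file first."""
    # O_BINARY only exists on Windows; elsewhere it contributes 0.
    flags = os.O_WRONLY | os.O_CREAT | getattr(os, "O_BINARY", 0)
    if truncate:
        flags |= os.O_TRUNC  # what the "w" mode is expected to imply
    fd = os.open(path, flags, 0o600)
    # Wrapping an existing descriptor does not truncate the file by itself.
    with os.fdopen(fd, "wb") as f:
        f.write(data)


def demo(truncate):
    """Save a short 'edited' file over a longer 'original' and return the result."""
    original = b"ORIGINAL-HEADER|original-pixel-data-at-the-end"
    edited = b"EDITED|small"
    with tempfile.TemporaryDirectory() as tmp:
        path = os.path.join(tmp, "image.bin")  # hypothetical file name
        with open(path, "wb") as f:
            f.write(original)
        overwrite(path, edited, truncate)
        with open(path, "rb") as f:
            return original, edited, f.read()


if __name__ == "__main__":
    for truncate in (True, False):
        original, edited, on_disk = demo(truncate)
        print(truncate, on_disk[len(edited):])
```

Without truncation, everything in the old file past the length of the new data is still on disk, regardless of what was done to the image in memory before the write.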

Not saying it is. Sensible exif stripping (re-serialization) also has the upside of removing trailing data, which would prevent this.

  • No, the whole point of this bug is that more filtering or stripping would not fix or prevent it. The problem is not some kind of "trailing data in memory" issue.

    The bug is that you say "write to this file", which is meant to erase the existing file if one exists, but the underlying library either had a serious regression or intentionally broke API compatibility, and changed the behavior so that it does not erase the existing data. Your exif stripping + reserialization would write the new data down and the trailing data from the original file would still be present, i.e. exactly what is happening in this bug.

    No amount of processing in memory, no amount of reserialization, no amount of data filtering prevents this bug. The bug occurs at the point of IO, because the IO is meant to have erased the original file, and it did not. So if you write fewer bytes to the destination file than were present in the file being overwritten, the tail of the overwritten file remains and is leaked.

    To make it very clear that this is not an error in processing the image: if you open "image1.png" (or whatever format), edit it, save it over a different file that already exists, say "image2.png", and then send image2.png to someone, this bug allows the recipient to extract the trailing data of the original image2.png. It would not reveal any information about the original image1.png.

    • This is not the case when the exif stripping happens on the service side (by online platforms, as in my original comment). Yes, anything happening before the save is useless because the trailing data is kept. But if a service (e.g., Facebook) then does exif stripping via re-serialization, the trailing data is lost.
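To illustrate why a later re-serialization drops the leftover bytes, here is a rough sketch of a parser that walks PNG chunks and stops at IEND, which marks the end of a PNG stream. A real service would fully decode and re-encode the image; this only shows that anything after IEND is outside the image proper. The helper names `make_chunk` and `split_at_iend` are hypothetical, and the all-zero IHDR payload is a placeholder, not a valid image header.

```python
import struct
import zlib

PNG_SIGNATURE = b"\x89PNG\r\n\x1a\n"


def make_chunk(kind, payload=b""):
    """Build one PNG chunk: length, type, payload, CRC over type + payload."""
    crc = zlib.crc32(kind + payload) & 0xFFFFFFFF
    return struct.pack(">I", len(payload)) + kind + payload + struct.pack(">I", crc)


def split_at_iend(data):
    """Return (png_stream, trailing_bytes), splitting right after the IEND chunk."""
    if not data.startswith(PNG_SIGNATURE):
        raise ValueError("not a PNG stream")
    pos = len(PNG_SIGNATURE)
    while pos + 8 <= len(data):
        (length,) = struct.unpack(">I", data[pos:pos + 4])
        kind = data[pos + 4:pos + 8]
        pos += 12 + length  # 4 length + 4 type + payload + 4 CRC
        if pos > len(data):
            raise ValueError("truncated chunk")
        if kind == b"IEND":
            return data[:pos], data[pos:]
    raise ValueError("no IEND chunk found")


if __name__ == "__main__":
    png = PNG_SIGNATURE + make_chunk(b"IHDR", bytes(13)) + make_chunk(b"IEND")
    clean, trailing = split_at_iend(png + b"tail of the overwritten file")
    print(len(clean), trailing)
```

Writing out only the first part of the split (or a freshly encoded image) yields a file without the old tail, which is the effect described for service-side processing.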
