Very simple script to check which PNG files have trailing data: https://gist.github.com/Juerd/40071ec6d1da9d4610eba4417c8672...
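A minimal stdlib sketch of such a check (this is not the linked gist, just an illustration of the idea): walk the PNG chunks and report any bytes left after the IEND chunk.

```python
# Illustrative check for trailing data after a PNG's IEND chunk.
# Assumes a well-formed PNG; a hardened version would also validate CRCs.
import struct

PNG_SIG = b"\x89PNG\r\n\x1a\n"

def trailing_data(png: bytes) -> bytes:
    """Return whatever bytes follow the IEND chunk (b"" if none)."""
    if png[:8] != PNG_SIG:
        raise ValueError("not a PNG file")
    pos = 8
    while pos + 8 <= len(png):
        length, ctype = struct.unpack(">I4s", png[pos:pos + 8])
        pos += 12 + length          # 4 length + 4 type + data + 4 CRC
        if ctype == b"IEND":
            return png[pos:]        # leftover bytes = possible leak
    raise ValueError("no IEND chunk found")
```

Truncating the file so that only the bytes before the leftover tail remain is one way to repair an affected image.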
I hope that people who host forums, image boards, chat applications, etc., will delete or fix potentially vulnerable images before anyone uses them maliciously.
One way to repair a vulnerable image is to use `optipng -fix`.
>Google was passing "w" to a call to parseMode(), when they should've been passing "wt" (the t stands for truncation). This is an easy mistake, since similar APIs (like POSIX fopen) will truncate by default when you simply pass "w". Not only that, but previous Android releases had parseMode("w") truncate by default too! This change wasn't even documented until some time after the aforementioned bug report was made.
Reading about this silent API change makes me feel like I'm losing brain cells. What's going on with the processes behind Android's development?
This is par for the course for Android, having had to work with it at the hardware enablement level. Google will refactor and break everything in a dot release, and then swear up and down that their code is perfect, even as you point them directly to the commit that caused the issue.
This is clearly unacceptable, but I've seen so much worse.
> and then swear up and down that their code is perfect
Even worse is when they don't do this and just flat-out admit they broke a use case intentionally and label it Won't Fix, because the team that implemented the breaking change is also the team that triages the issue. See logcat access being completely broken in Android 13: https://www.reddit.com/r/tasker/comments/wpt39b/dev_warning_...
That's completely fucked up. Arrogance much?
I have some pretty unique insight into this since I work with AOSP a lot and have worked with a few engineers on Android's core system apps:
Google's engineers working on Android at a system level regularly break basic functionality in the "userspace"*. Google's engineers working on Android apps get early access to the Android versions and work through the resulting bugs, bubbling them back up until they get fixed.
(*userspace being used loosely here, it's all userspace vs being in a kernel, but it's interfaces that are implemented at the OS level and consumed at the app level)
Google is large enough that I'm sure someone will take offense at the implication that such questionable engineering takes place there, but this isn't a story I've heard just once. People working on apps that are part of every GMS-enabled Android image have confirmed this completely bizarre setup on multiple separate occasions.
Of course, this issue did not get fixed in Google’s apps.
It sounds like you didn't look at the actual commit that changed it. It was an overly elaborate refactoring gone wrong, not someone explicitly and clearly deleting a "case 'w'" or whatever.
That isn’t any better!
It sounds like you don't have any understanding of sound engineering if you think modifying the default behavior of this kind of API with no fanfare is okay because "we are elaborately refactoring", whether there was a specific intent or not.
This is absolutely horrifying.
>IMHO, the takeaway here is that API footguns should be treated as security vulnerabilities.
Yeah, especially in this case, due to changing defaults and similar-but-differently-behaving APIs.
Defaults really suck sometimes. But so does not having any. And so many things can become security issues when used just so.
:/
See that's not what happened here. It wasn't that the API had a footgun (I'll leave out "is this API actually good"). It was that someone decided that changing core API behaviour after that library had shipped was acceptable - and it isn't.
That's why shipping a new API requires a lot of time investment in the design of the API: once an API is shipped you can't just change the behavior dramatically.
Earlier related discussion https://news.ycombinator.com/item?id=35207787
This is literally hacker news. I wish HN had more content like this instead of endless topics swirling around money.
In the same vein as aCropalypse, there is the classic mistake beginners make when redacting JPEG documents with a black box.
It leaks info from inside the box due to JPEG compression artifacts.
Here is the proof of concept I quickly wrote to show why this isn't safe: https://github.com/unrealwill/jpguncrop
I'm not following the "uncrop" portion of jpguncrop. Sure, the images are different, but unlike aCropalypse it's not clear how the image is supposed to be reconstructed from the data without already having knowledge what the data was. All this says is "yep, there was something other than a black square there before" and it's fair to say you could figure that out by the presence of the black square in the first place.
The two cropped files are different. They are built deterministically, therefore you know whether the redaction was of a red circle or of a blue circle.
Typically this applies to things like a redacted PDF lossily compressed as an image, where you already have a few candidate words (so you can brute-force). You try them one at a time and see whether the compression artifacts match.
The uncropping algorithm is pretty straightforward in theory: remove the JPEG artifacts, fill the cropped region with a candidate image portion x, compress, compare the produced artifacts to the cropped image's artifacts, then try a neighboring candidate x + eps*N(0,1) and optimize (i.e., random search).
The artifacts are related to Fourier coefficients, so the distance between artifacts isn't too irregular.
The "remove JPEG artifacts" step can range from really simple to really hard depending on the class of problem you have.
If the background image is something digitally generated (like the red and blue circles in our example), or a PDF file, you can get the uncompressed version without mistakes.
If the background image is something like a photo, then you need a little finesse: you estimate the compression level, then run a neural network on the four regions outside the crop (above, below, left and right of the cropped region) to remove the artifacts (something like https://vanceai.com/jpeg-artifact-removal/ but fine-tuned to your specific compression level), so you can estimate the artifacts. Then you search for the image inside the region (possibly with a neural-net prior, though that increases the probability of hallucinated data) such that the JPEG artifacts of the compressed, reassembled image are close to the JPEG artifacts of your cropped image.
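The random search described above can be sketched in miniature. Everything here is illustrative: a crude quantizer stands in for JPEG compression and a single scalar stands in for the cropped image portion x; the point is only the artifact-matching loop.

```python
# Toy sketch of artifact-matching by random search (all names illustrative).
import random

def compress(x: float) -> float:
    """Crude quantizer standing in for JPEG compression."""
    return round(x * 8) / 8

def artifact(x: float) -> float:
    """What compression 'did' to the value (the observable artifact)."""
    return x - compress(x)

def recover(target_artifact: float, steps: int = 2000) -> float:
    """Random search for a candidate whose artifact matches the observed one."""
    rng = random.Random(0)                  # fixed seed for repeatability
    best = rng.uniform(0.0, 1.0)
    best_err = abs(artifact(best) - target_artifact)
    for _ in range(steps):
        cand = best + rng.gauss(0.0, 0.05)  # the x + eps*N(0,1) neighbor
        err = abs(artifact(cand) - target_artifact)
        if err < best_err:
            best, best_err = cand, err
    return best
```

With real JPEGs, `compress` would be an actual encode/decode round trip at the estimated quality level, and the candidate would be a whole image patch rather than a scalar.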
Great find.
"The end result is that the image file is opened without the O_TRUNC flag, so that when the cropped image is written, the original image is not truncated. If the new image file is smaller, the end of the original is left behind."
RIP
I would assume that any image reformatting or exif stripping by online platforms would protect against this. Yet another good reason to include this when developing apps.
This isn't an exif issue.
This isn't a metadata issue.
An underlying IO library changed its behavior so that instead of truncating a file when opened with the "w" mode (as fopen and similar have always done, and this API did originally), it left the old data there. If the edited image is smaller than the original file, then the tail of the original image is left in the file. There is enough information to just directly decompress that data and so recover the pixel data from the end of the image.
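The failure mode is easy to reproduce in miniature. This is an illustrative sketch, not the Android code: it simulates "w"-without-truncation semantics directly by opening with `os.open` and no `O_TRUNC` flag.

```python
# Tiny repro: writing a smaller file without O_TRUNC leaves the old tail behind.
import os
import tempfile

path = os.path.join(tempfile.mkdtemp(), "demo.png")   # illustrative file name

with open(path, "wb") as f:
    f.write(b"ORIGINAL-IMAGE-BYTES" * 4)      # pretend this is the full image

fd = os.open(path, os.O_WRONLY)               # note: no os.O_TRUNC
os.write(fd, b"CROPPED")                      # the smaller "edited" image
os.close(fd)

data = open(path, "rb").read()
assert data.startswith(b"CROPPED")            # new content at the front...
assert data.endswith(b"ORIGINAL-IMAGE-BYTES") # ...old tail still in the file
```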
You're not necessarily recovering the edited-out image data, just whatever happens to be at the end of the file. If you are unlucky (or lucky, depending on PoV) the trailing data still contains pixel data from the original image - in principle the risk is proportional to how close to the bottom right of the image the edits were (depending on image format).
Not saying it is. Sensible exif stripping (re-serialization) also has the upside of removing trailing data, which would prevent this.
EXIF stripping won't necessarily catch it (but probably would in most instances - depends on how you do it), but reformatting or reencoding will.
I’m guessing most exif stripping would deserialize the image and write a new file, so unless that has the same bug as this (overwriting the existing file without truncation), it ought to work?
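That intuition can be made concrete. A hedged sketch of the safe pattern (names are illustrative): do the stripping or re-encoding in memory, write the result to a fresh temp file, and atomically replace the original, so no stale tail can survive.

```python
# Safe rewrite pattern: fresh temp file + atomic replace (illustrative names).
import os
import tempfile

def rewrite_file(path: str, new_bytes: bytes) -> None:
    """Write new_bytes to a temp file, then atomically replace path."""
    directory = os.path.dirname(os.path.abspath(path))
    fd, tmp = tempfile.mkstemp(dir=directory)   # same dir so replace is atomic
    try:
        with os.fdopen(fd, "wb") as f:
            f.write(new_bytes)
        os.replace(tmp, path)   # old file fully replaced: no stale tail
    except BaseException:
        os.unlink(tmp)
        raise
```

Unlike opening the original with "w"-like modes, the replaced file can never contain leftover bytes from the old one.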
It'd be so interesting to collect aCropalypse-affected images. Maybe you could build a crop-suggester out of it...
Not that I'd want to maintain custody of such a dataset...
>Windows Snipping Tool is vulnerable to Acropalypse too.
…
>This also applies to the "Snip & Sketch" tool in Windows 10.
https://twitter.com/David3141593/status/1638222624084951040
If anyone wants to explore their PNG files, or files of any type for that matter, hachoir is a fantastic tool.
https://hachoir.readthedocs.io/en/stable/urwid.html
Why the hell is this exploit being fully provided for use via a handy-dandy web interface? An image /cleanup/ tool is one thing... this is very irresponsible.
I wonder if hiding the tool would help. Anyone interested could simply archive and hoard potentially interesting images until such a tool emerged later. So in reality, it would change nothing, only slightly delay the images being extracted.
The only thing I can think of that would have made a real difference is to send a tool to fix the images to all image hosting platforms in advance. But which ones do you trust?
I think making this tool readily available right now is going to result in a lot of people being doxxed who otherwise wouldn’t be.
Some people would just lose interest if there isn’t an easy tool immediately available, and also it would give potential victims or image hosts more time to fix or delete vulnerable pics.
That was my first thought when I clicked on the website link in the Twitter thread -- expecting a disclosure/high-level info page in the fashion of the last decade of big-deal exploits with cute names -- and found only a tool the tweet author (not OP, but apparently working with him?) built that runs in-browser, requires no knowledge/setup, and appears to enable recovery of cropped-out image data at scale by even non-technical users. Jeez.
Edit: I find myself wryly weighing this against the ongoing unleashing of LLMs upon the world. Both have shades of clever people prioritizing being and demonstrating clever at the cost of... other stuff. On the bright side, it is distracting me from facepalming at the underlying Pixel bug.
The bug is so simple (yet also damaging) that there isn't much high-level info to write up. Google Markup doesn't truncate the file properly before writing new data to it (due to a mixture of bad coding and a bad Android API change in Android 10).
All the tool seems to do is just read out whatever comes after the end of the PNG and then supply the missing data to construct an image that can be rendered.
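A toy sketch of that recovery idea (hedged: this is not the actual tool, and the real thing also works at bit granularity and rebuilds a valid PNG around the recovered data): brute-force raw-deflate decompression at each byte offset of the trailing bytes, since PNG pixel data is a deflate stream.

```python
# Toy recovery sketch: scan trailing bytes for decompressible deflate data.
import zlib

def scan_raw_deflate(buf: bytes) -> list[bytes]:
    """Try raw-deflate decompression at each byte offset; return all hits."""
    hits = []
    for off in range(len(buf)):
        d = zlib.decompressobj(wbits=-15)   # raw deflate, no zlib header
        try:
            out = d.decompress(buf[off:])
        except zlib.error:
            continue                        # not a valid stream start here
        if out:
            hits.append(out)
    return hits
```

On a real affected file, `buf` would be everything after the visible image's IEND chunk; the decompressed hits are candidate runs of the original image's scanline data.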
If you send me some extra information than you intend, nothing stops me from just looking at it.
Of course not -- but you still have to put in the effort to "just look at it". They set the bar on that effort extremely low, taking an exploit that required expertise to deploy, and put it in the hands of anyone who could operate a web form.
Google is irresponsible (present tense, and it always has been).
Everything after that is fair game.
I thought it was common knowledge that the proper way to redact things is to mask them physically, then re-take the photo/scan of the item.
then you get the printer tracking dots :)
Sadly I don't have a second phone that I can keep with me to photograph my phone with, so I'm stuck with cropping
Screenshot, crop... then screenshot again! :D
Randy definitely should make a cool panel-escape-vulnerability xkcd piece about this.
EXIF metadata is useful but we strip it when we post an image because it’s also a security vulnerability.
Image edit metadata also seems like an incredibly useful feature. Do we just strip it as well?
Since you read the article beforehand, you know that this comment is entirely orthogonal to the vulnerability in question.
I think it’s okay to talk about the core issue that leads to that. From the linked tweet it looks like there’s edit data stored in the image, allowing the original to be recovered?
Do you have a specific concern to warrant your comment?