Comment by riskable
2 months ago
Oh this is just the tip of the iceberg when it comes to abusing Unicode! You can use a similar technique to this to overflow the buffer on loads of systems that accept Unicode strings. Normally it just produces an error and/or a crash but sometimes you get lucky and it'll do all sorts of fun things! :)
I remember doing penetration testing waaaaaay back in the day (before Python 3 existed) and using mere diacritics to turn a single character into many bytes that would then overflow the buffer of a back-end web server. This only ever caused it to crash (and usually auto-restart) but I could definitely see how this could be used to exploit certain systems/software with enough fiddling.
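For anyone wondering how "mere diacritics" blow up the size, here is a rough sketch, not the original test setup. It stacks combining acute accents (U+0301) on a single letter and compares what a user sees with what the server receives. The helper names (stackMarks, byteLengthOf, graphemeCount) are made up for illustration, and it assumes a runtime with TextEncoder and Intl.Segmenter (recent browsers or Node).

```javascript
// Hypothetical helper: put `count` combining acute accents on one base letter.
function stackMarks(base, count) {
  return base + "\u0301".repeat(count);
}

// Number of bytes the string occupies once encoded as UTF-8.
function byteLengthOf(text) {
  return new TextEncoder().encode(text).length;
}

// Number of user-perceived characters (extended grapheme clusters).
function graphemeCount(text) {
  const segmenter = new Intl.Segmenter("en", { granularity: "grapheme" });
  return [...segmenter.segment(text)].length;
}

const payload = stackMarks("e", 50);
console.log(graphemeCount(payload)); // 1   (one visible "character")
console.log(payload.length);         // 51  (UTF-16 code units)
console.log(byteLengthOf(payload));  // 101 (1 byte for "e" + 50 * 2 bytes)
```

A field that limits input by visible characters but stores it in a fixed-size byte buffer is the kind of mismatch being described.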
This was the premise of a Google CTF quals 2024 challenge ("encrypted runner").
Yeah. Zalgo text is a common test for input fields on websites. But it usually doesn't do anything interesting. Maybe an exception trigger on some database length limit. Doesn't typically even kill any processes. The exception is normally just in your thread. You can often trigger it just by disabling JS on even modern forms, but at best you're maybe leaking a bit of info if they left debug on and print the stack trace or a query. Another common slip-up is failing to count \n vs \r\n in text strings, since JS usually counts a line break as 1 byte (\n), but the submitted form data uses \r\n, which is two.
unescape(encodeURIComponent("ç")).length is the quick and dirty way to do a JS byte length check. The \r\n thing can be done just by cleaning up the string before length counting.
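A minimal sketch of both ideas, assuming TextEncoder is available (modern browsers and Node); the function names are hypothetical. TextEncoder gives the same UTF-8 byte count as the unescape trick without relying on the deprecated unescape, and the "cleanup" is taken here to mean normalizing every line break to \r\n before counting.

```javascript
// UTF-8 byte length of a string.
function utf8ByteLength(text) {
  return new TextEncoder().encode(text).length;
}

// Convert every line break (\r\n, lone \r, lone \n) to \r\n,
// matching what a submitted form field typically contains.
function normalizeToCrlf(text) {
  return text.replace(/\r\n|\r|\n/g, "\r\n");
}

// Byte length as the server is likely to see it.
function submittedByteLength(text) {
  return utf8ByteLength(normalizeToCrlf(text));
}

console.log(utf8ByteLength("ç"));         // 2
console.log("a\nb".length);               // 3
console.log(submittedByteLength("a\nb")); // 4
```

Checking submittedByteLength(value) against the database column limit on the client keeps the two counts in agreement, though the server still has to enforce the limit itself.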
Does Zalgo even work on HN? I've never thought of using it to test my systems, thank you. I've got some new testing to do tonight.
Edit: No, Zalgo doesn't work on HN. This comment itself was an experiment to try.
A few months ago I made a post which I (should've) named "Unicode codepoints that expand or contract when case is changed in UTF-8". A decent parser shouldn't have any issues with things like this, but software that makes bad Unicode assumptions might.
https://news.ycombinator.com/item?id=42014045
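A quick way to see the effect for yourself, as a sketch and not the method from the linked post: encode a character before and after case conversion and compare the UTF-8 byte counts. The helper names are hypothetical, and the sample characters are just well-known cases from Unicode's case mappings.

```javascript
// UTF-8 byte length of a string.
function bytesOf(text) {
  return new TextEncoder().encode(text).length;
}

// Byte sizes of a string as-is, upper-cased, and lower-cased.
function caseSizes(text) {
  return {
    original: bytesOf(text),
    upper: bytesOf(text.toUpperCase()),
    lower: bytesOf(text.toLowerCase()),
  };
}

// Kelvin sign (U+212A) lower-cases to plain ASCII "k": 3 bytes shrink to 1.
console.log(caseSizes("\u212A"));
// Capital I with dot above (U+0130) lower-cases to "i" + U+0307: 2 bytes grow to 3.
console.log(caseSizes("\u0130"));
// The "fi" ligature (U+FB01) upper-cases to "FI": 3 bytes shrink to 2.
console.log(caseSizes("\uFB01"));
```

Code that sizes a buffer from the input and then changes case in place is what these characters would trip up.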
Sorry n00b here, can you explain more about this or how you did this? I feel like this is definitely a loophole that would be worth testing for.