Comment by stevens32
6 years ago
For the regex novices here, would anyone mind explaining what that pattern is meant to match? More specifically, what `.(?:.=.*)` is meant to do?
It's meant to match any number of any characters, then match an equal sign, then match any number of any characters. But it's very badly written. It should instead simply be written
BTW, your comment got mangled by HN's markdown formatting.
Gotcha - I thought I was just missing the point as to why it wasn't simpler since it looked to have been structured that way intentionally.
I could tell you but I also want to tell you to plug that bad boy into https://regex101.com/ . It will give you a written explanation of the regexp on the right. And now, would you have believed I knew without peeking? ;)
That is a super handy site, thanks!
The bottom of the post elaborates it.
They said it was for XSS detection. I think the purpose was to identify reflected XSS by looking for paths or headers containing JavaScript-esque variable assignment (JS keywords/syntax preceding "something=something"), but not 100% sure.
The appendix does a pretty good job, but the TL;DR is: it will match any string containing 1 or more equal signs.
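To make the TL;DR concrete, here is a quick sketch in Python. It assumes the un-mangled pattern is `.*(?:.*=.*)`, which is my reconstruction from the descriptions in this thread rather than something quoted from the post; the helper name `matches` is made up for illustration.

```python
import re

# Assumed reconstruction of the pattern discussed in the thread
# (the asterisks were eaten by comment formatting).
PATTERN = re.compile(r".*(?:.*=.*)")

def matches(text: str) -> bool:
    """Return True if the assumed pattern matches anywhere in text."""
    return PATTERN.search(text) is not None

# The regex agrees with a plain substring check on these samples.
for sample in ["x=1", "var a = b", "no equals here", "", "="]:
    assert matches(sample) == ("=" in sample)
```

In other words, under that assumption the whole pattern does the same job as checking whether the string contains `=`.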
Slightly more verbosely, it will match [0-or-more bytes of anything] followed by [0-or-more bytes of anything] followed by [an equal sign] followed by [0-or-more bytes of anything]. The expensive part is that it can't decide where the first grouping of [0-or-more bytes of anything] ends and the second grouping begins. It doesn't matter where the division is, of course, but many regex engines use a backtracking algorithm that is exponential-time in the worst case, even though an obvious linear-time algorithm exists (and pre-dates the backtracking one!).
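If it helps, here is a rough toy model of that "can't decide where to split" cost. It is not how any real engine is implemented: `backtracking_steps` is a hypothetical helper that tries every split between the two leading groups for `.*.*=.*`, anchored at the start of the string and ignoring newline handling, and counts the attempts. `linear_scan` is the obvious alternative.

```python
def backtracking_steps(text: str) -> tuple[bool, int]:
    """Toy model: try every (first group, second group) split, greedy first.

    Returns (matched, number_of_split_attempts).
    """
    steps = 0
    n = len(text)
    for i in range(n, -1, -1):          # first group consumes text[:i]
        for j in range(n, i - 1, -1):   # second group consumes text[i:j]
            steps += 1
            if j < n and text[j] == "=":  # next character must be '='
                return True, steps
    return False, steps

def linear_scan(text: str) -> bool:
    """Single left-to-right pass: is there an '=' anywhere?"""
    return "=" in text

for n in (10, 20, 40):
    matched, steps = backtracking_steps("x" * n)
    print(n, matched, steps)
```

For an input with no `=`, this toy model tries (n+1)(n+2)/2 splits, so doubling the input roughly quadruples the work, while the linear scan only looks at each character once. Patterns with nested repetition can blow up much faster than this one in a backtracking engine.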