Comment by IshKebab

4 years ago

"What if we could make entire programs as unreadable as regexes?" -K

And yet regular expressions are a part of most programming languages as a library or built-in syntax.

Why is that?

  • Probably because they're a great notation for their problem area, which for regexes is concisely describing text patterns.

    For example 'a*b' is any number of 'a's followed by 'b'.

    How else would you concisely state that?

    • "How else would you concisely state that?"

      Presumably people who hate array languages think all 3 character regexes should instead be big nested loops, so they are "readable".
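For what it's worth, here is a rough Python sketch of the comparison being argued about: the regex 'a*b' next to a hand-written loop for the same pattern. The function names are made up for illustration, and whole-string matching is an assumption (the thread doesn't say whether a search or a full match is meant).

```python
import re


def matches_regex(s: str) -> bool:
    # Whole-string match for: any number of 'a's followed by one 'b'.
    return re.fullmatch(r"a*b", s) is not None


def matches_loop(s: str) -> bool:
    # The same check written out by hand.
    i = 0
    # Skip over the leading run of 'a's.
    while i < len(s) and s[i] == "a":
        i += 1
    # Exactly one character must remain, and it must be 'b'.
    return i == len(s) - 1 and s[i] == "b"


if __name__ == "__main__":
    for sample in ["b", "aab", "", "aa", "abb", "ba"]:
        print(repr(sample), matches_regex(sample), matches_loop(sample))
```

The loop is not hard to write for a pattern this small, but the end-of-string bookkeeping is exactly the part that is easy to get subtly wrong.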

    • Regexes in the Unix tradition are as much a user interface as a programming language. There's no sharp distinction, but it's almost trite to observe that regexes per se shine for ad hoc string searching and show their weaknesses once they become parts of programs.

      When writing a program, I prefer to use a PEG, giving the less compact notation `'a'* 'b'` but also letting me say `'a'* b` and define b as its own rule, including recursion for the useful cases. It helps that it's more powerful, being little more than a formalization of the post-regular strategies used in Perl-style 'regular' expressions while embracing recursion.

      For '/' in vim, grep, wherever? Yeah regex is fine, that's what it was designed for.
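To make the PEG point concrete, here is a minimal sketch in Python of the `'a'* b` idea with b as its own, recursive rule. Everything in it is an assumption for illustration: the combinator names (lit, star, seq, choice) are hypothetical rather than taken from any particular PEG library, and the grammar chosen for b (either a literal 'b' or a parenthesised b) is invented to show recursion that a true regular expression cannot express.

```python
# Each rule is a function (text, pos) -> new position on success, or None on failure.


def lit(s):
    # Match the literal string s at the current position.
    def parse(text, pos):
        return pos + len(s) if text.startswith(s, pos) else None
    return parse


def star(rule):
    # Zero or more repetitions, greedy and without backtracking (PEG semantics).
    def parse(text, pos):
        while True:
            nxt = rule(text, pos)
            if nxt is None or nxt == pos:
                return pos
            pos = nxt
    return parse


def seq(*rules):
    # Every rule must match, one after another.
    def parse(text, pos):
        for rule in rules:
            pos = rule(text, pos)
            if pos is None:
                return None
        return pos
    return parse


def choice(*rules):
    # Ordered choice: the first alternative that matches wins.
    def parse(text, pos):
        for rule in rules:
            nxt = rule(text, pos)
            if nxt is not None:
                return nxt
        return None
    return parse


def b(text, pos):
    # Hypothetical recursive rule:  b <- '(' b ')' / 'b'
    return choice(seq(lit("("), b, lit(")")), lit("b"))(text, pos)


# start <- 'a'* b
start = seq(star(lit("a")), b)


def full_match(text):
    # True only if the start rule consumes the whole input.
    return start(text, 0) == len(text)
```

With these definitions, full_match("aab") and full_match("aa((b))") succeed, while full_match("aa(b") fails because the parenthesis is never closed.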

    • I can't remember the names, but I've seen at least two alternative syntaxes recently that are a lot more readable. At least one of them fixed regex's problem of mixing control characters in-band with data. So your example would be something like:

          "a"* "b"
      

      Much more readable and less error-prone.

  • They're very quick to write, and in appropriate cases the equivalent logic would be quite tedious to implement by hand (state machine stuff). They are still massively overused, though. Using a regex at all is a huge red flag. Sometimes they are appropriate, but in my experience not in 90% of cases.

    Anyway I'm not sure the same is true for K. At least for the given example the for loop was not exactly difficult to write.