Comment by pratikdeoghare

8 hours ago

There is one in golang regular expressions https://swtch.com/~rsc/regexp/regexp2.html

I guess that is why you say re.Compile.

That goes back to Ken Thompson's NFA regex interpreter from 1968 [1], [2], [3]. Note: that whole regex series by Russ Cox [4] is great.

[1] https://dl.acm.org/doi/10.1145/363347.363387 -- Programming Techniques: Regular expression search algorithm

[2] https://swtch.com/~rsc/regexp/regexp1.html -- Regular Expression Matching Can Be Simple And Fast

[3] https://swtch.com/~rsc/regexp/regexp2.html -- Regular Expression Matching: the Virtual Machine Approach

[4] https://swtch.com/~rsc/regexp/ -- Implementing Regular Expressions

  • I second the Russ Cox recommendation. I read that ages ago and that was what made me realise some theory could actually be useful in practice.

All regular expressions are deterministic final automata https://en.wikipedia.org/wiki/Deterministic_finite_automaton (finally, a use for my CS course); the extent to which that counts as a virtual machine varies. Some of the regex syntaxes extend it in ways which don't fit in a DFA and do count as a VM; Perl-compatible RE used to be popular (e.g. in Exim).

Interesting. Not that surprising that it works like this. But isn't it a little surprising that things like regexes, printf syntax and other DSLs aren't mostly handled and parsed at compile time in 2026?

  • Kind of language-dependent since regexes are normally specified as strings and most languages are pretty weak at "run this code at compile time". One of the things Rust users are fond of.

    C# is in the middle on this one, where specific features get compile-time support and regex is one of them: https://www.devleader.ca/2026/05/03/c-regex-performance-gene...

    I have also built a C# source generator myself (XML parser generator), but the developer experience is a bit of a hill to climb compared to what it could be.