Comment by pratikdeoghare
8 hours ago
There is one in golang regular expressions https://swtch.com/~rsc/regexp/regexp2.html
I guess that is why you say re.Compile.
8 hours ago
There is one in golang regular expressions https://swtch.com/~rsc/regexp/regexp2.html
I guess that is why you say re.Compile.
That goes back to Ken Thompson's NFA regex interpreter from 1968 [1], [2], [3]. Note: that whole regex series by Russ Cox [4] is great.
[1] https://dl.acm.org/doi/10.1145/363347.363387 -- Programming Techniques: Regular expression search algorithm
[2] https://swtch.com/~rsc/regexp/regexp1.html -- Regular Expression Matching Can Be Simple And Fast
[3] https://swtch.com/~rsc/regexp/regexp2.html -- Regular Expression Matching: the Virtual Machine Approach
[4] https://swtch.com/~rsc/regexp/ -- Implementing Regular Expressions
I second the Russ Cox recommendation. I read that ages ago and that was what made me realise some theory could actually be useful in practice.
All regular expressions are deterministic final automata https://en.wikipedia.org/wiki/Deterministic_finite_automaton (finally, a use for my CS course); the extent to which that counts as a virtual machine varies. Some of the regex syntaxes extend it in ways which don't fit in a DFA and do count as a VM; Perl-compatible RE used to be popular (e.g. in Exim).
It's easier to construct NFAs directly from regular expression definitions (rather than DFAs) because implementing the choice operator is easier. We can convert from NFA to DFA with worst-case exponential blowup.
Inded:
https://wiki.xxiivv.com/site/rewriting.html
Interesting. Not that surprising that it works like this. But isn't it a little surprising that things like regexes, printf syntax and other DSLs aren't mostly handled and parsed at compile time in 2026?
Kind of language-dependent since regexes are normally specified as strings and most languages are pretty weak at "run this code at compile time". One of the things Rust users are fond of.
C# is in the middle on this one, where specific features get compile-time support and regex is one of them: https://www.devleader.ca/2026/05/03/c-regex-performance-gene...
I have also built a C# source generator myself (XML parser generator), but the developer experience is a bit of a hill to climb compared to what it could be.