Comment by Voranto

9 hours ago

Quick question: Isn't the NFA → DFA conversion an O(2^n) algorithm? If a JSON file has a couple hundred values, its equivalent NFA will have a similar number of states, and the resulting DFA could have on the order of 2^100 states, so I must be missing something.

Theory is one thing, but the CPU cache is the real bottleneck here. Here is a small visual breakdown of how these arrays look in memory and why pointer chasing is so expensive compared to the actual logic: https://vectree.io/c/json-array-memory-indexing

Basically, the double jump to find values on the heap is what slows these tools down most.
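To make the "double jump" concrete, here is a toy Python sketch of the two layouts being contrasted (the names `Box`, `boxed`, and `flat` are mine, not from the linked page, and Python boxes numbers internally anyway, so this is only illustrative of the access pattern, not a faithful benchmark):

```python
import array

class Box:
    __slots__ = ("value",)  # each Box is a separate heap allocation
    def __init__(self, value):
        self.value = value

values = range(1000)

# Pointer-chasing layout: the list holds pointers to Box objects,
# and each Box holds the number -- two dereferences per element,
# with the Boxes potentially scattered across the heap.
boxed = [Box(float(v)) for v in values]

# Contiguous layout: one flat buffer of doubles; reading element i
# is pure index arithmetic into a single cache-friendly allocation.
flat = array.array("d", (float(v) for v in values))

# Both layouts hold the same data; only the memory access pattern differs.
assert sum(b.value for b in boxed) == sum(flat)
```

The data is identical either way; the point is that the second layout turns "follow a pointer, then follow another pointer" into "add an offset", which is what the cache rewards.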

  • I can see that in practice the bottleneck isn't the automata construction; I'm just curious how the construction is approached given the exponential worst case of the conversion algorithm.
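For what it's worth, the usual answer is that subset construction is done lazily: you only materialize DFA states (subsets of NFA states) that are actually reachable from the start set, so 2^n is the worst case, not the typical case. A minimal sketch (epsilon moves omitted for simplicity; the NFA encoding here is my own, not from any particular tool):

```python
from collections import deque

def nfa_to_dfa(nfa, start, alphabet):
    """Lazy subset construction.

    `nfa` maps (state, symbol) -> set of next states. Only subsets
    reachable from the start set are ever created, so the number of
    DFA states tracks the input's actual structure, not 2^n.
    """
    start_set = frozenset([start])
    dfa = {}                # (frozenset of NFA states, symbol) -> frozenset
    seen = {start_set}      # DFA states materialized so far
    queue = deque([start_set])
    while queue:
        current = queue.popleft()
        for sym in alphabet:
            nxt = frozenset(s for q in current
                              for s in nfa.get((q, sym), ()))
            dfa[(current, sym)] = nxt
            if nxt not in seen:
                seen.add(nxt)
                queue.append(nxt)
    return dfa, seen

# Toy NFA over {a, b}: state 0 loops on both symbols and also
# nondeterministically "guesses" an a into state 1.
demo_nfa = {(0, "a"): {0, 1}, (0, "b"): {0}}
_, reachable = nfa_to_dfa(demo_nfa, 0, "ab")
# Only 2 of the 4 possible subsets of {0, 1} are ever constructed.
```

Regex engines like RE2 push this further and build DFA states on the fly while matching, with a bounded cache, so the exponential blowup never has to be paid up front.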