Comment by o11c
5 days ago
That fundamentally misunderstands the problem in multiple ways:
* this happens during lexing, before parsing has even started
* there are multiple valid token sequences that differ only by a single character at the start of the file. This is very common with Python multi-line strings in particular, since they are widely used as docstrings.
One could fold lexing into parsing and do error cost minimization across both.
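
To illustrate the second point, here is a minimal sketch using the standard library `tokenize` module. The file contents in `body` and the helper name `lex` are made up for this example. Prepending one `"` turns the leading empty string into the opener of a triple-quoted string, and both versions lex without error:

```python
import io
import tokenize

def lex(source):
    """Return (token name, text) pairs from the stdlib tokenizer, skipping layout tokens."""
    skip = {tokenize.NEWLINE, tokenize.NL, tokenize.ENDMARKER}
    tokens = tokenize.generate_tokens(io.StringIO(source).readline)
    return [(tokenize.tok_name[t.type], t.string) for t in tokens if t.type not in skip]

# Hypothetical file contents, chosen so that both variants are valid.
body = '""\nfoo = 1\n# """\nbar = 2\n'

without_quote = lex(body)      # "" is an empty string; foo = 1 is code; # """ is a comment
with_quote = lex('"' + body)   # """ ... """ swallows foo = 1 and the former comment

for label, toks in (("without", without_quote), ("with", with_quote)):
    print(label, toks)
```

In the first variant `foo = 1` is code and `# """` is a comment; in the second, both sit inside a single string token. The lexer alone has no way to tell which one the author meant.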