Comment by onei
6 years ago
Isn't that pretty standard for a parser? When was the last time a compiler bailed on the very first error it hit and refused to do anything else?
The solution is to pick synchronisation points at which to start parsing again, e.g. ; at the end of a statement or } at the end of a block.
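To make the synchronisation idea concrete, here is a minimal sketch of "panic mode" recovery in Python. The toy grammar (`NAME = NUMBER ;`, tokens already split on whitespace) and the function name `parse_statements` are assumptions for illustration only; no real compiler is structured exactly like this.

```python
def parse_statements(tokens):
    """Parse `NAME = NUMBER ;` statements, resynchronising at ';' on error.

    Returns (statements, errors) so that one bad statement does not hide
    the statements, or the errors, that come after it.
    """
    statements, errors = [], []
    i = 0
    while i < len(tokens):
        start = i
        try:
            name = tokens[i]
            if not name.isidentifier():
                raise SyntaxError(f"expected a name, got {name!r}")
            if i + 1 >= len(tokens) or tokens[i + 1] != "=":
                raise SyntaxError("expected '='")
            if i + 2 >= len(tokens) or not tokens[i + 2].isdigit():
                raise SyntaxError("expected a number")
            if i + 3 >= len(tokens) or tokens[i + 3] != ";":
                raise SyntaxError("expected ';'")
            statements.append((name, int(tokens[i + 2])))
            i += 4
        except SyntaxError as exc:
            errors.append((start, str(exc)))
            # Panic mode: discard tokens up to and including the next ';',
            # then resume parsing as if a fresh statement starts there.
            while i < len(tokens) and tokens[i] != ";":
                i += 1
            i += 1
    return statements, errors


statements, errors = parse_statements("a = 1 ; b = = 2 ; c = 3 ;".split())
print(statements)
print(errors)
```

The broken middle statement is reported once and skipped, and `c = 3 ;` still parses. The cost is that everything between the error and the next `;` is thrown away, which is why the choice of sync points matters.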
> When was the last time a compiler bailed on the very first error it hit and refused to do anything else?
Make still does this. (That's the "Stop." in the famous "*** missing separator. Stop.") Many errors in Python still do this.
As late as 2010 I still saw some major C compilers do this.
99% of the toy compilers written for DSLs do this, or worse.
Good error recovery / line blaming is still an active field of development.
> Good error recovery / line blaming is still an active field of development.
True. But let's get the terminology straight: that's not compiler science, that's parsing science. And it's no more compiler science than parsing a natural language is.
What terminology are you talking about? Neither "compiler science" nor "parsing science" is a term I used, or one that industry or academia uses.
Parsing (formal theory like taxonomies of grammars, and practical concerns like speed and error recovery) remains a core part of compiler design both inside and outside of academia.
How can you be sure that that } is the end of a certain defined block? Most importantly, this affects scoping, and in many cases it's ambiguous. IDEs do have rich metadata besides the source code, but then parsers would need to be aware of it.
You're ignoring the ; which are sync points.
> How can you be sure that that } is the end of a certain defined block
If it's not in a string, what else is it but a typo? If a typo, it fails to parse but so long as it doesn't crash, fine.
Maybe my wording is not accurate. Imagine the following (not necessarily idiomatic) C code:
This code doesn't compile, so the IDE tries to produce a partial AST. A naive approach will match the first } with the second {, so `x += 42;` will cause a type error. But as is noticeable from the indentation, it is more plausible that a } matching the second { was, or will be, at the caret position, and that `x += 42;` refers to the outer scope.
Yes, of course parsers can account for the indentation in this case. But more generally this kind of parsing is sensitive to the sequence of edits, not just the current code. This makes incremental parsing a very different problem from ordinary parsing, and is also likely why ibains and folks use packrat parsing (which can easily be made incremental).
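A rough sketch of the difference, in Python. The C snippet in `SNIPPET` is a hypothetical example of the situation described, and `block_depth` is a made-up helper; the sketch only understands a { at the end of a line and a } at the start of one, so it is a toy and not how any particular IDE does it.

```python
# Hypothetical C input: the inner block's closing brace has not been typed yet.
SNIPPET = """\
void f(void) {
    int x = 0;
    {
        struct point x;
    x += 42;
}
"""


def block_depth(source, needle, use_indentation):
    """Return how many blocks enclose the first line containing `needle`.

    The stack holds the indentation of each line that opened a block.
    With use_indentation=True, a line indented no deeper than the opener of
    the innermost open block is taken as evidence that the block's closing
    brace is missing, and the block is closed virtually.
    """
    open_blocks = []
    for line in source.splitlines():
        stripped = line.strip()
        if not stripped:
            continue
        indent = len(line) - len(line.lstrip())
        if use_indentation and not stripped.startswith("}"):
            while open_blocks and indent <= open_blocks[-1]:
                open_blocks.pop()  # virtual '}' inferred from the layout
        if stripped.startswith("}") and open_blocks:
            open_blocks.pop()  # real '}' closes the innermost open block
        if needle in stripped:
            return len(open_blocks)
        if stripped.endswith("{"):
            open_blocks.append(indent)
    return None


print(block_depth(SNIPPET, "x += 42;", use_indentation=False))
print(block_depth(SNIPPET, "x += 42;", use_indentation=True))
```

Pure brace matching puts `x += 42;` two blocks deep, where `x` is the struct; the indentation heuristic puts it one block deep, where `x` is the int.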