Comment by Locke1689
11 years ago
The C# compiler is written in C#, so you can use it anywhere with a .NET runtime. Microsoft doesn't ship a .NET runtime for Linux, so you'll have to use Mono, but if your app is compatible with Mono, you're good to go!
How does something like this happen? How can you write a compiler in the language that it is supposed to be compiling?
TLDR: You can't write the first compiler for a language in the language. However, you can write the second compiler in the language and use the first compiler to compile it.
If you already have a compiler in the language you're compiling, you can update the compiler with new features by the following process, called "bootstrapping". This process is used in gcc for example:
1. Use the old binary version to compile the new source version.
2. Use the binary produced in step 1 to compile the new source version.
3. Use the binary provided in step 2 to compile the new source version.
The results of stages 2 and 3 should be the same (assuming the old and new compilers assign the same semantics to the compiler source code and don't have non-determinism, e.g. using wall-clock time to determine when to stop searching for optimizations).
The bootstrap process can't be used on the first compiler for a brand-new language, because there is no "old compiler" that can be used in stage 1. The only way around this is to write the first compiler in a different language that already exists.
Of course, if a self-hosted compiler is your goal, you can afford letting the very first compiler (the one written in another language) be limited, "dirty," or "hacky". You don't have to implement the entire language in the first compiler; the first compiler just has to support enough of the new language to allow you to write the second compiler.
Anyway, once you have the first compiler, you write the first version of the second compiler in whatever subset of the language the first compiler supports. Once the first compiler builds the first version of the second compiler, the first compiler is no longer needed; you can add all new features to the second compiler only.
New versions of the second compiler (perhaps supporting more language features) can now be produced by bootstrapping. Of course, you can also use the second compiler to compile a totally different compiler implementation written in the new language (this would be the third compiler for the new language).
First you solve the chicken and egg problem by writing a bootstrap compiler for the language in some other language (e.g. a C# compiler in C or C++, or a C compiler in assembly language). Then you can compile programs in the new language, and compilers are programs, so you can write a C# compiler in C# and the binary so produced will be able to compile its own source.
Most compilers I know of start with an implementation in another language and eventually write a self-hosted one. Microsoft already had compilers for .NET, though, so you just compile it with those until it's self-hosting.
Bootstrapping a compiler from nothing is harder and involves a lot of incremental steps. http://en.wikipedia.org/wiki/Bootstrapping_(compilers)
Funny you should mention that! There was a really cool article about bootstrapping a compiler nearly from scratch just the other day[1]. But yeah, most people just start with C or C++ until they can compile enough of the language to write a compiler in it.
Some other interesting contemporary examples:
I think as we speak Go is working on its plan to switch from using a compiler written in Go[2].
Rust[3] has kind of a hybrid approach where the front-end of the compiler is written in rust, but generates llvm bytecode which is then compiled the rest of the way by llvm itself. It takes three "stages" for them to do a full compile, first a binary "snapshot" compiler is downloaded and used to compile the compiler (stage 0), then that generated compiler is used to compile the compiler again (stage 1), and then that generated compiler is used to compile the compiler again (stage 2). Stage 2 is actually just a test - its output should be identical to that of stage 1, and if it isn't, something went wrong.
There is a classic lecture from Ken Thompson[4] that I think is really illuminating on how this stuff works, if you're unfamiliar with it.
It's really fascinating stuff!
[1]: http://homepage.ntlworld.com/edmund.grimley-evans/bcompiler.... [2]: https://docs.google.com/document/d/1P3BLR31VA8cvLJLfMibSuTdw... [3]: http://www.rust-lang.org/ [4]: http://cm.bell-labs.com/who/ken/trust.html
They wrote the original C# compiler in a different language (probably C++) first.
Then they wrote the second C# compiler in C#, and used the first compiler to compile the second compiler. Voila!
We regularly dogfood, so we're on like N iterations of your structure, but that's the idea!
3 replies →
http://en.wikipedia.org/wiki/Bootstrapping_(compilers)
This is almost second-nature in Lisp:
http://c2.com/cgi/wiki?StructureAndInterpretationOfComputerP...
And tends to happen to other languages... you just have to bootstrap the first compiler.
Is Mono able to compile Roslyn?
I assume you mean the Mono C# compiler (the one that the Mono team develops).
Interestingly, no -- but neither can the native C# compiler (that's what we call the old C# compiler that everyone's using in VS right now)!
Why? The C# compiler is bootstrapped, so we actually compile the compiler with the compiler. It just so happens that a while ago we felt we needed a feature so badly for the C# compiler that we introduced it into the C# language (exception filters) and started using it in the compiler immediately. So in order to compile the compiler, you need a C# compiler that supports exception filters, which right now is just the Roslyn C# compiler. Honestly, the feature (and some other small changes we've made) aren't that complicated, so they could be backported to other compilers, but I don't know why you would want to do that.
Heh.
A few years ago I was on a small team working on a .NET compiler and runtime -- it compiled MSIL into native code targeting x86 and PowerPC. I remember when working on the part of the compiler that built the exception handling routines running into exception filters and wondering what they were for. We were largely targeting C# as a development environment at the time but ended up implementing them in case anyone wanted to use VB.NET. Funny to see that they're now making their way back into C#, even if it's just in the compiler :)
1 reply →
It should. Here's the Roslyn source, and it looks to be all in C# and VB: http://roslyn.codeplex.com/