Comment by ratmice
6 years ago
The assertion "all compilers were pure functions" is a strange one, because it is almost entirely backwards.
The purity of compilers was abandoned almost immediately: when they started creating a file a.out and writing to that instead of writing binaries to stdout, in the C preprocessor when #include was added, and in the assembler with the .incbin directive. If compilers were pure, there would be zero need for Makefile-style build systems which stat files to see if they have changed.
While it is true that Makefiles and their ilk are modeled as a DAG, the only reason an external file/DAG is actually necessary is impurity in the compilation process.
There have been very few compilers which have even had a relatively pure core (TeX is the only one that I can actually think of). Language servers are, if anything, moving them to a more pure model, simply because they send sources through some file descriptor rather than having to construct some graph out of filenames.
Long story short, "purity" in the sense of a compiler is a function from source text -> binary text; "foo.c" is not source text, and a bunch of errors is not binary text.
At least language servers take in source text as input.
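To make the distinction concrete, here is a minimal Python sketch. The "compiler" is a toy (it just upper-cases its input), and the names compile_pure, compile_by_name and demo are made up for illustration, not any real tool's API. The first function maps source text to binary text; the second takes a filename, so the same argument can yield different results, and its output only exists as a side effect.

```python
import os
import tempfile


def compile_pure(source_text: str) -> bytes:
    """Toy compiler: source text in, binary text out, no I/O of any kind."""
    return source_text.upper().encode("utf-8")


def compile_by_name(path: str) -> None:
    """Toy driver in the style of `cc foo.c`: reads whatever the file holds
    right now and writes a.out next to it as a side effect."""
    with open(path, encoding="utf-8") as src:
        source_text = src.read()
    out_path = os.path.join(os.path.dirname(path), "a.out")
    with open(out_path, "wb") as out:
        out.write(compile_pure(source_text))


def demo() -> tuple:
    """Call compile_by_name twice with the same argument and collect a.out."""
    with tempfile.TemporaryDirectory() as tmp:
        path = os.path.join(tmp, "foo.c")
        results = []
        for text in ("int x;", "int y;"):
            with open(path, "w", encoding="utf-8") as f:
                f.write(text)
            compile_by_name(path)  # identical argument both times
            with open(os.path.join(tmp, "a.out"), "rb") as f:
                results.append(f.read())
    return tuple(results)
```

The two calls in demo() receive the same path yet leave different bytes in a.out, which is why something outside the compiler has to stat the file to know whether the result is still valid.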
> the purity of compilers was abandoned almost immediately (when they started creating a file a.out and writing to that instead of writing binaries to stdout
I don't understand your point. A function doesn't cease to be a function if it sends its output somewhere else.
> and in the c-preprocessor when #include was added,
The C preprocessor is not the compiler. It's a macro processor that expands all macros to generate the translation unit, which is the input that compilers use to generate their output.
And that is clearly a function.
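A rough Python sketch of that view, under a loud assumption: the header contents are passed in explicitly as a dictionary instead of being looked up on disk. The function name preprocess and the headers parameter are hypothetical, it only handles quoted #include lines (no macro expansion), and it has no guard against include cycles.

```python
import re
from typing import Dict

# Matches lines of the form: #include "name"
INCLUDE = re.compile(r'^\s*#include\s+"([^"]+)"\s*$')


def preprocess(source: str, headers: Dict[str, str]) -> str:
    """Build a translation unit by splicing in header text.

    The result depends only on the two arguments, so this is a function
    of (source, headers), not of a filename.
    """
    out = []
    for line in source.splitlines():
        match = INCLUDE.match(line)
        if match:
            # Recursively expand the included header's own includes.
            out.append(preprocess(headers[match.group(1)], headers))
        else:
            out.append(line)
    return "\n".join(out)
```

Whether this counts as pure depends on where headers comes from: given as a value it is, but resolved from the filesystem at call time the header contents become hidden input.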
> If compilers were pure, there would be zero need for Makefile style build systems which stat files to see if they have changed.
That assertion makes no sense at all. Compilers take source code as input and output binaries. That's it. The feature you're mentioning is just a convenient trick to cut down build times by not recompiling source files that haven't changed. That's not the responsibility of the compiler. That's a function whose input is the source files' attributes and whose output is a DAG of files, which is used to run a workflow where in each step a compiler is invoked to take a specific source file as input in order to generate a binary.
It's functions all the way down, but the compiler is just a layer in the middle.
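As a minimal sketch of that layer, assuming the modification times have already been collected (for example via stat) into a plain dictionary: the staleness check is then a function from file attributes to a rebuild decision. The names needs_rebuild, rules and mtimes are illustrative, and real make handles many more cases (phony targets, missing sources, pattern rules).

```python
from typing import Dict, List


def needs_rebuild(target: str,
                  rules: Dict[str, List[str]],
                  mtimes: Dict[str, float]) -> bool:
    """Decide whether `target` is out of date.

    rules maps each target to its dependencies; mtimes maps existing files
    to modification times. Source files (no rule) are assumed to exist.
    """
    deps = rules.get(target, [])
    if not deps:
        return False  # a leaf source file is never rebuilt
    if target not in mtimes:
        return True   # the target does not exist yet
    # Stale if any dependency is itself stale or newer than the target.
    return any(
        needs_rebuild(dep, rules, mtimes) or mtimes[dep] > mtimes[target]
        for dep in deps
    )
```

With rules for app, foo.o and bar.o, touching only foo.c marks foo.o and app as stale while bar.o is left alone; the compiler is invoked only for the stale nodes.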
> while Makefiles and their ilk are modeled as a dag is true, The only reason an external file/dag is actually necessary is due to impurity in the compilation process.
You have it entirely backwards: build systems exist because compilers are pure functions with specific and isolated responsibilities. Compilers take source code as input and generate binaries as output. That's it. And they are just a component in the whole build system, which is comprised of multiple tools that are designed as pure functions as well.
> I don't understand your point. A function doesn't cease to be a function if it sends its output somewhere else.
I think here lies the miscommunication. I'm talking about pure functions: it doesn't cease to be a function, but it does cease to be a pure one if sending its output somewhere else is done by side effect.
I guess there is pure and pure. Pure in the sense of no side-effects at all, such as for example writing to a file, and pure in the sense of not relying on state.
> I think here lies the miscommunication, I'm talking about pure functions, it doesn't cease to be a function, but it does cease to be a pure one if sending its output somewhere else is done by side-effect.
No, you're getting it entirely wrong. The input is the translation unit, the output is the binaries. If you understand what a compiler does and what its role is in a build process, then you'll understand it works as a pure function. There are no side effects. Translation units in, binaries out.
I suggest you spend some time with a tiny C or C++ project trying to arrive at an executable by performing all build steps by hand instead of using a Makefile or any form of high-level build system.
> The only reason an external file/dag is actually necessary is due to impurity in the compilation process.
But files also make the various pieces of the compiler chain interoperable and allow me to define a DAG. That's exactly why make is so powerful, and that's exactly what I'd hate to lose.
Modern compilers do a lot, and understandably they try to avoid writing partially calculated state out to disk (e.g. serializing an AST to disk between stages would be doing work twice). But moving everything into the process means your compiler becomes a walled garden.
You can see this happening in the JavaScript world. Very few people actually know what WebPack does. It's a giant black box with an infinite number of switches, and everything is "magic".
I totally agree with this and think it's a valid concern. It's difficult to get the best of both worlds here within the confines of the Unix process model.
The query style compiler isn't by necessity a single process. You could imagine an implementation based on the actor model or almost any other message passing system where the queries are driven externally. That the expedient way to do this is by stuffing everything into a giant process is regrettable.
> The only reason an external file/dag is actually necessary is due to impurity in the compilation process.
No compilation process can know about the parts of your project written in another language.
I didn't do a good job of explaining it. The thing is that if your compiler is truly pure, your source input is a node in a graph, the compiler is an edge, and the output is another node. Given two languages with pure compilers, the entire compilation process inherently admits a DAG, rather than your having to reconstruct the DAG in order to drive the compilation process.
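A toy Python sketch of what I mean, with made-up stand-ins for two compilers and a linker (compile_lang_a, compile_lang_b, link and build are hypothetical names, and the "compilation" is just tagging bytes). Because every edge is a pure function of its input node, the DAG is simply the call expression, and memoizing on the input value is safe without ever looking at a filename or timestamp.

```python
from functools import lru_cache


@lru_cache(maxsize=None)
def compile_lang_a(source: str) -> bytes:
    """Edge for language A: source node -> object node."""
    return b"A:" + source.encode("utf-8")


@lru_cache(maxsize=None)
def compile_lang_b(source: str) -> bytes:
    """Edge for language B: source node -> object node."""
    return b"B:" + source.encode("utf-8")


@lru_cache(maxsize=None)
def link(obj_a: bytes, obj_b: bytes) -> bytes:
    """Edge joining two object nodes into the final binary node."""
    return obj_a + b"|" + obj_b


def build(src_a: str, src_b: str) -> bytes:
    """The whole DAG, written as ordinary function composition."""
    return link(compile_lang_a(src_a), compile_lang_b(src_b))
```

Calling build twice and changing only the second source recomputes the B edge and the link, while the A edge is answered from the cache (compile_lang_a.cache_info() reports the hit). This says nothing about how a real toolchain would get there; it only illustrates the claim that purity makes the graph implicit.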
> I didn't do a good job of explaining it, the thing is that if your compiler is truly pure, your source input is a node in a graph, the compiler is an edge, and the output is another node.
You've just described the role of a compiler in a build system.
> Given 2 languages with pure compilers the entire compilation process inherently admits a DAG
That's what you're getting wrong. Building is not the same thing as compiling. A software build is a workflow comprised of one or more jobs, where compiling is one of the many types of jobs that are performed. In fact, the process of generating a library or executable from source code is a multi-step process where compiling is only one of the many steps. You start with source code, which may be processed by a macro processor to generate the source code to be compiled, then the compiler generates binaries from that source code, which are then handed to the linker, and so on.