← Back to context

Comment by xonix

3 years ago

This reminded me the idea of compilers bootstrapping (https://bellard.org/tcc/), and then with TCC you can go forward to GCC and so on.

Did you read about the guix full source bootstrap the other day? They've shrunk the bootstrap seed down to a 357-byte program:

https://guix.gnu.org/blog/2023/the-full-source-bootstrap-bui...

  • The bootstrap seed, https://github.com/oriansj/bootstrap-seeds/blob/master/POSIX..., is a tiny interpreter that takes a much larger program written in a special-purpose, bytecode-based language. This proceeds in turn once or twice more--special purpose program generating another interpreter for another special-purpose language--until you end up with a minimal Scheme interpreter, which then can be used to execute a C compiler program.

    All of this is incredible work, but a minimal C-subset compiler in under 512 bytes of x86 assembly seems like a unique achievement as it includes non-trivial parsing and linking phases not required for that hex0 interpreter.

  • From that article:

    > Even more recently (2018), the GNU C Library glibc-2.28 adds Python as a build requirement

    I’m surprised they went with Python and not GNU software (e.g. Guile).

    Edit: clicking through the link it sounds like this might be intended to replace other accumulated dependencies (Perl?) and stop supporting old versions of Python.

  • Is Nix incorporating this work as well? I fully support and want Guix to survive, but it seems well behind in mind share, and I would love if Nix could become similarly repeatable and reproducible.

I’ve considered that. I’ve long been interested in bootstrapping. Lots of unpublished projects in the archive that I should also write up.

I considered writing a self-hosting compiler. It should be doable without too much work. I was going to do one in BarelyC originally.

SectorC -> TCC -> GCC sounds fun. C all the way down, haha

Perhaps my favorite thing in all of computing is the authors of Lisp bootstrapping by writing an (academic) interpreter for Lisp in Lisp, and then realizing they could just compile that by hand.

  • Paper?

    I once read a lisp paper that kept defining lisp primitives in terms of simpler lisp primitives until they had only 4-10 primitives. Then the paper started defining those in terms of assembly language (but with parens). In the final step, they just removed the parens. It was so elegant. I can't find the paper though :( Any ideas?

    • The paper is the original paper on LISP by McCarthy "Recursive Functions of Symbolic Expressions and Their Computation by Machine, Part I". It has the original definition of LISP in LISP.

The TCC step seems unnecessary. If you've got a SectorC C compiler sufficient to compile TCC, it can probably compile the bootstrap compiler used in GCC's 3 stage build process. The bigger issue is probably getting the ancillary tools up and running (a shell, make, coreutils, ...)

https://gcc.gnu.org/install/build.html

  • The method described in the page you reference, does not exclude the possibility that the compiler you start with, contains malicious code, and recognizes that it is compiling the GCC code base. It is not true bootstrapping as described in [0]. If you read through [1] you will see that even getting a fairly recent version of TCC to compile is not that simple.

    [0] https://bootstrappable.org/ [1] https://bootstrapping.miraheze.org/wiki/Live-bootstrap

    • At that level of a paranoia, there's no guarantee that the CPU doesn't contain a malicious method for detecting and modifying compiler builds, so you better build your own CPU out of TTL chips ;)