Comment by nine_k

9 months ago

> To avoid using an existing toolchain, we need some way to be able to compile a GCC version without C. We can use a less well-featured compiler, TCC, to do this. And so forth, until we get to a fairly primitive C compiler written in assembly, cc_x86

I imagined going a slightly different route.

A minimal Forth can be written in assembly and in itself. That is enough to write a console over a serial port, a primitive FAT filesystem to access SPI flash, and maybe even an interface to USB mass storage.

Forth is not very easy to audit, but likely still easier than raw assembly.
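
To give a sense of how small the core of such a system can be, here is a minimal sketch of a Forth-like evaluator. It is written in Python purely for illustration; a real bootstrap would implement the primitives in assembly. The names `make_forth`, `prims`, and `colon_defs` are hypothetical and are not taken from stage0 or any existing Forth. It assumes whitespace-separated tokens and integer literals only, with no control flow, memory access, or error handling.

```python
# Hypothetical illustration only: a tiny Forth-like evaluator.
# Not taken from stage0, Dusk OS, or any real Forth implementation.

def make_forth():
    stack = []

    # Primitive words. In a real bootstrap these would be the small
    # hand-written assembly core that has to be audited directly.
    prims = {
        "+": lambda: stack.append(stack.pop() + stack.pop()),
        "*": lambda: stack.append(stack.pop() * stack.pop()),
        "dup": lambda: stack.append(stack[-1]),
        "drop": lambda: stack.pop(),
        "swap": lambda: stack.extend([stack.pop(), stack.pop()]),
    }

    # Words defined in Forth itself: name -> list of tokens.
    colon_defs = {}

    def run(tokens):
        tokens = iter(tokens)
        for tok in tokens:
            if tok == ":":
                # ": name body... ;" defines a new word from existing ones.
                name = next(tokens)
                body = []
                for t in tokens:
                    if t == ";":
                        break
                    body.append(t)
                colon_defs[name] = body
            elif tok in colon_defs:
                run(colon_defs[tok])
            elif tok in prims:
                prims[tok]()
            else:
                # Anything else is assumed to be an integer literal.
                stack.append(int(tok))

    def evaluate(source):
        run(source.split())
        return list(stack)  # copy of the data stack after evaluation

    return evaluate
```

For example, after `forth = make_forth()`, evaluating `": square dup * ;"` defines a new word in terms of the primitives, and `forth("3 square")` then leaves 9 on the stack. Everything above the handful of primitives is ordinary Forth source, which is the part that stays portable across architectures.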

One can write a C compiler right on top of that, sufficient to compile TCC.

Alternatively, a simple Lisp can be written on top of the Forth, which is much simpler than writing it in assembly. Using the Lisp, a much more understandable and auditable C compiler can be written.

Much of the Forth, all of the Lisp, and much of the C compiler (except code generation) would be portable and reusable across multiple architectures, without the need to audit them fully every time.

The fun part here is (potentially) not using QEMU and cross-compilers, and instead running everything on sufficiently powerful target hardware, for the extra paranoid.

This repo[0], which has the stage0 code used by GP's linked repo, includes some explanation in the README about why not Forth or Lisp. In summary, it turned out to be harder than that, at least for the authors of that project.

0. https://github.com/oriansj/stage0

  • This is pretty interesting. But the author sets a much more difficult goal:

    > the goal of creating a bootstrapping path to a C compiler capable of compiling GCC, with only the explicit requirement of a single 1 KByte binary or less.

    Iä! I would not limit the size drastically, but would emphasize simplicity and legibility, in hopes that achieving correctness this way would be easier.

i think this is a good approach, but almost everyone who has tried it has gotten bogged down in the usual problems with forth, where you get distracted by writing clever code instead of writing fairly boring, straightforward code that gets the job done in a dumb way. and where things fail in difficult-to-debug ways, so you spend a lot of time debugging. virgil dupras's duskos is the only candidate for an exception to this rule. hopefully you will write a second one!

  • I see. I would keep the Forth code to the minimum necessary to implement simple things, with an emphasis on things being understandable and hence auditable. Smart code is good; clever, bad.