SectorC: A C Compiler in 512 bytes (2023)

13 hours ago (xorvoid.com)

If this implementation had existed in the 1980s, the C standard would have a rule that different tokens hashing to the same 16-bit value invoke undefined behavior, and optimizing compilers in the 2000s would simply optimize such tokens away to a no-op. ;)

I may be the author.. enjoy! It was an absolute blast making this!

Oh, it looks like my X86-16 boot sector C compiler that I made recently [1]. Writing boot sector games has a nostalgic magic to it, when programming was actually fun and showed off your skills. It's a shame that the AI era has terribly devalued these projects.

[1] https://github.com/Mati365/ts-c-compiler

  • > when programming was actually fun and showed off your skills

    Oh no. Now more people are able to do what I do. I'm not special anymore.

    • Seems like this is facetious but to me, “I’m not special” is a pretty valid thing to be sad about.

There seems to be a good amount of interest in a boot sector compiler!!

If you're running on Linux, adjust the qemu call to use alsa rather than coreaudio.

I opened a pull request for this on GitHub. If the author is happy enough with my verbose shell scripting style :-) it might get included.

Compare that to the C compiler in 100,000 lines written by Claude in two weeks for $20,000 (I think it was posted on HN just yesterday)

  • It's a fun comparison, but with the notable difference that that one can compile the Linux kernel and generate code for multiple different architectures, while this one can only compile a small proportion of valid C. It's a great project, but it's not so much a C compiler, as a compiler for a subset of C that allows all programs this compiler can compile to also be compiled by an actual C compiler, but not vice versa.

The way hashing is used for tokens and for making a pseudo symbol table is such an elegant idea.

  • I think the same. Really nice project and good trick with hashing tokens.

    PS. There are 21 bytes left (21 * 0x00 - from 0x01e0 to 0x01fd). Maybe something can be packed in there ;)
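
For anyone who wants to see the hashing trick in miniature, here is a rough C sketch. It assumes an atoi-style fold (multiply by 10, add the character minus '0', let it wrap at 16 bits) and a flat 64K-entry table indexed directly by the hash. The names `tok_hash`, `vars`, `var_set` and `var_get` are made up for illustration; the real thing is x86-16 assembly and may differ in detail.

```c
#include <stdint.h>

/* Hypothetical helper: fold a token's characters into a 16-bit value the
   way atoi() folds digits, letting the arithmetic wrap modulo 65536.
   A decimal literal such as "42" therefore hashes to its own value. */
static uint16_t tok_hash(const char *s)
{
    uint16_t h = 0;
    while (*s) {
        h = (uint16_t)(h * 10u + (unsigned)(*s - '0'));
        s++;
    }
    return h;
}

/* Pseudo symbol table: one 16-bit slot per possible hash value, so a
   variable lookup is a single indexed load with no string comparison. */
static uint16_t vars[65536];

static void var_set(const char *name, uint16_t value)
{
    vars[tok_hash(name)] = value;
}

static uint16_t var_get(const char *name)
{
    return vars[tok_hash(name)];
}
```

Under this scheme the identifiers `ak` and `ba` both fold to 549, so they silently share one slot. That is the kind of collision the undefined-behavior joke at the top of the thread is about.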

This is so cool!

Fun fact, Tiny C Compiler was derived from such a C compiler submitted to the International Obfuscated C Code Contest.

https://www.ioccc.org/2001/bellard/index.html

Nice, now you can dd it to your boot sector and ... Wait, it is 2026, there are 1000 ways of booting and memory mapping on so-called unified ARM architecture @,@

C-subset, to be precise; but microcomputer C compilers were in the tens of KB range, for one that can actually compile real C.

> I wrote a fairly straight-forward and minimalist lexer and it took >150 lines of C code

was it supposed to be "<150"?

  • They're saying the naive implementation was more than 150 lines of C code (300-450 bytes), i.e. too big.