Comment by lambdaone

7 days ago

Amazing, and you could see how this could be translated into perhaps a couple of thousand lines of assembly code to bootstrap Scheme almost from the bare metal, similar to the early IBM 709 LISP.

A thought: I wonder if an LLM would be up to the job of writing the assembly code from this?

This is the path that the GNU Mes and Guix folks are taking: https://guix.gnu.org/en/blog/2023/the-full-source-bootstrap-...

(sans LLMs -- I believe they have a Scheme (GNU Mes) that can be compiled from their 357-byte bootloader, and that Scheme can run a C compiler that can compile tinycc, and I think there's a path from tinycc to compiling gcc. I'm not sure how far it can get after that -- this blog post[1] implies that you still need a binary of Guile to run Gash, which is a POSIX shell written in Scheme. I'm guessing the plan is to either simplify Gash or complexify Mes to be able to remove Guile as a dependency.)

[1] https://guix.gnu.org/en/blog/2023/the-full-source-bootstrap-...

You could just use a C compiler to write the assembly code from this, and it'd be far less buggy.

Before there were LLMs, there were about 65 years of other program-writing-programs to save labor.

  • Generally speaking, program-writing-programs (when they are not either buggy, or being updated to a new version) predictably write the same thing from the same specification according to rules.

    LLMs do not replace program-writing-programs; they should be used to work with program-writing-programs, not as a replacement.

    E.g. I wouldn't use an LLM to convert a grammar to LR parsing tables, but perhaps to get help with the grammar file.
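To make the "same thing from the same specification" point concrete, here is a minimal sketch of a program-writing program. It is far simpler than an LR table generator: it turns a nested-tuple arithmetic expression into instructions for a made-up stack machine. The instruction names (`push`, `add`, `mul`) and the tuple format are assumptions for illustration only, not anything from the thread or from a real tool.

```python
def emit(expr, out=None):
    """Translate a nested-tuple expression into stack-machine instructions.

    The output depends only on the input, so the same specification
    always yields the same program.
    """
    if out is None:
        out = []
    if isinstance(expr, int):
        out.append(f"push {expr}")
    else:
        op, lhs, rhs = expr
        emit(lhs, out)   # operands first (post-order walk)
        emit(rhs, out)
        out.append({"+": "add", "*": "mul"}[op])
    return out


def run(program):
    """Interpret the emitted instructions, to check what was generated."""
    stack = []
    for line in program:
        if line.startswith("push "):
            stack.append(int(line[5:]))
        else:
            b = stack.pop()
            a = stack.pop()
            stack.append(a + b if line == "add" else a * b)
    return stack.pop()


spec = ("+", 1, ("*", 2, 3))   # hypothetical specification: 1 + 2 * 3
program = emit(spec)
```

Running `emit(spec)` any number of times gives the identical instruction list, which is the property an LLM does not guarantee.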

A C compiler can output fairly readable code if you turn off optimizations, and it's definitely not going to take thousands of lines to do this in modern assembly. It may be only just barely a thousand lines to do this in aarch64, and the LLM can probably do it.

From what I've seen the LLM do, it can definitely enhance these programs if you know what to ask, and it can explain how any piece of this code works. It may even be able to add garbage collection to the evaluator, since the root registers are explicit and the evaluator only acts on static memory.
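As a rough illustration of why explicit root registers and static memory make garbage collection tractable, here is a minimal mark-and-sweep sketch. Everything in it is an assumption: the register names, the heap size, and the simplification that every `car`/`cdr` field is either a cell index or `NIL`. It is not the evaluator's actual layout.

```python
HEAP_SIZE = 8
NIL = None  # hypothetical encoding for "no cell"

# Static memory: parallel arrays, one slot per cons cell.
car = [NIL] * HEAP_SIZE
cdr = [NIL] * HEAP_SIZE
marked = [False] * HEAP_SIZE

# Hypothetical root registers; a real evaluator would have its own set.
registers = {"exp": NIL, "env": NIL, "val": NIL, "stack": NIL}


def mark(cell):
    """Mark every cell reachable from `cell`, using an explicit worklist."""
    pending = [cell]
    while pending:
        c = pending.pop()
        if c is NIL or marked[c]:
            continue  # already visited, so cycles terminate
        marked[c] = True
        pending.append(car[c])
        pending.append(cdr[c])


def collect():
    """Mark from every register, clear unmarked cells, return their indices."""
    for i in range(HEAP_SIZE):
        marked[i] = False
    for root in registers.values():
        mark(root)
    free = [i for i in range(HEAP_SIZE) if not marked[i]]
    for i in free:
        car[i] = cdr[i] = NIL
    return free
```

Because the roots are just the registers, the collector never has to scan a native call stack. For example, after linking cells 0, 1 and 2 together and pointing `registers["env"]` at cell 0, `collect()` returns every other cell index as free.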

If your Scheme were a native compiler rather than an interpreter (i.e. like Chez) and written in (a constrained subset of) Scheme, then you could use an itty-bitty Scheme interpreter written like this to bootstrap the core of your native Scheme compiler so that the interpreted version of it could compile itself to native binary.

Given that this was the path that one of Dr. Dybvig's post-doc acolytes, Dr. Michael Ashley, taught us in his Scheme-based compiler class, I must guess that this was also Kent's path and that that was the intent for this little interpreter. I suppose I should (finally) take the time to read his dissertation.

> I wonder if an LLM would be up to the job of writing the assembly code from this?

I could see a compiler doing that.

  • I'm quite aware of the existence of compilers, having worked on bootstrapping a production LISP compiler in the past. My point is that it would be an interesting experiment to do this "naïvely", given how close C is to (for example) PDP-11 assembly code.

> I wonder if an LLM would be up to the job of writing the assembly code from this

Why ask a cluster of GPUs to do something any single CPU from the last 30 years can accomplish?

  • I ask this question every day, and I think the answer is the same as the answer to the question "why are you doing this with blockchain?"

If it's speed you're after, much more analysis (and thus code) is needed. See V8.