Comment by nethunters

5 years ago

Thanks for clearing that up.

Which languages or implementations of languages directly interpret the AST without the intermediary bytecode compilation?

I know Python, Java and JavaScript (V8 and SpiderMonkey) all compile to bytecode first, probably to speed up subsequent runs and to run some optimisations on the bytecode.

What other benefits are there to compiling to bytecode first vs directly interpreting the AST?

One major benefit of compiling to bytecode first is that bytecode is a more convenient shared source of truth.

For example, SpiderMonkey has two interpreter and two compiler tiers. The output of the parser is bytecode. We can interpret it immediately, and it's a convenient input to the compilers. It also simplifies transitions between tiers: for example, when we bail out of Warp, we resume at a specific bytecode offset in the baseline interpreter.

I'm not sure how you would resume at a particular point in the execution of an AST-based interpreter without agreeing on a linearized traversal of the AST, which is 90% of the way to bytecode.
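To make the "resume at a bytecode offset" point concrete, here is a minimal sketch of a toy expression language, not SpiderMonkey's actual design. The tuple-based AST, the opcode names (`PUSH`, `ADD`, `MUL`) and the functions `compile_ast` and `run` are all assumptions made up for illustration. The idea it shows: once the tree is flattened, the whole execution state is just an offset plus a value stack, so one executor can stop and another can pick up from the same place.

```python
# Hypothetical toy language: AST nodes are nested tuples such as
# ("add", left, right), ("mul", left, right) or ("num", value).

def compile_ast(node, code=None):
    """Flatten the AST into a linear list of (opcode, operand) pairs (post-order)."""
    if code is None:
        code = []
    kind = node[0]
    if kind == "num":
        code.append(("PUSH", node[1]))
    elif kind in ("add", "mul"):
        compile_ast(node[1], code)   # left operand first
        compile_ast(node[2], code)   # then right operand
        code.append((kind.upper(), None))
    else:
        raise ValueError(f"unknown node kind: {kind!r}")
    return code

def run(code, pc=0, stack=None, max_steps=None):
    """Execute starting at offset pc; return (pc, stack) so execution can be resumed."""
    stack = [] if stack is None else list(stack)
    steps = 0
    while pc < len(code) and (max_steps is None or steps < max_steps):
        op, arg = code[pc]
        if op == "PUSH":
            stack.append(arg)
        elif op == "ADD":
            right = stack.pop()
            left = stack.pop()
            stack.append(left + right)
        elif op == "MUL":
            right = stack.pop()
            left = stack.pop()
            stack.append(left * right)
        else:
            raise ValueError(f"unknown opcode: {op!r}")
        pc += 1
        steps += 1
    return pc, stack

ast = ("add", ("num", 1), ("mul", ("num", 2), ("num", 3)))   # 1 + 2 * 3
code = compile_ast(ast)

# Stop partway through, as if one tier bailed out...
pc, stack = run(code, max_steps=3)
# ...then resume from the saved offset and stack, as if in another tier.
pc, stack = run(code, pc=pc, stack=stack)
```

With a tree walker, the equivalent of `pc` would be a position somewhere in a recursive traversal, which is much harder to hand from one executor to another.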

Perl is a tree-based interpreter. It's not exactly an AST anymore by the time it's being executed, but close enough.

If you compile to AST and walk that then your portability is at the source level; you have to send the source over the wire; each target then needs a parser+walker and each target needs to parse+walk. If you compile to bytecode you can send bytecode over the wire and then simply interpret that bytecode.
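A rough sketch of that split, assuming a made-up stack bytecode (`PUSH`, `ADD`, `MUL`) and JSON as the wire format purely for illustration; real VMs use compact binary encodings. The sender ships already-compiled bytecode, and the receiver needs only a decoder and an execution loop, no parser.

```python
import json

def encode(code):
    """Sender side: serialize a list of (opcode, operand) pairs to bytes."""
    return json.dumps(code).encode("utf-8")

def decode(payload):
    """Receiver side: rebuild the instruction list from the received bytes."""
    return [tuple(instr) for instr in json.loads(payload.decode("utf-8"))]

def execute(code):
    """Receiver side: a minimal stack machine; no source parser is needed here."""
    stack = []
    for op, arg in code:
        if op == "PUSH":
            stack.append(arg)
        elif op == "ADD":
            right = stack.pop()
            left = stack.pop()
            stack.append(left + right)
        elif op == "MUL":
            right = stack.pop()
            left = stack.pop()
            stack.append(left * right)
        else:
            raise ValueError(f"unknown opcode: {op!r}")
    return stack.pop()

# Bytecode for 2 * 3 + 1, assumed to have been compiled once on the sender.
program = [("PUSH", 2), ("PUSH", 3), ("MUL", None), ("PUSH", 1), ("ADD", None)]
wire_bytes = encode(program)
result = execute(decode(wire_bytes))
```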

Portability. Say I wanted to make language X run on all platforms, but I didn't actually care about compiling it on all platforms. I can just write a relatively simple VM for each platform. This is one of the reasons Java was and still kinda is so ubiquitous.

  • Wouldn't writing an interpreter for each platform be less work and achieve the same goal as writing a VM for each platform?

    Edit: ^Aside from being able to execute the bytecode on any platform

    • Why would it be less work? The interpreter will need to implement whatever operations a VM can perform, so a priori it's at least as much work. Bonus: if you can bootstrap the source->bytecode process, then you only need to write (and compile) that once to get a full-fledged interpreter on every host with a VM.

    • As others mentioned, source code should be distributed that way, and I think creating a simple VM is easier than a simple language parser. But of course, an optimizing one can be really quite complex in both cases.

Gambit Scheme, CLISP and CMUCL are capable of interpreting ASTs, and I believe (although I'm not 100% sure) that this is the case for LispWorks and Allegro CL as well. Also Ruby, up to and including version 1.8 (1.9 switched to the YARV bytecode VM).