Comment by densh
2 days ago
Hey, as someone who spent a few years reimplementing another language to decouple it from the JVM (Scala JVM -> Scala Native), here are some pitfalls to avoid:
- Don't try to provide a backwards-compatible subset of JVM APIs. While it might seem tempting to support very important library X with just a bit of work, I'd rather see new APIs that are only possible with your language / runtime. Otherwise you might end up stuck in a never-ending stream of requests to add one more JVM feature to get yet another library from the original JVM language running. Focus on providing your own unique APIs or bindings to native projects that might not be easy to do elsewhere.
- Don't implement your own GC, just use mmtk [1]. It takes a really long time to implement something competitive, and mmtk already has an extensible and pluggable GC design that gets some of the best performance available today [2] without much effort on your end.
- Don't underestimate the complexity and importance of multi-threading and concurrency. Try to think about supporting some form of it early, or you might get stuck in a single-threaded world forever (see CPython). Maybe you don't do shared-memory multi-threading, in which case it could be quite easy to implement (as in Erlang). No shared memory also means no shared heap, which makes the GC's life much easier.
- Don't spend too much time benchmarking and optimizing single-threaded performance against the JVM as the performance baseline. If you don't have a compelling use case (usually due to unique libraries), the performance might not matter enough for users to migrate to your language. When you do optimize, I'd rather see fast startup and an interactive environment (think V8) over slow startup that is eventually efficient after a very long warmup (like the JVM).
I see that jank is already doing at least some of these things right based on the docs, so this message might be more of a dump of mistakes I've made previously in this space.
> Don't try to provide a backwards-compatible subset of JVM APIs.
Yeah, jank doesn't do much with JVM APIs or the JVM at all. We have our own implementation of the compiler and runtime. It has similarities to Clojure's design, but only because the object model somewhat demands that.
> Don't implement your own GC, just use mmtk [1].
Yep, already the plan. Currently using Boehm, but MMTK is the next upgrade.
> Don't underestimate the complexity and importance of multi-threading and concurrency.
Clojure helps here by having STM, immutable data structures, etc. However, there are some key synchronization points and I do need to audit all of them. jank doesn't have multi-threading support yet, but we will _not_ go the way of Python. jank is Clojure, and Clojurists expect sane multi-threading.
> Don't spend too much time benchmarking and optimizing single-threaded performance against the JVM as the performance baseline.
This year, not much optimization has been done at all. I did some necessary benchmarking early on to inform a few design decisions, but I follow this mantra:
1. Make it work
2. Make it correct
3. Make it fast
I'm currently on step 2 for most of jank. Thanks for sharing the advice!
Very cool project, and I think you are doing it right. Best of luck with getting it off the ground!