Author here - very cool to see this get posted! Thank you @ivankra for adding this to https://github.com/ivankra/javascript-zoo and running those benchmarks, I really appreciate it!
This started as a hobby project that I've ended up putting a lot of time into over the last three years chasing completeness and performance.
Just a small comparison, compiled for release:
Boa: 23M
Brimstone: 6.3M
I don't know if closing the gap on features with Boa and hardening for production use will also bloat the compiled size. Regardless, passing 97% of the spec at this size is pretty impressive.
It looks like Boa has Unicode tables compiled inside of itself: https://github.com/boa-dev/boa/tree/main/core/icu_provider
Brimstone does not appear to.
That covers the vast bulk of the difference. The ICU data is about 10.7MB in the source (boa/core/icu_provider) and may grow or shrink somewhat when compiled.
I'm not saying it's all the difference, just the bulk.
There are a few reasons why svelte little executables with small library backings aren't possible anymore, and it isn't just ambient undefined "bloat". Unicode is a big one. Correct handling of Unicode involves megabytes of tables and data that have to live somewhere, whether that's a linked library, compiled-in data, or tables on disk. If a program touches text and needs to handle it correctly rather than just passing it through, there's a minimum size for that now.
Brimstone does embed Unicode tables, but a smaller set than Boa embeds: https://github.com/Hans-Halverson/brimstone/tree/master/icu.
Brimstone does try to use the minimal set of Unicode data needed for the language itself. But I imagine much of the difference with Boa is because of Boa's support for the ECMA-402 Internationalization API (https://tc39.es/ecma402/).
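Even the language-level operations pull in tables. A small Rust illustration of that (just the standard library, not Brimstone's code): there's no table-free shortcut for case mapping or character classification.

```rust
fn main() {
    // Full Unicode case mapping is table-driven: no ASCII-style arithmetic
    // turns 'ß' into "SS".
    let upper: String = "straße".chars().flat_map(char::to_uppercase).collect();
    assert_eq!(upper, "STRASSE");

    // Same story for classification: "is this a letter?" is a lookup across
    // thousands of code points, not a range check.
    assert!('漢'.is_alphabetic());
}
```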
2 replies →
Unicode is everywhere though. You'd think there'd be much greater availability of those tables and data and that people wouldn't need to bundle it in their executables.
2 replies →
I was curious to see what that data consisted of, and apparently it's a lot of translations, like the names of all possible calendar formats in all possible languages, etc. This seems useless in the vast majority of use cases, including that of a JS interpreter. Looks to me like the typical output of a committee that's looking too hard to extend its domain.
Disclaimer: I never liked Unicode specs.
9 replies →
If someone builds, say, a Korean website and needs sort(), does the ICU monolith handle 100% of the common cases?
(Or substitute for Korean the language that has the largest amount of "stuff" in the ICU monolith.)
1 reply →
As well-defined as Unicode is, it's surprising that no one has tried to replace ICU with a better mousetrap.
Not to say ICU isn't a nice bit of engineering. I recall the table builds in particular having some great hacks.
1 reply →
I was gonna say the last few percent might increase the size disproportionately, as the last few percent tend to do[0], but it looks like Boa passes fewer tests (~91%).
This is something I notice in small few-person or one-person projects. They don't have the resources to build complex architectures so the code ends up smaller, cleaner and easier to maintain.
The other way to look at it is that cooperation has an overhead.
[0]: The famous 80:20 rule. Or the related claim that each additional 9 of reliability (and presumably of other qualities) takes the same amount of work.
Is that with any other size optimizations? I think the defaults are tuned for performance, not binary size, so it might be worth checking whether the results differ if you change things like codegen-units=1, aborting on panic, etc.
Stripping can save a huge amount of binary size; there's a lot of formatting code pulled in for println! and family, stack-trace printing, etc. However, you lose those niceties if you strip at that level.
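For reference, a size-focused release profile typically looks something like this (hypothetical settings, not what either project ships; exact savings vary by crate):

```toml
[profile.release]
opt-level = "z"     # optimize for size rather than speed
lto = true          # whole-program link-time optimization
codegen-units = 1   # better cross-unit optimization, slower builds
panic = "abort"     # drop the unwinding machinery
strip = true        # strip symbols and debug info from the binary
```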
I only ran both with `cargo build --release`
Could you compare it with Boa? It is written in Rust too.
https://github.com/boa-dev/boa
I have some benchmark results here: https://ivankra.github.io/javascript-zoo/?v8=true
It's impressively compliant, considering it's just a one-man project! Almost as fully featured as Boa, plus or minus a few things. And generally faster too, almost double the speed of Boa on some benchmarks.
First time seeing a few of the engines listed here - based on this table I'm surprised Samsung's Escargot hasn't gotten more attention. LGPL, 100% ES2016+ compliance, top-10 perf ranking, 25% the size of V8, and only 318 GitHub stars.
A quick HN search shows 0 comments for Escargot - is there some hidden problem with this engine not covered in this table?
1 reply →
Surprised at the lack of a license though.
Interesting. Hermes and QuickJS both come out looking very good in these (in terms of performance vs. binary size)
Why is stuff written in Rust always promoted as "written in Rust", like it's some magic thing?
I'm old enough to have seen the "written in lisp", "written in ruby", "written in javascript" eras, among others. It's natural.
It carries some weight, very roughly in the direction of formal verification, since (assuming there isn't any unsafe) a specific class of bugs is guaranteed not to happen.
However, this repo seems like it uses quite a bit of unsafe, by their own admission.
There's a lot of unsafe in this, at least. Hard to be both safe and fast.
1 reply →
I mean, if I cared about safety that much I would just write the damn thing in ATS. Rust has too many escape hatches to be safe anyway.
1 reply →
One simple reason: for those of us invested in the Rust ecosystem it helps us spot new projects we could consider using.
Because many people, including myself, have been consistently experiencing better quality from Rust-written software.
Maybe it's the type of language that attracts people who are interested in getting the details right.
Or maybe the qualities of the language mean that if a project manages to reach the production stage, it will be better than an alternative that reaches the same stage, because the minimum level of quality and checks required is higher.
Or maybe it's because there's very little friction to install and use the software, since Rust software usually comes with binaries for all popular platforms, and often installers.
Or maybe the ecosystem is just very good.
Or maybe it's all together, and something more.
Doesn't matter.
The fact is, I have had a better experience with software written in Rust than in Python, JS, or even Go or Java.
And I appreciate knowing the software is not written in C or C++ and doesn't potentially contain the security, segfault, and encoding problems that are going to bite me down the road, as has been common over the last 30 years.
So "written in Rust" is a thing I want to know, as it makes me more likely to try said software.
I think the idea is like: it took extra work 'cause Rust makes you be so explicit about allocations and types, but it's also probably faster/more reliable because that work was done.
Of course, at the end of the day it's just marketing and doesn't necessarily mean anything. In my experience the average piece of Rust software does seem to be of higher quality though.
Even setting aside the memory safety and async safety guarantees, the language design produces lower-defect code by a wide margin. Google and other orgs have written papers about this.
There are no exceptions. There are no nulls. You're encouraged to return explicit errors. No weird error flags or booleans or unexpected ways of handling abnormal behaviors. It's all standardized. Then the language syntax makes it easy to handle and super ergonomic and pleasurable. It's nice to handle errors in Rust. Fully first class.
Result<T,E>, Option<T>, match, if let, if let Ok, if let Some, while let, `?`, map, map_err, ok_or, ok_or_else, etc. etc. It's all super ergonomic. The language makes this one of its chief concerns, and writing idiomatic Rust encourages you to handle errors smartly.
Because errors were so well thought out, you write fewer bugs.
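A minimal sketch of what that looks like in practice (a hypothetical parse_port helper, standard library only):

```rust
fn parse_port(raw: Option<&str>) -> Result<u16, String> {
    // Option -> Result with a concrete error, then `?` to bail out early.
    let s = raw.ok_or("no port configured")?;
    // Turn the parse failure into our error type instead of throwing anything.
    let n = s.trim().parse::<u32>().map_err(|e| e.to_string())?;
    // The range check is just another Result.
    u16::try_from(n).map_err(|_| format!("{n} is out of range for a port"))
}

fn main() {
    // `match` forces the caller to acknowledge both outcomes.
    match parse_port(Some("8080")) {
        Ok(port) => println!("listening on {port}"),
        Err(msg) => eprintln!("config error: {msg}"),
    }
    // `if let` when only the failure case matters.
    if let Err(msg) = parse_port(None) {
        eprintln!("config error: {msg}");
    }
}
```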
Finally, with the way the language makes you manage scope, it's almost impossible to write complicated nesting or difficult-to-follow logic. It's hard to describe this one unless you have experience writing Rust, but it's a big contributor to high-quality code.
Rust code is highly readable and easy to reason about (once you learn the syntax). There are no surprises with Rust. It's written simply and straightforwardly and does what it says on the tin.
7 replies →
The people that complain about projects saying they were written in rust are far more annoying than the projects themselves at this point.
In the case of libraries, this distinction is important; we've set up our computing infrastructure in a way replete with barriers that drive repeated efforts in isolation. Therefore, if a library is written in Rust, it suggests I can use it in my Rust program clear of one conspicuous type of barrier.
For an application, service, etc. like this... it is not relevant.
It is becoming a meme, like Arch: "It is written in Rust, btw."
Usually people think Rust = fast, so "Written in Rust" might imply it runs fast.
Yes. Getting really odd...
When it's a library, it's fairly important what it's written in (or at least written for).
Usually if a project isn't using unsafe, it means memory bugs are not a thing while still promising non-GC speeds.
However, this project is using a ton of unsafe (partly to offer GC behavior for js): https://github.com/search?q=repo%3AHans-Halverson%2Fbrimston...
[flagged]
The virtue of being memory safe is probably more a value than a virtue.
Also, without associated social virtue signaling, what do you think is wrong with signaling?
"Compacting garbage collector, written in very unsafe Rust" got me cracking.
Sorry for going off-topic, but I really miss the cracktros. Imagine having an Ikari intro before you boot into your OS.
Sorry also for being offtopic, but "cracking" in this case most likely refers to cracking [with laughter].
1 reply →
Plymouth lets you do it on Linux without hacking around like on macOS or Windows.
There's no license I can see
That was an oversight, this is now under the MIT license!
Great to see more projects not opting into licenses which permit megacorp exploitation by default
Many megacorps provide value to users. For example, Google and Apple are used by maybe 75% of humanity. Google in particular appears to have given back to the ecosystem (often to Google's detriment). It isn't as binary as you make it out to be.
Memory safety is one of Rust’s biggest selling points. It’s a bit baffling that this engine would choose to implement unsafe garbage collection.
The obvious use-case for unsafe is to implement alternative memory regimes that don’t exist in Rust already, so you can write safe abstractions over them.
Rust doesn’t have the kind of high performance garbage collection you’d want for this, so starting with unsafe makes perfect sense to me. Hopefully they keep the unsafe layer small to minimise mistakes, but it seems reasonable to me.
I'm curious if it can be done in Rust entirely though. Maybe some assembly instructions are required e.g. for trapping or setting memory fences.
3 replies →
Even using something as simple as Vec means using `unsafe` code (from the std library). The idea isn't to have no `unsafe` code (which is impossible). It's to limit it to small sections of your code that are much more easily verifiable.
For some use cases, that means that "user code" can have no `unsafe`. But implementing a GC is very much not one of those.
Rust also has some nice language features. Even unsafe rust doesn't have the huge "undefined behaviour" surface that languages like C++ still contain.
If I were to write a toy JS runtime in Rust, I'd try to make it as safe as possible and deal with unsafe only when optimization starts to become necessary, but it's not like that's the only way to use Rust.
That’s the philosophy. Use the less constrained (but still somewhat constrained and borrow checked) unsafe to wrap/build the low level stuff, and expose a safe public API. That way you limit the exposure of human errors in unsafe code to a few key parts that can be well understood and tested.
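A tiny, hypothetical illustration of the pattern (not from this codebase): the `unsafe` block sits behind one small function whose invariants are checked up front, and everything else goes through the safe wrapper.

```rust
/// Returns mutable references to two *distinct* elements of a slice.
/// The borrow checker can't express this directly, so the implementation
/// uses `unsafe`, but the invariants (i != j, both in bounds) are checked
/// first, which is what makes the public function safe to call.
pub fn pair_mut<T>(slice: &mut [T], i: usize, j: usize) -> Option<(&mut T, &mut T)> {
    if i == j || i >= slice.len() || j >= slice.len() {
        return None;
    }
    let ptr = slice.as_mut_ptr();
    // SAFETY: i and j are in bounds and distinct, so the two &mut never alias.
    unsafe { Some((&mut *ptr.add(i), &mut *ptr.add(j))) }
}

fn main() {
    let mut values = [1, 2, 3, 4];
    if let Some((a, b)) = pair_mut(&mut values, 0, 3) {
        std::mem::swap(a, b);
    }
    assert_eq!(values, [4, 2, 3, 1]);
}
```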
The whole point of unsafe is to be able to circumvent the guardrails where the developer knows something the compiler isn't (yet) smart enough to understand. It's likely that implementing a high-performance GC runs afoul of quite a few of those edge cases.
IMO the memory safety aspect is overblown by enthusiasts and purists. Rust is an overall nice fast imperative language.
Rust WAS really nice before it got mangled with syntax like we've never seen before. Graydon did not imagine Rust as what it is today. Rust core without async is OK, but in practice Rust projects tend to have hundreds of deps and really slow compiles. It's just like JavaScript with npm.
5 replies →
how does this compare to existing JS engines?
If you look at bench or conformance here, https://github.com/ivankra/javascript-zoo?tab=readme-ov-file..., you can get an idea of how it compares in some ways
You can embed this one in your Rust programs. No linking to C/C++. All native Rust.
That little 40 MB single-binary server you wrote can now be scripted in JavaScript.
This is frankly awesome, and now there are multiple Rust-native JavaScript engines. And they both look super ergonomic.
[flagged]
> who in the fuck would write a garbage collector using garbage collected Rust?
Rust is not garbage collected unless you explicitly opt into using Rc/Arc
If you count Rc/Arc as garbage collection you should count RAII + The Borrow Checker (i.e. all safe rust) as garbage collection too IMHO. It collects garbage just as automatically - it just does so extremely efficiently.
That said, I tend to count neither. Garbage collection to me suggests you have something going around collecting it, not just detecting that you're done with something and dealing with it yourself at that point.
I still wouldn't call it GC in that case. It's pretty much exactly the same as std::shared_ptr in C++, and we don't usually call that GC. I don't know about the academic definition, but I draw the line at a cycle collector. (So e.g. Python is GC'd, but Rust/C++/Swift are not.)
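For what it's worth, the mechanics under discussion in a few lines (plain standard-library Rc, nothing project-specific): reclamation happens deterministically when the last reference goes away, with nothing tracing the heap.

```rust
use std::rc::Rc;

struct Node(&'static str);

impl Drop for Node {
    fn drop(&mut self) {
        // Runs deterministically when the last Rc is dropped;
        // no collector thread, no pause.
        println!("freed {}", self.0);
    }
}

fn main() {
    let a = Rc::new(Node("a"));
    {
        let b = Rc::clone(&a);
        println!("refs: {}", Rc::strong_count(&b)); // 2
    } // `b` goes out of scope here; the count drops back to 1
    println!("refs: {}", Rc::strong_count(&a)); // 1
} // `a` dropped here; "freed a" prints immediately
```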
1 reply →
Rust is not garbage collected though.
Yes, but safe Rust enforces strict borrow checking, and tracing, reference counting, etc. within those rules would be inefficient for a GC implementation.
1 reply →
[flagged]
Love the name of the executable ;) For my taste it just sounds right. BS as in Bullsh#t :)