Comment by lordofgibbons
1 day ago
The more your compiler does for you at build time, the longer it will take to build, it's that simple.
Go has sub-second build times even on massive codebases. Why? Because it doesn't do a lot at build time. It has a simple module system, a (relatively) simple type system, and leaves a whole bunch of work to be handled by the GC at runtime. It's great for its intended use case.
When you want things like macros, advanced type systems, and robustness guarantees at build time, you have to pay for that.
I think this is mostly a myth. If you look at Rust compiler benchmarks, typechecking isn't _free_, but it's also not the bottleneck.
A big reason that amalgamation builds of C and C++ can absolutely fly is that they aren't reparsing headers, and they generate exactly one object file, so the linker has almost no work to do.
Once you add static linking to the toolchain (in all of its forms), things get really fucking slow.
Codegen is also a problem. Rust tends to generate a lot more code than C or C++, so even once the compiler is done with most of its typechecking work, the backend and assembler have a lot to churn through.
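If you want to see that split on your own crate, recent toolchains can break it down per crate, codegen time included:

    cargo build --timings   # writes an HTML report under target/cargo-timings/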
The meme that static linking is slow or produces anything other than the best executables is demonstrably false and the result of surprisingly sinister agendas. Get out readelf and nm and ps sometime and do the arithmetic: most programs don't link much of glibc (and its static linking is broken by design; musl is better at just about everything). Matt Godbolt has a great talk about how dynamic linking actually works that should give anyone pause.
DLLs got their start when early windowing systems didn't quite fit on the workstations of the era in the late 80s / early 90s.
In about 4 minutes both Microsoft and GNU were like, "let me get this straight, it will never work on another system and I can silently change it whenever I want?" Debian went along because it gives distro maintainers degrees of freedom they like and don't bear the costs of.
Fast forward 30 years and Docker is too profitable a problem to fix by the simple expedient of calling a stable kernel ABI on anything, and don't even get me started on how penetrated everything but libressl and libsodium is. Protip: TLS is popular with the establishment because even Wireshark requires special settings and privileges for a user to see their own traffic. Security patches, my ass. eBPF is easier.
Dynamic linking moves control from users to vendors and governments at ruinous cost in performance, props up bloated industries like the cloud compute and Docker industrial complex, and should die in a fire.
Don't take my word for it, swing by cat-v.org sometime and see what the authors of Unix have to say about it.
I'll save the rant about how rustc somehow manages to be slower than clang++ and clang-tidy combined for another day.
CppCon 2018: Matt Godbolt, "The Bits Between the Bits: How We Get to main()"
https://www.youtube.com/watch?v=dOfucXtyEsU
I think you're confused about my comment and this thread - I'm talking about build times.
1 reply →
> …surprisingly sinister agendas.

> …

> Dynamic linking moves control from users to vendors and governments at ruinous cost in performance, props up bloated industries...
This is ridiculous. Not everything is a conspiracy!
3 replies →
Not only does it generate more code; the initially generated code, before optimizations, is also often worse. For example, heavy use of iterators means a ton of generics being instantiated and a ton of call code for setting up and tearing down call frames. This all gets heavily inlined and flattened out, so in the end it's extremely well-optimized, but it's a lot of work for the compiler. Writing it all out classically with for loops and ifs is possible, but harder to read.
For loops are sugar around an Iterator instantiation. Sketching it with placeholders (`xs` for the iterable, `f` for the loop body), this:
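    for x in xs {
        f(x); // `xs` and `f` are placeholders for any iterable and loop body
    }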
translates to roughly
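    // per the std::iter docs, modulo some details:
    let mut iter = IntoIterator::into_iter(xs);
    loop {
        match iter.next() {
            Some(x) => f(x),
            None => break,
        }
    }

And every adapter in a chain (map, filter, ...) layers another generic instantiation on top of this before the optimizer flattens it back out.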
> Once you add static linking to the toolchain (in all of its forms) things get really fucking slow.
Could you expand on that, please? Every time you run a dynamically linked program, it is linked at runtime (unless it explicitly avoids linking unnecessary stuff by dlopening things lazily, which pretty much never happens). If it's fine to link on every program launch, linking once at build time should not be a problem at all.
If you want to have link time optimization, that's another story. But you absolutely don't have to do that if you care about build speed.
The Swift compiler is definitely bottlenecked by type checking. For example, as a language requirement, generic types are left more or less intact after compilation. They are type checked independently of how they are used. This is unlike C++ templates, which are effectively copy-pasted with the concrete types substituted at every point of instantiation.
This has tradeoffs: increased ABI stability at the cost of longer compile times.
> This has tradeoffs: increased ABI stability at the cost of longer compile times.
Nah. Slow type checking in Swift is primarily caused by the fact that functions and operators can be overloaded on type.
Separately-compiled generics don't introduce any algorithmic complexity and are actually good for compile time, because each generic body is type checked once instead of at every expansion.
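Rust happens to expose both strategies, so here's a rough analogy (Rust, not Swift; the names are made up):

    use std::fmt::Display;

    // Monomorphized, C++-template-style: a fresh copy is generated and
    // optimized for every concrete T it's called with.
    fn show_mono<T: Display>(x: T) {
        println!("{x}");
    }

    // Type-erased: one compiled body dispatching through a vtable, closer
    // in spirit to Swift's separately-compiled generics (witness tables).
    fn show_erased(x: &dyn Display) {
        println!("{x}");
    }

One body to type check and compile once, versus one copy per instantiation: that's the compile-time side of the ABI-stability coin.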
2 replies →
A lot can be done by the programmer to mitigate slow builds in Swift: breaking up long expressions into smaller ones, for example, and adding explicit types where type inference is expensive.
I’d like to see tooling for this to pinpoint bottlenecks - it’s not always obvious what’s making builds slow.
3 replies →
>This is unlike C++ templates which are effectively copy-pasting the resolved type with the generic for every occurrence of type resolution.
Even this can lead to unworkable compile times, to the point that code is rewritten.
>Codegen is also a problem. Rust tends to generate a lot more code than C or C++
Wouldn't you say a lot of that comes from the macros and (by way of monomorphisation) the type system?
Modern C++ in particular does a lot of similar, albeit not identical, codegen due to its extensive metaprogramming facilities. (C is, of course, dead simple.) I've never looked into it too much but anecdotally Rust does seem to generate significantly more code than C++ in cases where I would intuitively expect the codegen to be similar. For whatever reason, the "in theory" doesn't translate to "in practice" reliably.
I suspect this leaks into both compile-time and run-time costs.
Go is static by default and still fast as hell
Delphi is static by default and incredibly fast too.
1 reply →
That the type system is responsible for Rust's slow builds is a common and enduring myth. `cargo check` (which just does typechecking) is actually usually pretty fast. Most of the build time is spent in the code generation phase. Some macros do cause problems, as you mention, since the crate that defines a macro must be compiled before the code that uses it, so they reduce parallelism.
> Most of the build time is spent in the code generation phase.
I can believe that, but even so it's caused by the type system monomorphising everything. When you use qsort from libc, you are using pre-compiled code from a library. When you use slice::sort(), you get custom assembly compiled to suit your application. Thus there is a lot more code generation going on, and that is caused by the tradeoffs they've made with the type system.
Rust's approach gives you all sorts of advantages, like fast code and strong compile time type checking. But it comes with warts too, like fat binaries, and a bug in slice::sort() can't be fixed by just shipping a new std dynamic library, because there is no such library. It's been recompiled, just for you.
FWIW, modern C++ (like boost) that places everything in templates in .h files suffers from the same problem. If Swift suffers from it too, I'd wager it's the same cause.
It's partly the type system. You could implement a std::sort (or slice::sort()) that just delegates to qsort, or a qsort-like implementation, and get roughly the same compile time performance as using qsort straight.
But not having to is a win, as the monomorphised sorts are just much faster at runtime than having to do an indirect call for each comparison.
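To make that concrete, here's a toy sketch of the two shapes (insertion sort standing in for the real algorithm):

    use std::cmp::Ordering;

    // qsort-style: one compiled body for all callers; every comparison
    // is an indirect call through the function pointer.
    fn sort_indirect(v: &mut [i32], cmp: fn(&i32, &i32) -> Ordering) {
        for i in 1..v.len() {
            let mut j = i;
            while j > 0 && cmp(&v[j], &v[j - 1]) == Ordering::Less {
                v.swap(j, j - 1);
                j -= 1;
            }
        }
    }

    // slice::sort-style: a fresh copy is generated per (T, F) pair, so the
    // comparator inlines away -- faster at runtime, but more code to emit.
    fn sort_generic<T, F: Fn(&T, &T) -> Ordering>(v: &mut [T], cmp: F) {
        for i in 1..v.len() {
            let mut j = i;
            while j > 0 && cmp(&v[j], &v[j - 1]) == Ordering::Less {
                v.swap(j, j - 1);
                j -= 1;
            }
        }
    }

Every distinct (element type, comparator) pair you call sort_generic with gets its own compiled copy; sort_indirect is compiled exactly once.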
1 reply →
I just ran cargo check on nushell, and it took a minute and a half. I didn't time how long it took to compile, maybe five minutes earlier today? So I would call it faster, but still not fast.
I was all excited to conduct the "cargo check; mrustc; cc" is 100x faster experiment, but I think at best, the multiple is going to be pretty small.
Did you do it from a clean build? In that case, it's actually a slightly misleading metric, since rust needs to actually compile macros in order to typecheck code that uses them. (And therefore must also compile all the code that the macro depends on.) My bad for suggesting it, haha. Incremental cargo check is often a better way of seeing how long typechecking takes, since usually you haven't modified any macros that will need to be recompiled. On my project at work, incremental cargo check takes `1.71s`.
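For anyone reproducing the comparison, something like this gets you the warm number:

    cargo check        # cold: also compiles proc macros and their dependencies
    touch src/lib.rs   # dirty just your own crate
    cargo check        # warm: much closer to pure typechecking time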
1 reply →
A ton of that is actually still doing codegen (for the proc macros for example).
Yes, but I'd also add that Go specifically does not optimize well.
The compiler is optimized for compilation speed, not runtime performance. Generally speaking, it does well enough, especially because its use case is often applications where "good enough" is good enough (i.e., IO-heavy applications).
You can see that with "gccgo". Slower to compile, faster to run.
Go defaults to a lightly optimized build; if you want heavy optimization passes, you have to turn them on with flags. Rust's release builds default to running most of those optimizations, and let you turn them down.
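On the Rust side the dials live in Cargo profiles; the opt-level values below are the documented defaults:

    [profile.dev]        # plain `cargo build`
    opt-level = 0

    [profile.release]    # `cargo build --release`
    opt-level = 3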
Is gccgo really faster? Last time I looked, it seemed abandoned (stuck at Go 1.18, with no generics support) and not really faster than the "actual" compiler.
Digging around, it looks like it's workload dependent.
For pure computational workloads it'll be faster. However, anything with heavy allocation will suffer, as apparently gccgo's GC and GC-related optimizations aren't as good as those of the standard gc toolchain.
Not really. The root reason behind Go's fast compilation is that it was specifically designed to compile fast. The implementation details are just a natural consequence of that design decision.
Since fast compilation was a goal, every part of the design was looked at through a rough "could this be a horrible bottleneck?" lens, and discarded if so. For example, the import (package) system was designed to avoid the horrible, inefficient mess of C++: it's obvious that you never want to compile the same package more than once and that you need to support parallel package compilation. These may be blindingly obvious, but if you don't think about compilation speed at design time, you'll get them wrong and will never be able to fix them.
As far as optimizations vs compile speed goes, it's just a simple case of diminishing returns. Since Rust has maximum possible performance as a goal, it's forced to go well into diminishing-returns territory, sacrificing a ton of compile speed for minor performance improvements. Go has far more modest performance goals, so it can get 80% of the possible performance for only 20% of the compile cost. Rust can't afford to relax its stance because it's competing with languages like C++, and to some extent C, that are willing to go to any length to squeeze out an extra 1% of performance.
> Go has sub-second build times even on massive code-bases.
Unless you use sqlite, in which case your build takes a million years.
Try https://github.com/ncruces/go-sqlite3 - it runs SQLite in WASM with wazero, a pure Go WASM runtime, so it builds without any CGo required. Most of the benchmarks are within a few % of the performance of mattn/go-sqlite3.
Yeah, I deal with multiple Go projects that take a couple of minutes just to link the final binary, never mind building all the intermediates.
Compilation speed depends on what you do with a language. "Fast" is not an absolute, and for most people it depends heavily on community habits. Rust habits tend to favor extreme optimizability and/or extreme compile-time guarantees, and that's obviously going to be slower than simpler code.
That's not really true. As a counterexample, OCaml has a very advanced type system, full type inference, generics, and all that jazz. Still, it's on par with Go in compile speed, or even faster.
Dlang compilers do more than any C++ compiler (metaprogramming, a better template system, and compile-time execution), and they're hugely faster. Language syntax design has a role here.