Comment by Rochus

13 hours ago

I don’t like LLVM either, because its size and complexity are simply spiraling out of control, and especially because I consider the IR to be a total design failure. If I used LLVM at all, it would be version 4.0.1 or 3.4 at most. But it is the standard, especially if you want to run tests related to the question the fellow asked above. The alternative would be to build a frontend for GCC, but that is no less complex or time-consuming (and ultimately, you’re still dependent on binutils). However, C on LLVM or GCC should probably be considered the “upper bound” when it comes to how well a program can be optimized, and thus the benchmark for any performance measurement.

> However, C on LLVM or GCC should probably be considered the “upper bound” when it comes to how well a program can be optimized, and thus the benchmark for any performance measurement.

Is it? Isn't it rather the case that C is too low level to express intent and (hence) offer room to optimize? I would expect that a language in which, e.g. matrix multiplication can be natively expressed, could be compiled to more efficient code for such.
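To make the point concrete, here is a minimal sketch of how matrix multiplication typically has to be spelled out in C. The function name `matmul` and the row-major, square-matrix layout are assumptions chosen for illustration, not something taken from a particular codebase.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper: naive row-major n x n multiply, c = a * b.
 * The source only states loops and index arithmetic; that this is a
 * matrix product is something an optimizer would have to rediscover. */
void matmul(size_t n, const double *a, const double *b, double *c) {
    assert(a != NULL && b != NULL && c != NULL);
    for (size_t i = 0; i < n; i++) {
        for (size_t j = 0; j < n; j++) {
            double sum = 0.0;
            for (size_t k = 0; k < n; k++) {
                /* Element (i, k) of a times element (k, j) of b. */
                sum += a[i * n + k] * b[k * n + j];
            }
            c[i * n + j] = sum;
        }
    }
}
```

In a language with a native matrix product, the same operation would be a single expression, leaving the implementation free to pick loop order, blocking, or a library kernel. Whether that freedom pays off in practice is the open question here.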

I would rather expect that, for compilers that don't optimize well, C is the easiest language to produce fairly efficient code for (well, perhaps BCPL would be even easier, but nobody wants to use that these days).

  • > I would expect that a language in which, e.g. matrix multiplication can be natively expressed, could be compiled to more efficient code for such.

    That's exactly the question we would hope to answer with such an experiment. Given that your language received sufficient investment to implement an optimal LLVM adaptation (as C did), we would then expect your language to be significantly faster on a benchmark that depends heavily on matrix multiplication. If not, this would mean that the optimizer can get away with any language and the specific language design features have little impact on performance (and we can use them without performance worries).

When you call LLVM IR a design failure, do you mean its semantic model (e.g., memory/UB), or its role as a cross-language contract? Is there a specific IR property that prevents clean mapping from Oberon?

  • Several historical design choices within the IR itself have created immense complexity, leading to unsound optimizations and severe compile-time bloat. It's not high-level enough to spare you from, e.g., ABI details, and it's not low-level enough to actually take care of those ABI details in a decent way. And it's a continuously moving target: you cannot implement something that then continues to work.