Comment by teleforce
11 hours ago
>I still don’t understand why we lack a language that will take uncomplicated computation heavy code and turn it into SIMD / multi thread / multiprocessing / GPU code with minimal additional syntax.
It already (partly) exists: the D language. By default it's garbage collected (GC), but it can also be programmed without the GC or in a hybrid style. It's a modern language, backward compatible with C, and it's included in GCC.
The linear algebra system in D, Mir GLAS, is a standalone BLAS implementation written directly in D [2]. It was already benchmarked as faster than other widely used conventional BLAS libraries like OpenBLAS back in 2016, about ten years ago!
The popular OpenBLAS includes the Fortran-based LAPACK (yes, you read that right, Fortran) and is used by almost all data processing languages today, such as Matlab, Julia, Rust and also Mojo [1].
Interestingly, there is a very early-stage standalone BLAS implementation written directly in Mojo, namely mojoBLAS, similar to Mir GLAS, that was started very recently [3].
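For anyone unfamiliar with what these libraries compete on: the headline BLAS routine is GEMM, which computes C = alpha * A * B + beta * C. Below is a minimal reference sketch in plain Python, only to show the operation itself. It is not how Mir GLAS, OpenBLAS or mojoBLAS implement it, and the function name `gemm` and the list-of-lists layout are my own assumptions for illustration.

```python
def gemm(alpha, a, b, beta, c):
    """Reference GEMM: return alpha * (a @ b) + beta * c.

    a is m x k, b is k x n, c is m x n, all as row-major lists of lists.
    Hypothetical helper for illustration, not any library's API.
    """
    m, k = len(a), len(b)
    n = len(b[0])
    if any(len(row) != k for row in a):
        raise ValueError("inner dimensions of a and b do not match")
    if len(c) != m or any(len(row) != n for row in c):
        raise ValueError("c must be m x n")

    # Start from beta * c, then accumulate alpha * a @ b into it.
    out = [[beta * c[i][j] for j in range(n)] for i in range(m)]
    for i in range(m):
        for p in range(k):
            aip = alpha * a[i][p]
            # Innermost loop walks one row of b and one row of out.
            for j in range(n):
                out[i][j] += aip * b[p][j]
    return out
```

The i-p-j loop order keeps the innermost loop on contiguous rows, which in a compiled language is the shape of loop an auto-vectorizer can turn into SIMD. Tuned BLAS libraries go much further with blocking and hand-written kernels.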
>Surely this is the sort of thing compiler / language design nerds dream about?
You can say that again.
Especially on the GC side of programming languages, since all of this SIMD / multi thread / multiprocessing / GPU work can be abstracted away.
Actually, someone recently proposed VGC, a virtualized garbage collector for Python in C++, for heterogeneous GC [4], [5]. However, the current evaluation excludes JIT compilation, AOT optimization, SIMD acceleration, and GPU offloading.
[1] OpenBLAS:
https://en.wikipedia.org/wiki/OpenBLAS
[2] Numeric age for D: Mir GLAS is faster than OpenBLAS and Eigen:
http://blog.mir.dlang.io/glas/benchmark/openblas/2016/09/23/...
[3] mojoBLAS:
https://github.com/shivasankarka/mojoBLAS
[4] Virtual Garbage Collector (VGC): A Zone-Based Garbage Collection Architecture for Python's Parallel Runtime:
https://arxiv.org/abs/2512.23768
[5] VGC-for-arxiv:
I don't think Mojo depends on OpenBLAS or any other BLAS implementation. I remember that in the early days they took a lot of pride in how linalg primitives like matmul, written entirely in Mojo, were faster than MKL, OpenBLAS and other implementations.
Delightful, thank you! Would love to see a version of D that auto-vectorizes to Vulkan or something.