Comment by JonChesterfield
1 year ago
The default aliasing model interacts extremely poorly with atomic and vector types. It also means "malloc" can't be written in C, which really should be a sign that the language was devolving into nonsense. I don't want to write malloc in asm because ISO think C is a plausible candidate for writing application code.
(I also want to do things like mutate bytes of the machine code but overall I can make peace with doing that in buffers that aren't currently executing. It's very much not allowed to cast some bytes to a function pointer, even if you've got the ABI right, and that's ridiculous)
Well, I am glad to inform you that I am writing a language (working name "Troglodyte") that will have exactly the desired ("do what I wrote, damn it!") semantics with regard to memory allocations, pointer casting etc., specifically for cavemen like me who sometimes want something more high-level than assembler but still low-level enough to be able to accidentally drop a sharp rock on my foot.
It will have machine-native integers, arrays of such integers, arrays of bytes, functions that take/return machine-wide integers, and that's pretty much it for the data types. All arithmetic is two's-complement, signed or unsigned as you desire, and pointers are just numbers as well. Quality-of-life features include being able to straight-up include arbitrary binaries as function definitions:
I'm very sympathetic to wanting a language that gives you direct access to the same control assembly does while letting you delegate register allocation / isel / calling convention and so forth to a compiler where it doesn't matter so much.
LLVM IR is interesting in that respect. Some libraries are written in it directly (all compiler runtimes afaik but there are probably others). The ergonomics on it are rather poor but writing exactly the control flow graph you want does work. It doesn't give control over register allocation or scheduling but does give control over the CFG and a decent approximation to instruction selection.
Some years ago I wrote compute kernels in terms of vector types and compiler intrinsics, including some that could target DSP style compute-and-branch operations. Adjusting the code and the compiler simultaneously worked really well - matched and thus replaced the handwritten assembly for the simpler cases. I miss that ISA / toolchain combination.
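To make "kernels in terms of vector types" concrete, here is a minimal sketch using the GCC/Clang `vector_size` extension. It is not the toolchain or ISA described above; the type name `v4i32` and the function `axpy4` are made up for illustration, and the code assumes a compiler that supports this extension.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical 4-lane vector of 32-bit integers (GCC/Clang extension). */
typedef int32_t v4i32 __attribute__((vector_size(16)));

/* Computes a * x + y lane by lane. The compiler chooses the actual
 * instructions; the source only fixes the data layout and the operations. */
v4i32 axpy4(int32_t a, v4i32 x, v4i32 y)
{
    v4i32 va = {a, a, a, a};   /* explicit broadcast of the scalar */
    return va * x + y;
}
```

The appeal is that register allocation and instruction selection stay with the compiler, while the vector width and the element-wise arithmetic are spelled out in the source.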
I'd be curious to hear what troglodyte can handle that C with compiler extensions and the type aliasing optimisations discarded cannot. Specifically the "language" I really should get around to writing for x64 and/or amdgpu would be "C" with:
1. an intrinsic for each instruction, with a function type. Probably as a header file with a lot of declarations in it.
2. mark variables as allocated to specific registers, or maybe in a set of registers. Syntax.
3. some notation for bespoke calling conventions. Syntax.
4. some notation for ordering instructions (i.e. scheduling them)
5. a lattice of address spaces (global outlives stack etc, a bit gpu specific)
The intrinsic per instruction definitely works. That gives a pretty way to write AVX code or similar. Marking variables as allocated to specific registers also works (gcc does this to an extent with the asm annotation). Point 3 and onwards is where I get hazy on how to make it work sensibly. Custom calling conventions work really well in a compiler backend, and in C they need to be expressed in the function type, but I'm still obsessing over how best to do that.
What part of aliasing prevents malloc() from being written in C? All pointers can be cast to char * and back. Casting from void * is also legal.
Pointers to unallocated memory are not allowed to be read from, in any form. You have to manage memory with uintptr_t arithmetic which arguably makes pointer arithmetic in C almost completely useless for writing malloc().
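A rough sketch of what "managing memory with uintptr_t arithmetic" tends to look like, as a bump allocator over a fixed buffer. Everything here (`arena`, `ARENA_SIZE`, `align_up`, `bump_alloc`) is hypothetical, and the pointer-to-integer conversion is implementation-defined, so this assumes the usual flat address space. Whether storing other types into a declared `unsigned char` array is sanctioned is part of what this thread is arguing about; the sketch only shows the address arithmetic.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical arena: a static buffer standing in for memory from the OS. */
#define ARENA_SIZE 4096
static unsigned char arena[ARENA_SIZE];
static size_t arena_used = 0;

/* Round addr up to a multiple of align, which must be a power of two. */
static uintptr_t align_up(uintptr_t addr, uintptr_t align)
{
    assert(align != 0 && (align & (align - 1)) == 0);
    return (addr + align - 1) & ~(align - 1);
}

/* Returns size bytes aligned to align, or NULL if the arena is exhausted. */
void *bump_alloc(size_t size, size_t align)
{
    uintptr_t base = (uintptr_t)arena;
    uintptr_t start = align_up(base + arena_used, align);
    size_t offset = (size_t)(start - base);

    if (offset > ARENA_SIZE || size > ARENA_SIZE - offset)
        return NULL;

    arena_used = offset + size;
    /* Derive the result from arena itself rather than casting the integer
     * back, so the pointer stays tied to the underlying array. */
    return arena + offset;
}
```

The alignment maths happens on integers, and the result is re-derived as an offset into the array; only the offset survives the round trip through `uintptr_t`.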
That's not a consequence of effective type (i.e., strict aliasing) rules, that's a consequence of pointer provenance rules. And a little more generally, it's a consequence of malloc having special object system behavior that there's no way to express in pure C code.
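For contrast, here is a small sketch of what the effective type rules themselves govern. The helper names `byte_sum` and `float_bits` are invented for illustration, and the sketch assumes `float` is 32 bits wide.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Allowed: any object's bytes may be read through unsigned char *. */
unsigned byte_sum(const void *obj, size_t n)
{
    const unsigned char *p = obj;
    unsigned sum = 0;
    for (size_t i = 0; i < n; i++)
        sum += p[i];
    return sum;
}

/* Allowed: copy the representation instead of reinterpreting the object.
 * The tempting alternative, *(uint32_t *)&f, reads a float object through
 * an incompatible lvalue type, which the effective type rules forbid. */
uint32_t float_bits(float f)
{
    uint32_t u;
    assert(sizeof u == sizeof f);   /* assumption: 32-bit float */
    memcpy(&u, &f, sizeof u);
    return u;
}
```

Neither function involves memory whose lifetime or provenance is in question, which is why these rules are a separate matter from the malloc problem.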