Comment by neonsunset
8 months ago
> The individual operations in the repository (e.g., dot product) look like they could be autovectorized. I'm assuming they aren't because of the use of a slice. I'm mildly curious if it could be massaged into something autovectorized.
> Most of my observations re: autovectorization in Go have been on fixed-size vectors and matrices, where SSE2 instructions are pretty readily available and loop unrolling is pretty simple.
Go does not have any form of autovectorization. The only way to use SIMD instructions in Go is through functions written in Go assembly (Goasm). Moreover, Go's math library does not ship SIMD primitives, i.e. inlineable functions implemented with SIMD instructions, which would have removed the need for autovectorization in the first place.
> I'm curious if you could speak more to this? Is the concern that operations may get reordered?
Autovectorization brittleness is a large topic. The analysis is expensive, and vectorization may be impossible when it would violate program order or change observable side effects. On top of that, it often needs multiple expensive optimization phases, coupled with a complex compiler IR and back-ends, to target multiple platforms efficiently. That does not fit well with the Go compiler's design (at least, such is my amateur impression from looking at its source code).
Go's compiler should not be treated as if it were in the same class as GCC or LLVM, because it is anything but. It is a grade below .NET's RyuJIT/ILC and OpenJDK's HotSpot. Go's design decisions and practices make it a somewhat easier optimization target than .NET CIL, which lets it maintain relative parity on general-purpose code that is light on abstractions (if the code is heavy on those, Go starts to fall behind).
Your message applies to one particular Go compiler, the one from Google. But since you mention GCC and LLVM, it is also possible to use them to compile Go. Each implementation makes different trade-offs in the quality of generated code, runtime, and language features.
Okay, I've heard this argument enough times to know it's unreasonable, but feel free to prove me wrong :)
We have this go-attention library, which seems like a perfect candidate for an alternate compiler. How do I get Go compiled to a reasonably good, autovectorized result here?
Compile your whole program with gccgo?