
Comment by GistNoesis

1 day ago

These optimizations are often out of reach when the compiler needs to perform a loop interchange first.

If the order of the operations matters because the operations aren't associative, the compiler can't legally reorder them.

You can pass special flags (-fassociative-math) that allow the compiler to treat floats as if they were associative, but then your program's result depends on which optimizations the compiler happened to pick.
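To see why the result then depends on the optimizer's choice, here is a sketch comparing the left-to-right sum the C standard mandates against the kind of lane-wise partial sums a vectorizer might emit under -fassociative-math (the input values are hypothetical, picked so the two orders round differently):

```c
#define N 8

/* Sequential left-to-right sum: the order the source specifies. */
float sum_seq(const float v[N]) {
    float s = 0.0f;
    for (int i = 0; i < N; i++)
        s += v[i];
    return s;
}

/* Lane-wise sum: four partial accumulators combined at the end,
   mimicking a 4-wide SIMD reduction that -fassociative-math
   would permit. Same math on paper, different rounding. */
float sum_lanes(const float v[N]) {
    float lane[4] = {0.0f, 0.0f, 0.0f, 0.0f};
    for (int i = 0; i < N; i++)
        lane[i % 4] += v[i];
    return (lane[0] + lane[1]) + (lane[2] + lane[3]);
}
```

With an input like `{1e8f, 1, 1, 1, 1, -1e8f, 1, 1}` the sequential sum absorbs the small terms next to `1e8f` and returns 2.0f, while the lane-wise sum keeps two of them in separate lanes and returns 4.0f.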

And it turns out these loop-reordering optimizations are really useful when you need to do reverse-mode automatic differentiation, because in the automatically generated code for the backward pass, all the loops are basically iterated in reverse.
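A toy illustration of that reversal (the `forward`/`backward`/`tape` names are mine, not from any particular AD tool): the forward pass is a chain of multiplies whose loop runs 0..N-1, and the generated backward pass must run the same loop N-1..0, reading intermediate values saved on a tape.

```c
#define N 4

/* Forward: out = x * w[0] * ... * w[N-1]; tape[i] records the
   value *before* multiplying by w[i]. */
float forward(float x, const float w[N], float tape[N]) {
    float v = x;
    for (int i = 0; i < N; i++) {      /* forward loop order */
        tape[i] = v;
        v *= w[i];
    }
    return v;
}

/* Backward: the same loop, iterated in reverse, propagating
   d(out)/dv back through each multiply via the chain rule. */
void backward(const float w[N], const float tape[N], float dw[N]) {
    float dv = 1.0f;                    /* d(out)/d(out) */
    for (int i = N - 1; i >= 0; i--) {  /* reversed loop order */
        dw[i] = dv * tape[i];           /* d(out)/d(w[i]) */
        dv   *= w[i];                   /* push gradient earlier */
    }
}
```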

But the memory access patterns of the backward pass are not contiguous unless you interchange the loop order, which the compiler can't legally do because of floats. Nor can it then merge loops together, which is really useful: if you can fuse the forward pass with the backward pass, you don't have to keep intermediate values on a "tape".
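Here is a sketch of that contiguity problem for the backward pass of y = A*x, whose gradient is dx = Aᵀ*dy (the function names and sizes are hypothetical). The natural order walks columns of A with stride N; the interchanged version reads A row by row, but it changes the accumulation order of each dx[j], which the compiler can't do on its own for floats:

```c
#include <stddef.h>

#define N 256

/* dx = A^T * dy, natural order: j outer, i inner, so A[i][j]
   is read down a column with stride N (cache-unfriendly). */
void backward_naive(const float A[N][N], const float dy[N], float dx[N]) {
    for (size_t j = 0; j < N; j++) {
        float acc = 0.0f;
        for (size_t i = 0; i < N; i++)
            acc += A[i][j] * dy[i];   /* strided column walk */
        dx[j] = acc;
    }
}

/* Interchanged: i outer, j inner reads A contiguously, row by
   row, but each dx[j] now accumulates its terms in a different
   order, so float results can differ from the naive version --
   exactly the reassociation the compiler may not do legally. */
void backward_interchanged(const float A[N][N], const float dy[N], float dx[N]) {
    for (size_t j = 0; j < N; j++)
        dx[j] = 0.0f;
    for (size_t i = 0; i < N; i++)
        for (size_t j = 0; j < N; j++)
            dx[j] += A[i][j] * dy[i];  /* contiguous row access */
}
```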

So basically you can't rely on compiler optimizations, which means your auto-differentiator can't benefit from existing compiler progress. (Have a look at either Julia's Zygote or Enzyme, which rely on compiler optimizations chaining well.) Or you write the backward passes manually.