Comment by gjm11

9 months ago

My understanding of the situation is that:

1. Waksman's algorithm works in any commutative ring admitting division by 2.

2. In particular, it won't work when the matrix entries are themselves matrices, which means you can't use it recursively to get an algorithm for n-by-n matrices with large n with a better exponent than you get from Strassen's algorithm.

3. The Deep Mind paper is annoyingly unexplicit about whether the algorithm it reports has that property or not.

4. What they say about tensors suggests that their algorithm can be used recursively to do better than Strassen (but, note, there are other algorithms that are substantially better for very large n which using their algorithm recursively would very much not outperform) but it's possible I've misunderstood.

5. They explicitly talk about complex-valued matrices, but I think they don't mean "complex numbers as opposed to matrices, so you can't do this recursively" but "complex numbers as opposed to real numbers, so our algorithm doesn't get you a 4x4 matmul using 48 real multiplications".

I am not certain about points 4 and 5. The language in the paper is a bit vague. There may be supporting material with more details but I haven't looked.

2 comments

gjm11

wbhart 9 months ago

1. Correct

2. Correct, however you can use Waksman as a basecase and always beat Strassen (though it is not asymptotically better of course).

5. Possible, but even so, there is already an algorithm that will work with 46 real multiplications (and some divisions by 2). The real numbers are commutative and admit division by 2.

gjm11 9 months ago

My point about #5 was just that their emphasis on "complex numbers" isn't necessarily (and I think probably isn't) an admission that their algorithm can't be applied recursively.
If it can be, then it is a genuine advance in the sense that it yields faster large-n matmuls than were previously available just by recursive application of a 4x4 matmul algorithm.
None of which, to be clear, makes it OK that they don't make any mention of earlier work that does 4x4 matmuls over commutative rings faster than their algorithm does. I suspect they didn't check the literature carefully enough before publication. In any case, good scholarship means being clear about what in your work is a genuine advance on what's already known and what isn't, and they failed to do that here.