Comment by p1esk

6 days ago

> The big thing is that sparse models allow you to train models with significantly larger dimensionality, blowing up the dimensions by several orders of magnitude.

Do you have any evidence to support this statement? Or are you imagining not-yet-invented algorithms running on not-yet-invented hardware?

Sparse matrices can grow in dimension while keeping the same number of non-zeros; that part is self-evident. Sparse-weight models can be trained: you are probably already aware of RigL and SRigL, and there is other related work on both unstructured and structured sparse training. You could argue that those methods adapt their algorithms to remain executable on GPUs, and that none of them train at 100x or 1000x the dimensions. Yes, that is the part that requires access to sparse-compute hardware acceleration, which today exists either as research prototypes [1] or as extremely expensive commercial systems (Cerebras).

[1] https://dl.acm.org/doi/10.1109/MM.2023.3295848
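For concreteness, here is a minimal sketch of the prune-and-regrow idea behind RigL-style sparse training, written in PyTorch with a dense boolean mask. The function name and parameters are illustrative, not taken from the RigL codebase, and the drop/grow criteria are simplified (magnitude-based drop, gradient-based regrow on a single tensor). The point it shows is that the non-zero budget is fixed up front, so the layer's dimensions can grow without growing the number of stored weights.

```python
import torch

def rigl_update(weight: torch.Tensor, grad: torch.Tensor,
                sparsity: float = 0.99, regrow_frac: float = 0.1) -> torch.Tensor:
    """One RigL-style mask update (sketch): keep a fixed budget of non-zeros,
    drop the smallest-magnitude active weights, and regrow the same number of
    connections where the dense gradient magnitude is largest."""
    numel = weight.numel()
    n_keep = max(1, int(numel * (1.0 - sparsity)))   # fixed non-zero budget
    n_swap = max(1, int(n_keep * regrow_frac))       # connections to replace

    flat_w, flat_g = weight.flatten(), grad.flatten()
    mask = torch.zeros(numel, dtype=torch.bool)

    # 1) keep the largest-magnitude weights, then drop the weakest n_swap of them
    keep_idx = flat_w.abs().topk(n_keep).indices
    drop_pos = flat_w.abs()[keep_idx].topk(n_swap, largest=False).indices
    survivors = torch.ones(n_keep, dtype=torch.bool)
    survivors[drop_pos] = False
    mask[keep_idx[survivors]] = True

    # 2) regrow where the gradient is largest among currently inactive weights
    scores = flat_g.abs().masked_fill(mask, float("-inf"))
    mask[scores.topk(n_swap).indices] = True

    # same nnz budget (n_keep) regardless of how wide the layer is
    return mask.view_as(weight)

# Example: a 4096x4096 layer at 99% sparsity stores ~168k weights; an 8192x8192
# layer at 99.75% sparsity stores the same number.
w, g = torch.randn(4096, 4096), torch.randn(4096, 4096)
print(rigl_update(w, g).sum().item())   # 167772 active weights
```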

  • Unstructured sparsity cannot be implemented in hardware efficiently if you still want to do matrix multiplication. If you don’t want to do matrix multiplication, you first need to come up with new algorithms and test them in software. This reminds me of what Numenta tried to do with their SDRs; note that they didn’t quite succeed.

    • > Unstructured sparsity cannot be implemented in hardware efficiently if you still want to do matrix multiplication.

      Hard disagree. It is certainly an order of magnitude harder to design hardware for sparse × sparse matrix multiplication (SpMSpM), yes, and doing sparse compute efficiently requires a paradigm shift, but there are hardware architectures, both in research and commercially available, that do it efficiently. The same kind of architecture is needed to scale operation-graph compute. You see solutions at the smaller scale in FPGAs and reconfigurable/dataflow accelerators, and at the larger scale in Intel's PIUMA and Cerebras. I've been involved in co-design work between GraphBLAS on the software side and one of the aforementioned hardware platforms: the main obstacle to developing SpMSpM hardware is that the necessary capital and engineering investment is being prioritized toward current frontier AI model accelerators, not a lack of proven results.
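To make the access-pattern point concrete, below is a small sketch of row-wise sparse × sparse matrix multiplication (Gustavson's algorithm) over CSR operands in Python/SciPy. The function name and the random test operands are illustrative, not from any of the systems mentioned above. The thing to notice is that every gather into B is driven by column indices discovered at runtime in A's current row: exactly the irregular, data-dependent traffic that fixed dense-matmul datapaths handle poorly and that dataflow/reconfigurable designs target.

```python
import numpy as np
from scipy.sparse import csr_matrix, random as sparse_random

def spmspm_gustavson(A: csr_matrix, B: csr_matrix) -> csr_matrix:
    """Row-by-row sparse x sparse multiply (Gustavson's algorithm)."""
    n_rows, n_cols = A.shape[0], B.shape[1]
    indptr, indices, data = [0], [], []
    for i in range(n_rows):
        acc = {}  # sparse accumulator for row i of the result
        for jj in range(A.indptr[i], A.indptr[i + 1]):
            k, a_ik = A.indices[jj], A.data[jj]
            # gather row k of B -- which row, and how many entries it has,
            # is only known once A's row has been read
            for kk in range(B.indptr[k], B.indptr[k + 1]):
                j = B.indices[kk]
                acc[j] = acc.get(j, 0.0) + a_ik * B.data[kk]
        cols = sorted(acc)
        indices.extend(cols)
        data.extend(acc[j] for j in cols)
        indptr.append(len(indices))
    return csr_matrix((data, indices, indptr), shape=(n_rows, n_cols))

# sanity check against SciPy's own product on small random operands
A = sparse_random(500, 500, density=0.002, format="csr")
B = sparse_random(500, 500, density=0.002, format="csr")
assert np.allclose(spmspm_gustavson(A, B).toarray(), (A @ B).toarray())
```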