Comment by p1esk

6 days ago

“Useful” does not mean “better”. It just means “we could not do dense”. All modern state-of-the-art models use dense layers (dense in both weights and inputs). Quantization is likewise used to make models smaller and faster, but never better in terms of quality.

Based on all the examples I’ve seen so far in this thread, it’s clear there’s no evidence that sparse models actually work better than dense models.