Comment by jchandra

5 hours ago

That’s a great point and yeah, I’d agree SVD itself isn’t new at all.

On downsides: definitely a few. The biggest one is latency: SVD is fairly heavy, so even though the cost is amortized (it runs periodically, not per token), it still adds noticeable overhead. It’s also more complex than simple pruning, and I haven’t yet validated how well it holds up on real downstream tasks.
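
To make the "amortized" part concrete, here is a minimal sketch of the general pattern. It is not the prototype's actual code. It assumes a buffer that grows by one vector per step and is replaced by its rank-k approximation every `interval` steps. The names `low_rank`, `run`, `interval`, and `rank`, the random input data, and the sizes are all hypothetical, chosen only for illustration.

```python
import numpy as np


def low_rank(matrix, rank):
    """Best rank-`rank` approximation of `matrix` via truncated SVD."""
    u, s, vt = np.linalg.svd(matrix, full_matrices=False)
    # Keep the top `rank` singular triplets and rebuild the matrix.
    return (u[:, :rank] * s[:rank]) @ vt[:rank, :]


def run(steps, dim=16, rank=4, interval=32, seed=0):
    """Append one vector per step; run the SVD only every `interval` steps.

    All parameters are illustrative assumptions, not values from the prototype.
    """
    rng = np.random.default_rng(seed)
    rows = np.empty((0, dim))
    svd_calls = 0
    for step in range(1, steps + 1):
        # Cheap per-step work: append one new (random, stand-in) vector.
        rows = np.vstack([rows, rng.standard_normal(dim)])
        # Heavy step, paid once per `interval` steps rather than per step.
        if step % interval == 0:
            rows = low_rank(rows, rank)
            svd_calls += 1
    return rows, svd_calls


rows, svd_calls = run(steps=128)
print(rows.shape, svd_calls)
```

Even in this toy version the tradeoff is visible: the SVD runs once per 32 steps instead of every step, but each call is over the whole buffer, so that one step is much slower than the others.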

This is very much a research prototype right now, more about exploring a different tradeoff space than something ready for production.