Comment by bee_rider

7 hours ago

Were there any downsides or difficulties?

It would be sort of surprising if an SVD-based opportunity was missed (since it is such a familiar tool). But, your entropy and least-squares ideas are necessary to set that up, so I guess it makes sense that you’d find some new territory here.

That’s a great point and yeah, I’d agree SVD itself isn’t new at all.

On downsides: definitely a few. The biggest one is latency. SVD is fairly heavy, so even though the cost is amortized (it runs periodically, not per token), it still adds noticeable overhead. It’s also more complex than simple pruning, and I haven’t validated how well this holds up on real downstream tasks yet.
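To make the "amortized, not per token" part concrete, here is a rough sketch of the general pattern, not the actual implementation. Everything in it is an assumption for illustration: the `AmortizedCompressor` class, the `interval` and `rank` parameters, and the idea of treating the state as a plain matrix of rows. New rows are appended cheaply, and the truncated SVD only runs once every `interval` appends.

```python
import numpy as np


def truncated_svd(matrix, rank):
    """Return factors (A, B) such that A @ B is the best rank-`rank` fit to `matrix`."""
    u, s, vt = np.linalg.svd(matrix, full_matrices=False)
    return u[:, :rank] * s[:rank], vt[:rank]


class AmortizedCompressor:
    """Hypothetical buffer that recompresses once every `interval` appended rows."""

    def __init__(self, dim, rank, interval):
        self.rank = rank
        self.interval = interval
        self.factors = None                  # (A, B) for the compressed history
        self.pending = np.empty((0, dim))    # uncompressed recent rows
        self.svd_calls = 0                   # counts how often the heavy step runs

    def append(self, row):
        # Cheap path taken on every step: just stash the new row.
        self.pending = np.vstack([self.pending, row[None, :]])
        if len(self.pending) >= self.interval:
            self._compress()

    def _compress(self):
        # Heavy path taken once per `interval` steps: refit the low-rank factors.
        full = self.reconstruct()
        self.factors = truncated_svd(full, self.rank)
        self.pending = np.empty((0, full.shape[1]))
        self.svd_calls += 1

    def reconstruct(self):
        """Approximate history: compressed part followed by the pending rows."""
        if self.factors is None:
            return self.pending
        a, b = self.factors
        return np.vstack([a @ b, self.pending])
```

With `interval=16`, 40 appends trigger only two SVD calls, but each call is a latency spike on the step where it lands, which is the overhead described above.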

This is very much a research prototype right now. It’s more about exploring a different tradeoff space than something ready for production.