
Comment by coderenegade

3 days ago

It's fairly standard to prune the hell out of a model for deployment, because many of the parameters end up close to zero. That doesn't really help with explainability of the parameters, though, because (imo) that's a dead end. You assume the data is iid and a representative sample of whatever god-given function generated it, and you throw a universal approximator at it precisely because it's impossible to come up with an a priori function that models the data in the first place.
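To make the pruning point concrete, here's a minimal magnitude-pruning sketch. The function name and the global-threshold scheme are mine for illustration, not any particular library's API; real deployments usually use framework tooling (e.g. structured or iterative pruning) instead.

```python
import numpy as np

def magnitude_prune(weights, sparsity=0.9):
    """Zero out the smallest-magnitude fraction of a weight matrix.

    Because many trained parameters sit near zero, aggressive
    sparsity levels often cost little accuracy.
    """
    flat = np.abs(weights).ravel()
    k = int(sparsity * flat.size)  # number of entries to remove
    if k == 0:
        return weights.copy()
    # k-th smallest magnitude becomes the pruning threshold.
    threshold = np.partition(flat, k - 1)[k - 1]
    pruned = weights.copy()
    pruned[np.abs(pruned) <= threshold] = 0.0
    return pruned

rng = np.random.default_rng(0)
w = rng.normal(scale=0.01, size=(256, 256))
pruned = magnitude_prune(w, sparsity=0.9)
print(f"fraction zeroed: {np.mean(pruned == 0):.3f}")
```

Note that this says nothing about *why* the surviving weights matter, which is the point: pruning compresses the model without making individual parameters any more interpretable.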

Latent space clustering is about as good as it gets imo. In my experience it's fairly stable for an individual implementation (but not necessarily across implementations of the same model, for various reasons), yet it still doesn't tell you anything about the meaning of the parameters themselves. If the model is well calibrated, you can validate its performance, and then it becomes explainable as a unit.
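For anyone unfamiliar with the latent-space-clustering idea: you take the model's internal embeddings and cluster them, then inspect what inputs land in each cluster. A bare-bones sketch with synthetic "embeddings" and plain Lloyd's k-means (names and blob layout are made up for the example; in practice you'd use real activations and something like scikit-learn):

```python
import numpy as np

def kmeans(embeddings, k=3, iters=20, seed=0):
    """Plain Lloyd's k-means over latent embeddings."""
    rng = np.random.default_rng(seed)
    centers = embeddings[rng.choice(len(embeddings), size=k, replace=False)]
    for _ in range(iters):
        # Assign each embedding to its nearest center.
        dists = np.linalg.norm(
            embeddings[:, None, :] - centers[None, :, :], axis=-1
        )
        labels = dists.argmin(axis=1)
        # Recompute centers; keep the old one if a cluster empties out.
        for j in range(k):
            if np.any(labels == j):
                centers[j] = embeddings[labels == j].mean(axis=0)
    return labels, centers

# Synthetic "latent space": three separated blobs standing in for
# the embeddings of three input groups.
rng = np.random.default_rng(1)
blobs = np.concatenate(
    [rng.normal(loc=c, scale=0.1, size=(50, 8)) for c in (0.0, 2.0, 4.0)]
)
labels, centers = kmeans(blobs, k=3)
```

The clusters tell you which inputs the model treats as similar, which is useful for validation, but, as above, it assigns no meaning to any individual parameter.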