Comment by srean
14 hours ago
For linear algebraic transformation applied to several rows at once, I wholeheartedly agree.
Not so convinced about decision trees though (that process one row at a time).
Yeah, unless you had to deal with arbitrarily large integer features, Guile integers would come with a big efficiency hit.
What do you mean by processing one row at a time?
I think one could parallelize processing rows, at the very least when classifying from learned model. Probably also during learning the model.
Yes you can certainly do that.
What I had not articulated well is that linear classifiers have the opportunity to use matvecs that have a different level of L1 L2 cacheable goodness and non-branchy code. There using proper memory layout gives an outstanding win. The win for decision trees are less impressive in comparison, so you needn't be feeling bad about your code.