Comment by pama
1 year ago
I am very familiar with these and other clustering methods in modern ML, and have been involved in inventing and publishing some such methods myself in various scientific contexts. The paper I cited above used 3 nearest neighbors as only one of its baselines, IIRC; that is why I mentioned KNN. However, even boosted trees failed to reduce the loss as much as the algorithm that the decoder transformer learned from the data.