Comment by nico
4 months ago
I just built a logistic regression classifier for emails, and I agree.
Using nothing but embeddings, you can get really good classifiers very cheaply.
Small embedding models work too, and you can engineer different features to embed alongside the raw text.
With email at least, depending on the categories you need, about 50-100 labeled examples were enough for 95-100% accuracy.
And if you build a simple CLI tool to fetch and label emails, it’s pretty easy and fast to get the data.
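
For anyone who wants to see the shape of this, here is a minimal sketch of the idea, not the actual code. It assumes a binary task and uses only the Python standard library. The `embed` function is a hypothetical stand-in (a hashed bag of words) for a call to a real small embedding model, and the six emails are made-up toy data, so it says nothing about real-world accuracy.

```python
import hashlib
import math

DIM = 256  # size of the stand-in embedding; a real model has its own dimension


def embed(text: str) -> list[float]:
    """Placeholder embedding: hashed bag of words, L2-normalised.

    Assumption: in practice this would call a small embedding model instead.
    """
    vec = [0.0] * DIM
    for token in text.lower().split():
        idx = int(hashlib.md5(token.encode("utf-8")).hexdigest(), 16) % DIM
        vec[idx] += 1.0
    norm = math.sqrt(sum(v * v for v in vec)) or 1.0
    return [v / norm for v in vec]


def sigmoid(z: float) -> float:
    # Numerically stable logistic function.
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def train(X: list[list[float]], y: list[int], epochs: int = 300, lr: float = 0.5):
    """Fit logistic regression with plain stochastic gradient descent."""
    w = [0.0] * len(X[0])
    b = 0.0
    for _ in range(epochs):
        for x, target in zip(X, y):
            p = sigmoid(sum(wi * xi for wi, xi in zip(w, x)) + b)
            grad = p - target  # derivative of log loss w.r.t. the logit
            w = [wi - lr * grad * xi for wi, xi in zip(w, x)]
            b -= lr * grad
    return w, b


def predict_proba(w: list[float], b: float, text: str) -> float:
    """Probability that `text` belongs to class 1."""
    x = embed(text)
    return sigmoid(sum(wi * xi for wi, xi in zip(w, x)) + b)


# Toy, made-up data: 1 = billing, 0 = newsletter.
emails = [
    "invoice attached payment due friday",
    "receipt for your recent purchase order",
    "overdue balance reminder please pay",
    "weekly newsletter top stories inside",
    "new blog posts and community updates",
    "monthly digest product tips webinar",
]
labels = [1, 1, 1, 0, 0, 0]

w, b = train([embed(e) for e in emails], labels)
for e in emails:
    print(f"{predict_proba(w, b, e):.2f}  {e}")
```

With a real embedding model, only `embed` changes. The classifier itself stays this small, which is where the low cost comes from.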
I'm interested to see examples! Is this shareable?