Comment by nico

4 months ago

I just built a logistic regression classifier for emails, and I agree.

Using just embeddings as features, you can get really good classifiers very cheaply.

You can use small embedding models too, and you can engineer different features to embed as well.
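To make the idea concrete, here is a rough sketch of logistic regression on top of text embeddings, written with only the Python standard library so it runs anywhere. Everything in it is an assumption for illustration: `embed` is a hashed bag-of-words stand-in for a real embedding model, the eight example emails and the two categories are made up, and the training loop is plain SGD. In practice you would probably swap `embed` for an actual embedding model and use a library implementation such as scikit-learn's `LogisticRegression`.

```python
import hashlib
import math

DIM = 256  # size of the toy embedding vector (assumption)

def embed(text: str) -> list[float]:
    """Stand-in for a real embedding model: hashed bag-of-words, L2-normalised."""
    vec = [0.0] * DIM
    for token in text.lower().split():
        idx = int(hashlib.md5(token.encode("utf-8")).hexdigest(), 16) % DIM
        vec[idx] += 1.0
    norm = math.sqrt(sum(v * v for v in vec)) or 1.0
    return [v / norm for v in vec]

def sigmoid(z: float) -> float:
    z = max(-60.0, min(60.0, z))  # clamp to keep math.exp in range
    return 1.0 / (1.0 + math.exp(-z))

def train_logreg(X, y, lr=0.5, epochs=300):
    """Binary logistic regression fitted with per-example gradient descent."""
    w = [0.0] * len(X[0])
    b = 0.0
    for _ in range(epochs):
        for xi, yi in zip(X, y):
            p = sigmoid(sum(wj * xj for wj, xj in zip(w, xi)) + b)
            err = p - yi  # gradient of the log loss w.r.t. the logit
            w = [wj - lr * err * xj for wj, xj in zip(w, xi)]
            b -= lr * err
    return w, b

def predict_proba(w, b, text: str) -> float:
    """Probability that `text` belongs to class 1 ("work" in this toy setup)."""
    x = embed(text)
    return sigmoid(sum(wj * xj for wj, xj in zip(w, x)) + b)

# Made-up training data: 1 = work, 0 = newsletter/promo.
TRAIN = [
    ("invoice attached for march consulting hours", 1),
    ("please review the quarterly budget spreadsheet", 1),
    ("meeting moved to thursday agenda attached", 1),
    ("contract renewal needs your signature", 1),
    ("huge summer sale fifty percent off sneakers", 0),
    ("weekly newsletter top stories you missed", 0),
    ("unlock exclusive coupon free shipping today", 0),
    ("new arrivals flash deals ending tonight", 0),
]

X = [embed(text) for text, _ in TRAIN]
y = [label for _, label in TRAIN]
w, b = train_logreg(X, y)

print(round(predict_proba(w, b, "budget spreadsheet attached"), 3))
```

The only part that changes when you move to a real model is `embed`; the classifier just sees a fixed-length vector per email, which is why it stays so cheap to train.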

Additionally, with email at least, and depending on the categories you need, you only need about 50-100 labeled examples to reach 95-100% accuracy.

And if you build a simple CLI tool to fetch and label emails, it's pretty easy and fast to get the data.
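A labeling CLI along these lines can be very small. The sketch below is one possible shape, not the tool described above: it assumes the emails have already been fetched and saved as `.eml` files in a folder, and the `LABELS` keys, the CSV layout, and the function names `summarize` and `label_folder` are all hypothetical. It uses only the standard library (`email`, `csv`, `pathlib`).

```python
import csv
import sys
from email import message_from_bytes, policy
from pathlib import Path

# Hypothetical categories; press the key to assign the label.
LABELS = {"w": "work", "n": "newsletter", "s": "skip"}

def summarize(raw: bytes) -> tuple[str, str]:
    """Return (subject, first 500 chars of the plain-text body) for one email."""
    msg = message_from_bytes(raw, policy=policy.default)
    body = msg.get_body(preferencelist=("plain",))
    text = body.get_content() if body else ""
    return str(msg["subject"] or ""), text.strip()[:500]

def label_folder(folder: str, out_csv: str, ask=input) -> int:
    """Show each .eml file in `folder`, ask for a label, append rows to `out_csv`.

    `ask` is injectable so the loop can be driven without a terminal.
    Returns the number of rows written.
    """
    written = 0
    with open(out_csv, "a", newline="", encoding="utf-8") as fh:
        writer = csv.writer(fh)
        for path in sorted(Path(folder).glob("*.eml")):
            subject, snippet = summarize(path.read_bytes())
            print(f"\n{path.name}\nSubject: {subject}\n{snippet}\n")
            key = ask(f"label {sorted(LABELS)}: ").strip().lower()
            label = LABELS.get(key)
            if label is None or label == "skip":
                continue  # unknown key or explicit skip: write nothing
            writer.writerow([path.name, subject, snippet, label])
            written += 1
    return written

if __name__ == "__main__" and len(sys.argv) == 3:
    # Usage (file name is an assumption): python label_emails.py <folder> <labels.csv>
    label_folder(sys.argv[1], sys.argv[2])
```

Appending to the CSV means you can label in short sessions; the resulting subject and snippet columns are what you would feed to the embedding step.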