Comment by valzevul

2 hours ago

Hi, OP here!

TF-IDF was the first thing I tried. It works great for stopwords, but it doesn't handle filler words bleeding across languages well, and the short life-event messages ("he died", etc.) are made of common words, so they get aggressively down-weighted.
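
To illustrate the down-weighting, here is a toy sketch, not the actual analysis code. The messages, the `idf` and `message_score` helpers, and the choice of smoothed IDF averaged per message are all assumptions made for the example.

```python
import math

def idf(term, docs):
    # Smoothed inverse document frequency: ln((1 + N) / (1 + df)) + 1
    df = sum(1 for d in docs if term in d)
    return math.log((1 + len(docs)) / (1 + df)) + 1

def message_score(tokens, docs):
    # Mean IDF of the message's tokens (a hypothetical scoring choice)
    return sum(idf(t, docs) for t in tokens) / len(tokens)

# Made-up messages, tokenized by whitespace
messages = [
    "he died",
    "he said he would call",
    "she said hi",
    "he is late again",
    "pterodactyl sighting today",
]
docs = [set(m.split()) for m in messages]

life_event = message_score("he died".split(), docs)
trivia = message_score("pterodactyl sighting today".split(), docs)
print(life_event < trivia)  # the life-event message ranks lower
```

Because "he" appears in most messages, it drags the score of "he died" below that of a message built from rare but unimportant words.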

I did some asymmetry analysis when looking at directional sentiment and per-person question rates - that's fun indeed!

I also went with Jaccard convergence and endearment categories instead of word clouds, so that I could see how word choices change over time.
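
A rough sketch of what a Jaccard convergence measure could look like, assuming it means the Jaccard similarity between two people's vocabularies computed per time period. The `periods` data and the `jaccard` helper are made up for illustration.

```python
def jaccard(a, b):
    # |A ∩ B| / |A ∪ B|; defined as 0.0 when both sets are empty
    a, b = set(a), set(b)
    union = a | b
    return len(a & b) / len(union) if union else 0.0

# Hypothetical vocabularies for two people, bucketed by year
periods = {
    "2019": ({"hi", "ok", "lol"}, {"hello", "fine", "sure"}),
    "2020": ({"hi", "ok", "lol", "babe"}, {"hi", "fine", "lol"}),
    "2021": ({"hi", "lol", "babe"}, {"hi", "lol", "babe", "ok"}),
}

# One similarity value per period, in chronological order
trend = [jaccard(a, b) for a, b in periods.values()]
print(trend)  # a rising series suggests the vocabularies are converging
```

Unlike a word cloud, this gives one number per period, so the change over time is directly visible.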