Comment by CrypticShift
3 years ago
Ingenious idea. At the very least, this is just about finding people who write like us, the same way we seek those with similar tastes (music...)
How long before large commercial indexers start offering efficient (AI-based?) stylometry to agencies and states?
Wait... do you think the NSA is already doing this?
They would be silly not to (apart from creepy profiling of the entire global population, you also get to potentially identify bots). We all have mannerisms that can easily 'betray us' online. I honestly thought my writing style was more unique, but as it turns out it is somewhat common.
It isn't so much writing style as phrase selection. If you lean on the same phrases (n-grams), you will end up very, very close in a high-dimensional space. Colloquialisms are the biggest tell; you should eschew them.
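To make the "close in a high-dimensional space" point concrete, here is a minimal sketch, not the site's actual method: each text becomes a sparse vector of word-bigram counts, and two texts are compared by cosine similarity. The function names, the tokenizer regex, and the choice of bigrams are all assumptions made for illustration.

```python
import math
import re
from collections import Counter


def ngram_counts(text, n=2):
    """Count word n-grams in lowercased text (hypothetical helper)."""
    words = re.findall(r"[a-z']+", text.lower())
    # zip over n shifted copies of the word list to form the n-grams
    return Counter(zip(*(words[i:] for i in range(n))))


def cosine_similarity(a, b):
    """Cosine of the angle between two sparse n-gram count vectors."""
    dot = sum(count * b[gram] for gram, count in a.items())
    norm_a = math.sqrt(sum(c * c for c in a.values()))
    norm_b = math.sqrt(sum(c * c for c in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # an empty text has no direction to compare
    return dot / (norm_a * norm_b)


# Two comments leaning on the same stock phrases, and one that does not.
a = ngram_counts("to be honest, at the end of the day it is what it is")
b = ngram_counts("at the end of the day, to be honest, I would not bother")
c = ngram_counts("the committee will reconvene after the quarterly audit")

print(cosine_similarity(a, b))  # shared colloquialisms pull these together
print(cosine_similarity(a, c))  # no shared bigrams
```

Every distinct bigram is its own dimension, so a handful of habitual phrases is enough to pull two authors together even when their topics differ.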
> I honestly thought my writing style is more unique
You just showed another possible use case for this kind of tool: "How unique is my writing style?"
Stylometry is an old-hat technique; you can assume that intelligence services around the globe apply it regularly.
(Statistical stylometry is a little newer and more rigorous than manual stylometry, which essentially involved a human being's judgement call around the similarity of documents.)
What about "deep learning" stylometry?
https://scholar.google.com/scholar?hl=en&as_sdt=0%2C5&q=deep...
Yields some results
This one seems pretty interesting
https://www.mdpi.com/2227-7390/10/5/838
I don't know, but it wouldn't surprise me if someone has tried to apply ML to stylometry. Statistical stylometry is already pretty effective, as demonstrated by this site.