Comment by operator-name
3 years ago
I wouldn't worry about that too much: someone has already done something similar for Reddit (https://towardsdatascience.com/using-nlp-to-identify-reddito...) and has released their code publicly (https://github.com/jabraunlin/reddit-user-id).
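Given the technique used, I don't see why something simple and local wouldn't defeat it. The "easiest" approach would be to use the identifier's own weighting as a negative metric when rewriting: penalize whatever features it weights most heavily toward you, and prefer rewrites that score lower.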
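To make the "negative metric" idea concrete, here is a rough sketch, not taken from the linked repo. It assumes the identifier boils down to per-word weights (the WEIGHTS table below is made up for illustration) and that you already have a few candidate rewrites to choose from. The names style_score and pick_least_identifying are hypothetical.

```python
import re

def tokenize(text):
    """Lowercase the text and split it into simple word tokens."""
    return re.findall(r"[a-z']+", text.lower())

def style_score(text, weights):
    """Average per-token weight; higher means more identifying.

    `weights` is an assumed stand-in for whatever feature weighting
    the real identifier learned (word -> contribution toward you).
    """
    tokens = tokenize(text)
    if not tokens:
        return 0.0
    return sum(weights.get(t, 0.0) for t in tokens) / len(tokens)

def pick_least_identifying(candidates, weights):
    """Use the score as a negative metric: keep the lowest-scoring rewrite."""
    return min(candidates, key=lambda c: style_score(c, weights))

# Hypothetical weights for words that give this author away.
WEIGHTS = {"whilst": 2.0, "honestly": 1.5, "basically": 1.0}

candidates = [
    "Honestly, whilst I agree, it is basically fine.",
    "I agree, and it is fine.",
    "Whilst I agree, it is fine.",
]

best = pick_least_identifying(candidates, WEIGHTS)
print(best)
```

Everything here runs locally, which is the point: you only need the weights (or a local copy of the model) and some way to generate alternative phrasings. A real identifier may use features other than single words, so the scoring function would have to mirror whatever it actually measures.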