Comment by adimitrov

13 years ago

Sounds like a job for Sentiment Analysis [1]. Modern systems are pretty good at discerning negative from positive comments.

You could probably find a way to mark negative and positive comments. Whether the resulting algorithm would be fine-grained enough to semi-reliably mark 'middlebrow dismissal,' I really don't know. Actually, as somebody who has worked on that stuff in the past, I don't think it would be very easy.

[1] http://en.wikipedia.org/wiki/Sentiment_analysis

6 comments


pmelendez 13 years ago

Thanks for the link.

Agreed, I don't think it would be an easy task, but I wonder how a "bag of words" approach would perform.

The harder part, as I see it, would be categorizing the comments as middlebrow dismissal or not. It seems like we would spend more time preparing the data than on the algorithm itself.
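For reference, a bag-of-words baseline could look roughly like the sketch below. It is only an illustration under assumptions: the two labels ("dismissal" and "other"), the six training comments, and the `BagOfWordsNB` class are all made up here, and the classifier is a plain multinomial naive Bayes with add-one smoothing, which is one common choice rather than anything proposed in this thread. Real use would need a much larger hand-labeled set, which is exactly the expensive part.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into word tokens (word order is discarded)."""
    return re.findall(r"[a-z']+", text.lower())


class BagOfWordsNB:
    """Hypothetical multinomial naive Bayes over word counts, add-one smoothing."""

    def fit(self, texts, labels):
        self.word_counts = {}              # label -> Counter of word frequencies
        self.doc_counts = Counter(labels)  # label -> number of training comments
        self.vocab = set()
        for text, label in zip(texts, labels):
            tokens = tokenize(text)
            self.word_counts.setdefault(label, Counter()).update(tokens)
            self.vocab.update(tokens)
        return self

    def predict(self, text):
        tokens = tokenize(text)
        total_docs = sum(self.doc_counts.values())
        best_label, best_score = None, -math.inf
        for label, counts in self.word_counts.items():
            # log prior + sum of smoothed log likelihoods for each token
            score = math.log(self.doc_counts[label] / total_docs)
            denom = sum(counts.values()) + len(self.vocab)
            for tok in tokens:
                score += math.log((counts[tok] + 1) / denom)
            if score > best_score:
                best_label, best_score = label, score
        return best_label


# Made-up, hand-labeled examples; preparing real ones is the hard part.
train = [
    ("this is obviously nothing new, just hype", "dismissal"),
    ("meh, this has been done before and it is nothing special", "dismissal"),
    ("sounds like just another overhyped idea", "dismissal"),
    ("thanks for the link, the benchmark details are useful", "other"),
    ("i measured this on our cluster and latency dropped", "other"),
    ("here is a paper with data on short text classification", "other"),
]
model = BagOfWordsNB().fit([t for t, _ in train], [l for _, l in train])
print(model.predict("nothing new here, just hype"))  # prints "dismissal"
```

Because word order is thrown away, a model like this mostly learns which words show up in each class, so it would probably struggle with anything as subtle as tone.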

disgruntledphd2 13 years ago
Welcome to statistics/data science/machine learning.
The hardest part is always getting the data into usable form. It's not as much fun as fitting models, but it's definitely the majority of any role where people pay you to do this kind of stuff.
There's a lot of good research on forums (PM me if you want a bibliography I collected for a previous role), and short texts have become a bigger deal post-Twitter. I completely agree with pg on the somewhat annoying nature of comments such as the GP.
- pmelendez 13 years ago
  
  Hey... Thanks for the offer! How can I contact you? I didn't know we could PM users over here.
  I have worked with ML before, but mostly with images, which are (IMHO) way easier to put into a usable format, depending on the problem.
  

benvanderbeek 13 years ago

Bing Liu is one of the researchers working on this. I've discussed Amazon review fraud detection with him. http://www.cs.uic.edu/~liub/

hntester123 13 years ago

Hadn't heard of it; thanks for the link.