
Comment by pmelendez

13 years ago

> "If there is one thing I wish I could do to improve HN it would be to detect this sort of middlebrow dismissal algorithmically."

Do you have some data that we could use to play with? That sounds like a nice problem to solve.

You already have the data I'd use: the text of the comments.

If anyone wants to try to train a filter to detect this sort of comment, I'd be very interested to see the result.
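For anyone taking this up, here is a minimal sketch of what "train a filter" could mean in practice: a multinomial naive Bayes classifier over comment text, in plain Python. Nothing here comes from HN itself. The class name `NaiveBayesFilter`, the labels "dismissal" and "other", and the four training sentences are all made up for illustration; a real attempt would need a few hundred hand-labeled comments at least.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into word tokens."""
    return re.findall(r"[a-z']+", text.lower())


class NaiveBayesFilter:
    """Multinomial naive Bayes with add-one smoothing (hypothetical helper)."""

    def __init__(self):
        self.word_counts = {}        # label -> Counter of token counts
        self.doc_counts = Counter()  # label -> number of training comments
        self.vocab = set()           # every token seen during training

    def train(self, text, label):
        self.doc_counts[label] += 1
        counts = self.word_counts.setdefault(label, Counter())
        for token in tokenize(text):
            counts[token] += 1
            self.vocab.add(token)

    def log_scores(self, text):
        """Return the log of (prior * likelihood) for each label."""
        total_docs = sum(self.doc_counts.values())
        tokens = tokenize(text)
        scores = {}
        for label, counts in self.word_counts.items():
            score = math.log(self.doc_counts[label] / total_docs)
            denom = sum(counts.values()) + len(self.vocab)
            for token in tokens:
                # Add-one smoothing keeps unseen words from zeroing the score.
                score += math.log((counts[token] + 1) / denom)
            scores[label] = score
        return scores

    def classify(self, text):
        scores = self.log_scores(text)
        return max(scores, key=scores.get)


# Toy, hand-written training data (NOT real HN comments).
nb = NaiveBayesFilter()
nb.train("this is nothing new, people did this years ago", "dismissal")
nb.train("this is just a wrapper, nothing to see here", "dismissal")
nb.train("i tried this on our cluster and latency dropped", "other")
nb.train("here is a benchmark comparing both approaches", "other")

print(nb.classify("nothing new here, just a wrapper"))  # prints "dismissal"
```

A bag-of-words model like this only sees vocabulary, so it would probably catch stock phrases rather than the attitude itself. The hard part is labeling enough comments consistently.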

  • I'll consider that a challenge...

    To anyone with experience pulling comment data from HN: what's the fastest way to do this that is still polite? And am I right in remembering that the site rate-limits crawlers aggressively?

  • I'd bet that upvotes on middlebrow dismissals would be highly correlated with downvotes on the article itself, if we had downvotes on articles. In fact, I believe these comments rise to the top because people are downvoting the article vicariously, by upvoting a rebuttal, however vacuous. To confirm this hypothesis, though, you'd have to collect data by implementing a downvote button on articles -- though it would not necessarily have to do anything in article-ranking terms ;)

    (That might introduce a confounding factor, though: giving people a [nonfunctional] button would relieve their urge to downvote the article, so they might stop upvoting the dismissive comments. Hmm....)

  • >You already have the data I'd use: the text of the comments.

    Wouldn't that require real AI though? I thought for a minute that NLP (Natural Language Processing, not the other meaning(s) of the acronym) might help, but then realized it might not work when a comment quotes another comment. Note: I'm not at all an expert in any of those fields, just interested.

    • Sounds like a job for Sentiment Analysis [1]. Modern systems are pretty good at discerning negative from positive comments.

      You could probably find a way to mark negative and positive comments. Whether the resulting algorithm would be fine-grained enough to semi-reliably mark 'middlebrow dismissal,' I really don't know. Actually, as somebody who has worked on that stuff in the past, I don't think it would be very easy.

      [1] http://en.wikipedia.org/wiki/Sentiment_analysis


  • It might be easier to investigate submissions first and filter at this level if a trend is discovered. That is, a certain type of submission might attract a certain attitude of comment.
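
On the question raised above about fetching comment data politely: the thread does not state what the site's actual limits are, so the sketch below simply enforces a minimum gap between requests. `PoliteFetcher` and `fetch_with_urllib` are hypothetical names, the 30-second default is an assumption rather than a documented limit, and the right value should come from the site's robots.txt or stated policy.

```python
import time
import urllib.request


def fetch_with_urllib(url):
    """Fetch a URL and return its body as text (standard library only)."""
    with urllib.request.urlopen(url, timeout=10) as response:
        return response.read().decode("utf-8", errors="replace")


class PoliteFetcher:
    """Wrap a fetch function so calls are at least `min_interval` seconds apart."""

    def __init__(self, fetch=fetch_with_urllib, min_interval=30.0,
                 clock=time.monotonic, sleep=time.sleep):
        self.fetch = fetch
        self.min_interval = min_interval  # assumed value, not a known HN limit
        self.clock = clock                # injectable so the delay can be tested
        self.sleep = sleep
        self._last_request = None

    def get(self, url):
        if self._last_request is not None:
            elapsed = self.clock() - self._last_request
            wait = self.min_interval - elapsed
            if wait > 0:
                self.sleep(wait)
        # Record the time just before the request goes out.
        self._last_request = self.clock()
        return self.fetch(url)


# Demonstration with a fake clock and a fake fetch, so nothing hits the network.
fake_now = [0.0]
slept = []


def fake_sleep(seconds):
    slept.append(seconds)
    fake_now[0] += seconds


demo = PoliteFetcher(fetch=lambda url: "page:" + url, min_interval=30.0,
                     clock=lambda: fake_now[0], sleep=fake_sleep)
first = demo.get("a")
fake_now[0] += 10.0   # pretend 10 seconds of other work happened
second = demo.get("b")
```

In the demonstration the second call sleeps for the remaining 20 seconds before fetching. Injecting the clock and sleep functions is only there to make the delay checkable without waiting.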