Comment by jacquesm
15 years ago
This is not the 'hacker news' ranking algorithm, this is the ranking algorithm distributed with 'ARC', which is the basis for the HN algorithm, but definitely not equal to it.
The biggest missing ingredients are flagged posts dropping off quicker and posts that contain no URL dropping off quicker but there are quite a few other subtle tweaks.
The (very good) reason why the ARC sources do not give out the real ranking algorithm is to make it a bit harder to game the system.
He glossed over it, but this code does include URL-free posts dropping off faster. See the stuff dealing with "nourl-factor*". I don't know arc at all, but it appears that having no URL multiplies your final score by a factor of .4, meaning that it's ranked 2.5x lower than it otherwise would be. That surprises me; I've noticed that Ask HN posts get rated lower, but it doesn't seem that extreme.
So is Hacker News a fork of news.arc, rather than straight news.arc? I figured it was, but never heard that officially (since people refer to news.arc as the HN "source code").
Edit: Also, the "lightweight" thing is interesting. There's something in place that sees if the post is a "rallying cry" or is mostly made of images. Additionally, if you link directly to an image file, or to some list of domains that have been deemed lightweight, that'll get marked as lightweight as well. Lightweight posts have a .3 factor, meaning that they're even more deflated than URL-free posts.
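A rough Python sketch of how these two multipliers would act on a score. Only the .4 and .3 factors come from the comments above. The base formula (points minus one raised to 0.8, divided by age in hours plus two raised to 1.8) is the commonly quoted form and is assumed here, and all function and parameter names are made up for illustration. It ignores flags and anything else the live site does.

```python
NOURL_FACTOR = 0.4        # from the thread: posts without a URL
LIGHTWEIGHT_FACTOR = 0.3  # from the thread: "lightweight" posts


def base_rank(points, age_hours, gravity=1.8):
    """Assumed base formula: (points - 1)^0.8 / (age_hours + 2)^gravity."""
    base = points - 1
    if base > 0:
        base = base ** 0.8
    return base / (age_hours + 2) ** gravity


def rank(points, age_hours, has_url=True, lightweight=False):
    """Apply the post-type penalty on top of the assumed base rank."""
    score = base_rank(points, age_hours)
    if not has_url:
        return score * NOURL_FACTOR
    if lightweight:
        return score * LIGHTWEIGHT_FACTOR
    return score


if __name__ == "__main__":
    link = rank(50, 3)
    ask = rank(50, 3, has_url=False)
    light = rank(50, 3, lightweight=True)
    print(link / ask)    # how much lower a URL-free post scores
    print(link / light)  # how much lower a lightweight post scores
```

Whatever the base formula really is, a constant multiplier cancels out of the ratio: a .4 factor means a score 1/.4 = 2.5 times lower, and a .3 factor means about 3.3 times lower.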
Ah yes, you're right, the 'nourl' is there, it's in the arc bit, but I couldn't find that in the graphs or in the python code.
The current HN algorithm probably follows the same basic idea, but the implementation is more complex because of spammers.
I did a quick test of what placements would look like with the vanilla HN algorithm using the current frontpage: http://paste.plurk.com/show/316811/
The code to generate the new sorting: http://paste.plurk.com/show/316812/
As you can see the rankings are not the same, but very similar...
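For anyone who can't open the pastes, a comparison along these lines can be sketched as below. This is not the pasted code. It assumes the commonly quoted vanilla formula, the names `vanilla_rank` and `compare` are hypothetical, and the stories are made-up placeholders rather than real frontpage data.

```python
def vanilla_rank(points, age_hours, gravity=1.8):
    """Assumed vanilla formula: (points - 1)^0.8 / (age_hours + 2)^gravity."""
    base = points - 1
    if base > 0:
        base = base ** 0.8
    return base / (age_hours + 2) ** gravity


def compare(frontpage):
    """frontpage: list of (title, points, age_hours) in observed order.

    Returns (title, observed_position, computed_position) tuples,
    with positions starting at 1. Titles are assumed to be unique.
    """
    resorted = sorted(frontpage,
                      key=lambda s: vanilla_rank(s[1], s[2]),
                      reverse=True)
    computed = {story[0]: i for i, story in enumerate(resorted, start=1)}
    return [(title, i, computed[title])
            for i, (title, _, _) in enumerate(frontpage, start=1)]


if __name__ == "__main__":
    # Made-up stories, not real frontpage data.
    observed = [
        ("story A", 120, 5.0),
        ("story B", 40, 1.0),
        ("story C", 200, 12.0),
    ]
    for title, seen, calc in compare(observed):
        print(f"{title}: observed #{seen}, computed #{calc}")
```

Stories whose observed position is well below the computed one are the candidates for some extra penalty (flags, no URL, and so on).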
Flags are the main factor that you've missed. I'm not quite sure how to measure them reliably, but from what I've seen in terms of 'jumps' downward on a heavily flagged post, it looked like 6 flags will take you off the homepage onto page two when you have about 20 upvotes. Maybe that will allow you to model it.
I have seen two will take you off when you are under 9 upvotes.
How did you know how many flags it had?