Comment by vhantz

2 days ago

> Note that this skews the distribution heavily in favor of topics that are less common, but it should get the job done. Suggestions for improvements are welcome.

You could weight each topic by the max number of papers in it, which would make the distribution uniform.

That's a great point. I thought about something similar, but I also realized the arXiv numbers are growing like crazy, so I wonder how long it'll take for the (hardcoded) numbers to go out of date. One could of course add some kind of cron job to update the numbers, but that sounds like a lot of work...
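
For what it's worth, here is a rough sketch of what the weighting idea could look like. The topic names and counts in `PAPER_COUNTS` are made up for illustration (they are not real arXiv numbers), and `pick_topic` is a hypothetical helper, not something from the actual script. The idea is just to pick a topic with probability proportional to its paper count, so that picking a random paper within the chosen topic afterwards ends up roughly uniform over all papers.

```python
import random

# Hypothetical, hardcoded per-topic paper counts (NOT real arXiv numbers).
# These would go stale over time, as discussed above.
PAPER_COUNTS = {"cs.LG": 1000, "math.AG": 100, "q-bio.NC": 10}


def pick_topic(counts, rng=random):
    """Pick a topic with probability proportional to its paper count."""
    topics = list(counts)
    weights = [counts[t] for t in topics]
    # random.choices draws with replacement using the given relative weights.
    return rng.choices(topics, weights=weights, k=1)[0]


if __name__ == "__main__":
    print(pick_topic(PAPER_COUNTS))
```

With these made-up counts, the largest topic gets picked most of the time, which offsets the bias towards small topics that you get from choosing a topic uniformly first.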