Comment by costco

4 days ago

What do you think the probability is that someone else has read the same 15 books you have? It’s very unlikely unless they are all staples of a genre, part of the same series, or just extremely popular in general. I would enter 3-5 books on that page. I have found interesting accounts of medievalists, people who work at think tanks, etc. with it.

I agree that fake users should be filtered, but I don’t think filtering out users who gave a book a bad review is necessarily the intended behavior. If I put in 3 semi-obscure Russian history books, I am presumably looking for someone who is an expert in Russian history to see what else they read. In that case I don’t care whether or not they liked every one of the books. Approximate matches would require something like LSH, or cosine similarity between the average embedding of the input books and the average embedding of each user's read books, which I think wouldn’t work well for retrieving anyone with a moderately long interaction history.
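To make the averaging concern concrete, here is a rough sketch comparing exact matching with the mean-embedding approach. Everything in it is an assumption for illustration: the 3-d vectors, the book ids, the user names, and the helper functions are made up, and this is not how the site actually works.

```python
import math

# Toy 3-d "embeddings" (axes roughly: history, sci-fi, cooking).
# All ids, users and numbers are hypothetical, for illustration only.
BOOK_VECS = {
    "rus_hist_a": [1.0, 0.0, 0.0],
    "rus_hist_b": [0.9, 0.1, 0.0],
    "rus_hist_c": [0.8, 0.2, 0.0],
    "rus_hist_d": [0.9, 0.0, 0.1],
    "scifi_a": [0.0, 1.0, 0.0],
    "scifi_b": [0.0, 0.9, 0.1],
    "cookbook_a": [0.0, 0.0, 1.0],
    "cookbook_b": [0.0, 0.1, 0.9],
}

USERS = {
    # Read all three query books, plus plenty of unrelated ones.
    "specialist_long_history": [
        "rus_hist_a", "rus_hist_b", "rus_hist_c",
        "scifi_a", "scifi_b", "cookbook_a", "cookbook_b",
    ],
    # Read none of the query books, only one nearby title.
    "casual_short_history": ["rus_hist_d"],
}


def mean_vec(books):
    """Component-wise average of the embeddings of the given books."""
    vecs = [BOOK_VECS[b] for b in books]
    return [sum(col) / len(vecs) for col in zip(*vecs)]


def cosine(u, v):
    """Cosine similarity between two non-zero vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def exact_matches(query):
    """Users whose shelf contains every query book, ratings ignored."""
    return [name for name, read in USERS.items() if set(query) <= set(read)]


def rank_by_mean_embedding(query):
    """Users sorted by cosine(mean query vector, mean shelf vector)."""
    q = mean_vec(query)
    scores = [(name, cosine(q, mean_vec(read))) for name, read in USERS.items()]
    return sorted(scores, key=lambda s: s[1], reverse=True)


QUERY = ["rus_hist_a", "rus_hist_b", "rus_hist_c"]
exact = exact_matches(QUERY)
ranking = rank_by_mean_embedding(QUERY)
print("exact:", exact)
print("mean-embedding ranking:", ranking)
```

In this toy data the reader who shelved all three query books ranks below someone who read a single nearby title, because the rest of a long shelf pulls the average vector away from the query. Exact matching still finds the first reader.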

I wanted to find users who loved the same kinds of classical novels. The core of my list was the famous works of famous classical writers like Dostoievsky, Tolstoi, Huxley and Borges. I added a few excellent authors, still famous but to a lesser degree, like Italo Calvino or Marguerite Yourcenar. I know there are many readers of the whole list I wrote; I could name a few among my friends and family.

So I think the problem was not whether similar readers exist, but how to reach them. Few people who read classical books log in to Goodreads (I don't), and even fewer enter what they've read over the past decades.