Comment by lich_king

3 hours ago

It's easy to hand-curate a list of 5,000 "small web" URLs. The problem is scaling. For example, Kagi has a hand-curated "small web" filter, but I never use it because far more interesting and relevant "small web" websites are outside the filter than in it. The same is true for most other lists curated by individual folks. They're neat, but also sort of useless because they are too small: 95% of the things you're looking for are not there.

The question is how do you take it to a million? There probably are at least that many good personal and non-commercial websites out there, but if you open it up, you invite spam & slop.

> The question is how do you take it to a million?

Do you need to take it to a million in the same place? Is that still "small"?

Why not have 2000 hand curated directories instead?

I mainly use Kagi Small Web as a starting point for my day, with my morning coffee. Especially now that categories have been added, I always find something worth reading. The size is not a problem here, as I usually browse 20-30 sites this way.

  • Right, but that basically works as a retro alternative to scrolling through social media. If you're looking for something specific, it's simultaneously true that there's a small web page that answers your question and that it's not on any "small web" list, either because the owner of the webpage never submitted it or because it didn't meet the criteria for inclusion.

    For example, I have several non-commercial, personal websites that I think anyone would agree are "small web", but each of them fails the Kagi inclusion criteria for a different reason. One is not a blog, another is a blog but with the wrong cadence of posts, etc.

My approach operates under the assumption that good, non-commercial webpages will be similar to other good webpages. Slop, SEO spam, and affiliate content will resemble other such content.

So a similarity-based graph/network of webpages should cluster good with good, bad with bad. That is what I've seen so far, anyway.

With that, you just need to enter the graph in the right place, something that is fairly trivial.
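To make the idea concrete, here is a rough sketch of what I mean, not the actual implementation. Everything in it is an assumption for illustration: the example URLs and page texts are made up, similarity is plain bag-of-words cosine similarity, and the 0.3 threshold is arbitrary. A real system would need a better similarity measure and could not compare every pair of pages at scale.

```python
import math
import re
from collections import Counter, deque


def vectorize(text):
    """Turn page text into lowercased bag-of-words term counts."""
    return Counter(re.findall(r"[a-z']+", text.lower()))


def cosine(a, b):
    """Cosine similarity between two term-count vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def build_graph(pages, threshold=0.3):
    """Link every pair of pages whose similarity meets the threshold.

    The threshold is an arbitrary illustrative value, and the all-pairs
    loop is O(n^2), so this only works for toy inputs.
    """
    vectors = {url: vectorize(text) for url, text in pages.items()}
    urls = list(pages)
    graph = {url: set() for url in urls}
    for i, u in enumerate(urls):
        for v in urls[i + 1:]:
            if cosine(vectors[u], vectors[v]) >= threshold:
                graph[u].add(v)
                graph[v].add(u)
    return graph


def explore(graph, seed):
    """Breadth-first walk: every page reachable from a known-good seed."""
    seen = {seed}
    queue = deque([seed])
    while queue:
        node = queue.popleft()
        for neighbor in graph[node]:
            if neighbor not in seen:
                seen.add(neighbor)
                queue.append(neighbor)
    return seen


# Hypothetical pages: two hobbyist write-ups and two affiliate-style pages.
pages = {
    "radio.example": "notes on restoring a vintage radio with hand soldering and patience",
    "synth.example": "restoring a vintage synthesizer: soldering notes and schematics",
    "deals.example": "best deals buy now top ten cheap vpn coupon discount",
    "vpn.example": "top ten best cheap vpn deals buy now with coupon",
}

graph = build_graph(pages)
reachable = explore(graph, "radio.example")
print(sorted(reachable))
```

Starting from the hand-picked seed "radio.example", the walk only reaches the other hobbyist page; the two affiliate-style pages link to each other but not into that cluster. That is the "enter the graph in the right place" part: the seed list can stay small and hand-curated while the graph does the expanding.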