Comment by chrismorgan

11 hours ago

There are many patches of almost-identical sites.

Some of them are due to many people using the same theme.

Some of them are expired or parked domains, which I reckon should be detected and excluded.

Yeah, those clusters are interesting. They stand out, so they were the first thing I zoomed in on; then I realized they're all just stock resume sites. You quickly realize the clusters are something to avoid. Turns out to be an effective visualization method.

  • The thing I find interesting is where the grouping is robust to colour variations: one of the bigger groups is around 25% from left, 20% from bottom, all one theme but in a wide variety of colours.

>Some of them are due to many people using the same theme.

Teeming masses of sites using what probably seems to the authors like a fresh, unconventional look but ends up being Yet Another.

  • I doubt anyone selecting a popular theme is confused by the fact that it’s popular. I use the default MediaWiki theme for mine, for instance.