Comment by scottyah

2 months ago

Nobody has come up with a scalable metric for determining quality that can't be appropriated by SEO. PageRank was one of the best for a while (the number of sites that link to your site, weighted by their own rank). Whether it's clicks, time on page, the percentage of visitors who click onto a page and then end the session, etc., it all gets gamed.
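
For context, the heart of PageRank is a recursive definition: a page's rank is a damped sum of the ranks of the pages linking to it, split across each linker's outgoing links. A minimal power-iteration sketch in Python (NumPy assumed; the adjacency-list input format is just for illustration):

    import numpy as np

    def pagerank(adjacency, damping=0.85, iterations=50):
        # adjacency[i] is the list of pages that page i links to.
        # A page's rank is a damped sum of the ranks of the pages
        # linking to it, each divided by that linker's outdegree.
        n = len(adjacency)
        M = np.zeros((n, n))          # column-stochastic link matrix
        for i, links in enumerate(adjacency):
            if links:
                for j in links:
                    M[j, i] = 1.0 / len(links)
            else:
                M[:, i] = 1.0 / n     # dangling page: treat as linking everywhere
        rank = np.full(n, 1.0 / n)    # start uniform
        for _ in range(iterations):   # power iteration
            rank = (1 - damping) / n + damping * M @ rank
        return rank

    # Pages 0 and 1 link to each other; page 2 links only to page 0.
    print(pagerank([[1], [0], [0]]))  # page 0 ranks highest

On the three-page example, page 0 ends up ranked highest because both other pages link to it. That circularity - rank flows from rank - is what made it hard to game, until link farms started manufacturing inbound links at scale.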

Like it or not, it's what people want. The "trashy" movies, books, music, etc. all sell like hotcakes, so why do most people on HN think the internet should be any different?

> Nobody has come up with a scalable metric for determining quality that can't be appropriated by SEO

Nor will it ever happen, at least as long as search is a Google monoculture. Effectively one player in the search space means that everyone sets their sights on the same target. That naturally turns SEO into an extremely one-sided arms race, and not in Google's favor - "good content" is hard to quantify, and whatever proxy (or combination thereof) Google uses to estimate quality will be ruthlessly uncovered, reverse engineered, and exploited to its last inch by an unyielding, unending horde of "SEO optimizers".

The only way Google maintains search quality is by properly accounting for the fact that websites will take the path of least resistance, i.e. put in the least effort needed to optimize only the specific things Google measures. That means the heuristics and algorithms Google uses to determine search rankings must always be a moving target, with criteria that are vigilantly updated and constantly curated by people. Any attempt to fully automate that curation will produce the cyber equivalent of natural selection - SEO-optimizing websites adapting by automating the production of content that hits just the right buttons while remaining royally useless to actual visitors.

PageRank worked as long as no one knew what the metric was and the old, hyperlinked web still existed.

I think today we can use LLMs to decide which websites are shit. The wheel is turning: SEO artists will have to provide actually useful, non-spammy content. If Google doesn't do it, some uBlock-like service will implement it on the user's side. Or we'll just use ChatGPT with search and never see the cesspool at all. You can edit its system prompt to steer it away from shit in search results.
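
To make that concrete, here is roughly what a client-side filter of that kind could look like - a minimal sketch assuming the OpenAI Python SDK, with the model name, prompt, and helper function all illustrative rather than any real service's API:

    # Assumed: the OpenAI Python SDK and an OPENAI_API_KEY in the
    # environment. Model name, prompt, and function name are all
    # illustrative.
    from openai import OpenAI

    client = OpenAI()

    def looks_like_seo_spam(page_text: str) -> bool:
        # Ask the model for a one-word verdict on the page.
        response = client.chat.completions.create(
            model="gpt-4o-mini",  # assumed model choice
            messages=[
                {"role": "system",
                 "content": "Answer SPAM if this web page is keyword-stuffed "
                            "SEO filler with no real substance, otherwise "
                            "answer OK. Answer with exactly one word."},
                {"role": "user", "content": page_text[:4000]},  # cap cost
            ],
        )
        return response.choices[0].message.content.strip().upper() == "SPAM"

A real service would need caching and batching to keep latency and cost sane, and, as the reply below points out, the judge model itself would just become the next SEO target.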

  • Your comment assumes that current LLMs can scale to replace Google, which seems unlikely from both a business and a compute perspective.

    And if they do, you'll get maybe a decade out of them before they succumb to the same problems Google has.