Comment by sanderjd

2 years ago

I think this is a good illustration of my frustration with this discussion: I don't think search has gotten bad, I think the web has gotten bad. It's weird to even conceptualize it as a big graph of useful hypertext documents. That's just Wikipedia. The broader web is a much noisier and more dubious thing now.

That's bad for Google, though! Their model is very much predicated on the web having a lot of signal that they can find within the noise. But if it just ... doesn't actually have much signal, then what?

The web has gotten bad because of what big search engines have encouraged. If they stopped incentivizing publishing complete garbage (by ruthlessly delisting low-quality sites regardless of their ad quantity, etc.), then maybe we'd see a resurgence of good content.

  • I don't think so. I think it's the inevitable outcome of giving all of humanity the ability to broadcast without curation.

    Or maybe we're saying essentially the same thing, but you think search engines should be doing that curation. But that was never my conception of what search engines are for.

    • I think we are indeed saying the same thing. However, I would like search engines to do some curation -- specifically, to remove results that deliver malware, are clones of other sites, or are just entirely content-free (e.g. Microsoft's forums).

      I'll give Google credit: I haven't seen gitmemory or SO clones in a while. It took a few years but they seem to have dealt with them.

    • I disagree; the bad sites people are talking about are spam, not bad personal takes. They are written by people being paid to churn out content. This is now being done with AI. This is a result of search engines listing them.

  • The web is bad because it is both popular and commercial. Every now and then I fantasize that just finding a sufficiently user-hostile corner would suffice to recreate the early internet experience of an online world nearly exclusively populated by anticommercial geeks.

But there's still plenty of signal. It isn't as if there are no working YouTube downloaders, or factually correct explanations of how transistors work. It's just that search engines don't know how to (or don't care enough to) disambiguate these good results from the mountains of spam and malware.

  • I think that both of you are correct. The internet has much more "noise" than in the past (partially due to websites gaming SEO to show up higher in Google's search results). As a result, Google's algorithm returns more "noise" per query now than it used to. It is a less effective filter through the noise.

    Imagine Google were like a water filter you install on your kitchen faucet to filter unwanted chemicals out of your drinking water. If, as the years progress, your municipal tap water starts to contain a higher baseline of unwanted chemicals, and as a result the filter begins to let through more chemicals than it did before, you'd consider your filter pretty cruddy for its use case. At the bare minimum you'd call it outdated. That is what is happening to Google search.

On the one hand, I'm not sure the data corroborates that. If this is a web problem and not a search engine problem, then I'd expect every search engine to have the same pattern of scam results.

I'd also argue that finding relevant results among a sea of irrelevant results is the primary function of a search engine. This was as true in 1998 as it is today. In fact, it was Google's "killer feature": unlike AltaVista and the like, it showed you far more relevant results.

  • Relevance is a difficult concept to agree on. In 1998 it was more about X != Y, that is, being shown legitimate pages that just weren't about the correct topic.

    These days the results are apt to be about the correct topic, but optimized for some metric other than what the user wants. For example, getting you to download malware or showing you as many crypto ads as possible.

    I don't expect every search engine to have the same scam results. Scammers target individual search engines with particular methodologies. Google does a lot of work to keep crap off their engine; the issue is that the scammers, in total, do far more.

  • If the web were being polluted by a nefarious search engine provider that excluded the polluted pages from its own algorithm, you wouldn't see the same pattern across search engines.

    Not saying or even suggesting that's happening, but the logic isn't airtight.