Comment by dpflan

2 years ago

Indeed, I would think of the core search product as another example of AI/ML...

The question is whether greater use of AI correlates with the declining quality of search results.

  • I think the real underlying cause is the explosion of garbage that gets crawled. Google initially tried to use AI to find "quality" content in the pile. It feels like they gave up and decided to use the wrong proxies for quality. Proxies like "somehow related to a brand name". Good content that didn't have some big name behind it gets thrown out with the trash.

  • I think the bottom line (profit) inversely correlates with the quality of search results. I've been using phind.com lately and it seems there can be search without junk even in this age.

    Google has lots of people tagging search rankings, which is very similar to RLHF ranking of responses from LLMs. It's interesting that, using LLMs with RLHF, it is possible to de-junk the search results. RLHF is great for this task, as evidenced by its effect on LLMs.

    • Right. It’s less that the declining quality of their search results is due to AI and more that the AI got really good at monetizing, and monetization and quality search results are sometimes in opposition.

    • > I've been using phind.com lately and it seems there can be search without junk even in this age.

      A few reasons partially (if not fully) responsible for it might be:

      - Google is a hot target of SEO, not Phind.

      - If Google stopped indexing certain low-quality sites without a strong justification, there would be lawsuits, or people complaining that "Google hasn't indexed my site" or whatever. How would you authoritatively define "low quality"?

      - Google has to provide search for the whole spectrum of users, across various languages and countries, not just for "tech users".

  • The web has grown by 1000x over the years. The overall signal-to-noise ratio has worsened by around 100x, and SEO has become much more sophisticated and optimized against Google. A large fraction of quality content has been moving toward walled gardens. The goalposts are moving (much) faster than the technology.

    • Yup, and we humans produce as much garbage as we can too. "60 hours of black screen" type videos on YouTube that gotta be stored on CDNs across the globe, Taboola's absolutely vile ads, endless scripted content made by content creators for the short-term shock/wow value.

      The Internet is basically a rubbish dump now imo.

  • I recently Google-searched "80cm to inches" and it gave me the result for "80 meters to inches". I can't figure out how it would make this mistake aside from some poorly conceived LLM usage.

    • I highly doubt that this is related to any LLM use. It would be breathtakingly uneconomical and completely unnecessary. It's not even interesting enough for an experiment.

  • It would be fun to see modern Google run against a snapshot of the old web.

  • Maybe the declining quality of internet content has something to do with the declining quality of search results.

    There's a constant arms race between shitty SEO, walled gardens, low-quality content farms and search engines.

This does highlight the gap between SOTA and business production. Google search is very often a low-quality, even user-hostile experience. If Google has all this fantastic technology, but when the rubber hits the road they have no constructive (business-supporting) use cases for their search interface, we are a ways away from getting something broadly useful.

It will be interesting to see how this percolates through the existing systems.

  • At first I am just saying that search, as PageRank in the early days, is an ML marvel that changed the way people interact with the internet. Figuring out how to monetize and financially survive as a business has certainly changed the direction of its development and usability.

Yes, it is very successful in replacing useful results with links to shopping sites.

  • This is because their searches are so valuable that real intelligence, i.e. humans, has been fighting to defeat Google's AI over billions of dollars of potential revenue.

    We are just seeing remnants of that battleground.