Comment by weird-eye-issue

5 hours ago

Let me let you in on a little industry "secret"

You can't trust those results no matter what

The pages that they pull in to source that data all contain affiliate links, and companies contact websites to get their tools to the tops of those lists by paying money, often monthly. I know this because I do this...

It's basically standard SEO but it also manipulates AI like ChatGPT very very easily

> It's basically standard SEO but it also manipulates AI like ChatGPT very very easily

There are key differences.

1) Google doesn't get paid for the SEO, so even if crime is involved, Google isn't directly responsible.

2) AI ads are unmarked, which is illegal pretty much everywhere. And because of the way LLMs work, it is impossible to tell where a given output came from: neither which part of the prompt/context it draws on, nor whether it comes from the prompt or from training.

  • Why would AI ads be unmarked? Most of the Google AI search results I get show sources. They're just summarizing top results for you; injecting an ad, shown as an ad, into that isn't tremendously different from how Google worked before.

  • > 1) Google doesn't get paid for the SEO, so even if crime is involved, Google isn't directly responsible.

    Google doesn't get paid directly for the SEO but they definitely benefit monetarily. Do a recipe search and ask yourself if these are the results the user would like to see. Google benefits by not penalizing sites which litter themselves with ads. It's not that indirect.

  • I'm just saying that the methods business owners can use for getting good SEO or AI recommendations are basically the same thing. Not sure what point you are trying to make?

The cheat code for that used to be Reddit before they got growth-hacked 10+ years ago.

  • Actually Reddit is receiving more organic traffic than ever before and is more valuable to game than ever

    But yes actually I was doing this about 15 years ago in the men's fashion subreddit for one of my companies lol

The simplest way to do it is by running an affiliate program for your SaaS; shady marketers will do everything to get sales if it's profitable.

  • Eh not really

    They won't get you on any worthwhile list unless it's their own because it's too risky for them and any site they would publish it on would want to use their own affiliate link. Unless of course we are talking about something like Medium or YouTube which does work

    And then of course there are the fraudsters who will bid on branded keywords; we have banned dozens of people for that

This is why local AI is so important

  • It's already being trained on "public" (ethical or otherwise) data. So, it already has ingested that kind of "optimization" during pre-training and training.

    I don't think you can fine-tune your way out of it.

    • People still think these things are smart. That if their word generator eats enough of the Internet, it will somehow give them the real information that's otherwise hidden. Or perhaps a better way to put it: filter the bullshit.

      To filter bullshit it would first have to understand bullshit, and it doesn't. That's why an LLM will tell you the solution to a problem that doesn't work, and argue with you when you correct it.

  • That doesn't solve this particular problem. Your local model was trained on reddit comments written by bots.

  • Local AI will have the bias that existed at the time of its training, which is different from no bias. For stuff that needs to be current, a local LLM would need to search the net regardless.

    • And since "no bias" isn't something that actually exists in reality when it comes to language or even anything near humans, "bias in local model I can introspect" will always be miles ahead of "bias I know is there, but cannot introspect".

  • How do you make sure that the model you run locally is not tainted? Is there even a way to confirm this without providing the complete training set?

    • Fwiw I just run kiwix/zeal locally, which have an old-school search index of all articles in wiki/stackoverflow etc. That seems enough for most of my day-to-day use.

  • It's less compromised, but it's still basing the answer on compromised queries. This is why I pay for independent reviews (e.g. Which) where their incentives are more aligned with yours.

  • Not if the models come from Google. The ads will be implicit in the model. "X is better than Y and Z" would be easy to add to the training set.

  • Local AI models pull in search results just like ChatGPT does ...

    And they are trained on web data just like any other model...