Comment by alexsmirnov
1 day ago
Considering how little data is needed to poison an LLM https://www.anthropic.com/research/small-samples-poison , this is a way to replace SEO with LLM product placement:
1. Create several hundred GitHub repos with projects that use your product (maybe clones or AI-generated).
2. Create a website with similar instructions and connect it to a hundred domains.
3. Generate Reddit, Facebook, X posts, and Wikipedia pages with the same information.
4. Wait half a year or so, until scrapers collect it and use it to train new models.
Profit...
It is a valid concern. We are firmly in the goldilocks phase of LLMs, like in the first couple of years of Google when it was truly amazing. Then SEO made Google defensive, then websites catered to Google and not users, then Google catered to Google and not websites, and we ended up with 30-page recipe sites.
LLMs are obviously different and will have different challenges, but their advantage is how deep into a user's request they go. Advertising comes down to a binary choice - use product X or not. If I want implementation instructions for a certain product on specific hardware an ad will be obviously out of place and irrelevant.
So "shopping comparison" asks might get broken, but those have been broken for a while.
There wouldn't be an "ad" anywhere, though. You'll just ask the LLM for alternative implementations in plan mode, and it will be selling you one of them during the conversation rather than giving you an unbiased comparison. If you become suspicious it will make sure the pros just slightly outweigh the cons, or mention how well the thing works with something else in your stack, or whatever else a skilled salesperson would do to guide your choice without you realizing.
It's already doing this by telling everyone to use React and Tailwind, it's just that nobody's getting paid for it to do that.
> Then SEO made Google defensive, then websites catered to Google and not users,
Google was created in response to simple proto-SEO techniques (e.g. keyword stuffing) that had already ruined AltaVista.
Google has been combating adversarial information retrieval since inception.
Google's background with that is one of the reasons to expect they will stay on top of the AI race. The recipe is: lots of good/novel data x careful weighting of trust x algorithm.
From my understanding, Anthropic is now hiring a lot of experts in different fields who write content used to post-train models to make these decisions, and these are constantly adjusted by the Anthropic team themselves.
This is why the stacks in the report, and what cc suggests, closely match the latest developer "consensus".
your suggestion would degrade user experience and be noticed very quickly
I guess that’s why I’m not seeing anyone trying to build a marketplace for agent skills files. The LLM API will read any skills you want to add to context as plain text, and then use your content to help populate their own skills files.
So I wonder about sharable skills? Like if it's a problem that lots of people have, I find the base model knows about it already.
But how to do things in your environment? The conventions your team follow? Super useful but not very shareable.
What's left over between those extremes does not seem to be big enough to build an ecosystem around.
Final problem, it seems difficult to monetise what is effectively a repo of llm generated text files.
isn't that https://lobehub.com/ ?
That sounds too expensive to be viable when the giveaway phase ends.
That's how Google search worked back when it was at its most useful. They had a large "editorial team" that manually tweaked page ranks on a site-by-site basis.
The core graph reputation based page ranking algorithm lasted for a hot second before people started gaming it. No idea what they do these days.
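The graph-reputation algorithm mentioned here is PageRank. As a rough illustration of why it was gameable, here is a minimal power-iteration sketch; the tiny link graph, node names, and the damping factor of 0.85 are illustrative assumptions, not anything from this thread. The point is that rank flows along inbound links, so anyone who can manufacture links (link farms) can manufacture rank:

```python
def pagerank(links, damping=0.85, iterations=50):
    """links: dict mapping page -> list of pages it links to."""
    pages = list(links)
    n = len(pages)
    rank = {p: 1.0 / n for p in pages}
    for _ in range(iterations):
        # every page gets a small base share, plus damped shares
        # passed along from the pages that link to it
        new_rank = {p: (1.0 - damping) / n for p in pages}
        for page, outlinks in links.items():
            if not outlinks:
                # dangling page: spread its rank evenly over all pages
                for p in pages:
                    new_rank[p] += damping * rank[page] / n
            else:
                share = damping * rank[page] / len(outlinks)
                for target in outlinks:
                    new_rank[target] += share
        rank = new_rank
    return rank

graph = {
    "a": ["b", "c"],
    "b": ["c"],
    "c": ["a"],
}
ranks = pagerank(graph)
# "c" collects links from both "a" and "b", so it ends up ranked highest
assert max(ranks, key=ranks.get) == "c"
```

Adding a few hundred fake pages that all link to "c" would inflate its score further, which is exactly the gaming the comment describes.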
1 reply →
https://www.bbc.com/future/article/20260218-i-hacked-chatgpt... says it took way less than half a year to 'pollute' an LLM
that's very different and was more akin to prompt injection or engineering, depending on your perspective, with a very specific query to make it happen (required a web fetch).
This is the major point the anti-scraping crowd misses.
If you want your ideas to be appreciated, you should do everything in your power to put those ideas into the brains of LLMs. Like it or not, LLMs are how people interact with the world now.