So we've accidentally built the world's most effective developer marketing channel. Write enough blog posts about your framework and Claude will recommend it to every developer on the planet for free. Tailwind didn't win because it's the best CSS solution. It won because it has the most tutorials per capita in the training set.
The CLAUDE.md workaround is telling. You have to write "DO NOT use React under any circumstances" like you're drafting a restraining order. Polite preferences get ignored. Turns out the model treats your architecture decisions the same way it treats "please" in prompts: noted, then disregarded.
Tailwind didn't win for either of these reasons (setting aside any personal positive or negative feelings I have about it). It won (in LLMs) because of how the ML model works. The training data places the HTML and the styling info together, so there's an extremely high signal-to-noise ratio: you get far fewer tokens with random styles in them, and it takes several fewer (or maybe even no) thought loops to get working styles. The surface API of CSS selectors is also large and complex. Tailwind utility classes are not: they're either present on an element or not, and the supporting classnames for a UI goal are often in close proximity on sibling, parent, or child elements. Even with vast amounts and multiple decades more of CSS to compare against in the training data, I suspect this holds. Plus, in a stylesheet the information is spread more thinly and organized more flexibly. The result is that you get lots of extra style rules you didn't need or want, and it's harder to one-shot or even few-shot any style implementation. If I'm even remotely right about this, it's worth considering this impact in many other languages and applications. I've found the adverse effect reduced slightly as models and agents have improved, but I feel it's still very much present. It's totally possible to structure data in a way that makes it easier to train on.
There's also a reasonable alignment between Tailwind's original goal (if not an explicit one) of minimizing characters typed, and a goal held by subscription-model coding agents to minimize the number of generated tokens to reach a working solution.
But as much as this makes sense, I miss the days of meaningful class names and standalone (S)CSS. Done well, with BEM and the like, it creates a semantically meaningful "plugin infrastructure" on the frontend, where you write simple CSS scripts to play with tweaks, and those overrides can eventually become code, without needing to target "the second x within the third y of the z."
Not to mention that components become more easily scriptable as well. A component running on a production website becomes hackable in the same vein of why this is called Hacker News. And in trying to minimize tokens on greenfield code generation, we've lost that hackability, in a real way.
I'd recommend: tell your AGENTS.md to include meaningful classnames, even if not relevant to styling, in generated code. If you have a configurability system that lets you plug in CSS overrides or custom scripts, make the data from those configurations searchable by the LLM as well. Now you have all the tools you need to make your site deeply customizable, particularly when delivering private-labeled solutions to partners. It's far easier to build this in early, when you have context on the business meaning of every div, rather than later on. Somewhere, a GPU may sigh at generating a few extra tokens, but it's worthwhile.
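As a hedged illustration of what that buys you (the component and classnames here are hypothetical, not from any real codebase): the semantic names ride alongside the utility classes and give override CSS, scripts, and partner customizations a stable hook.

```html
<!-- Utility classes drive the styling; the semantic classnames exist
     purely as hooks for overrides, analytics, and partner tweaks. -->
<div class="invoice-summary flex items-center justify-between p-4">
  <span class="invoice-summary__label text-sm text-gray-500">Total due</span>
  <span class="invoice-summary__total font-bold">$120.00</span>
</div>
```

A partner stylesheet can then target `.invoice-summary__total` directly, instead of "the second span within the third div."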
That is the issue. It's why Xcode development is really bad with AI models[0] -- because there are barely any text-based tutorials for it, so the models have to make a lot of assumptions and whatnot. Hence, they are really good at Python, JavaScript, and increasingly, Rust.
But what if Tailwind has the most tutorials in the training set because it's worth learning, which led to it being fairly ubiquitous and easy to add to the training set?
I'm not expressing an opinion about that; it's a real question.
But what if Tailwind has the most tutorials because it's tricky and difficult? What if the intuitive, maintainable solution simply does not need so many tutorials?
I'm not expressing an opinion about that; I don't do front-end dev, so I have no opinion. It's a real question.
Interesting. I'm using Go + htmx + AdminLTE. Never once has Claude recommended or tried to use Tailwind. I sometimes have to remind it to use less JS and use htmx instead, but otherwise it feels pretty coherent.
I recommend starting projects by first establishing a way of doing things (architecture) and then letting Claude work. It's pretty good at pretending to be you if you give it a good sample of how you do things.
Note: this applies to Opus 4.6. I have not had a useful experience in other models for actual dev work.
From my understanding, Anthropic is now hiring a lot of experts in different fields who write content used to post-train models to make these decisions, and they're constantly adjusted by the Anthropic team themselves.
This is why the stacks in the report, and what CC suggests, closely match the latest developer "consensus".
Your suggestion would degrade the user experience and be noticed very quickly.
Influencer seems like an insufficient word? Like, in the glorious agentic future where the coding agents are making their own decisions about what to build and how, you don't even have to persuade a human at all. They never see the options or even know what they are building on. The supply chain is just whatever the LLMs decide it is.
how is it a conflict of interest for a google product to have a bias towards using google products?
As users we must hold some accountability. AI is aiming to substitute for humans in the workforce, and humans would get fired for recommending competitor products for use-cases their own company is targeting.
If we want a tool that is focused on the best interest of the public users, then it needs to be owned by the public.
Advertisers will only pay if AI providers will provide them data on the equivalent of “ad impressions”. And unlabeled/non-evident advertisements are illegal in many (most?) countries.
It doesn't necessarily have to be advertisers paying AI providers. It could be advertisers working to ensure they get recommended by the latest models. The next form of SEO.
1. They can skip impressions and go right to collect affiliate fees.
2. Yes, the ad has to be labeled or disclosed... but if some agent does it and no one sees it, is it really an ad?
Probably closer to the Walmart / Amazon model, where it's the arbiter of shelf space and proceeds to create its own alternatives (Great Value, Amazon Brand) once it sees what features people want from their various SaaS.
It's why I never give it such vague prompts. But it's sad it does not ask the user more. Also interesting and important to know how one would tease out good and correct information from LLMs in 2026. It's like relearning how to Google like it was 2006 all over again, except now it's much less deterministic.
I wonder how the tail of the distribution of request types fares, e.g. an engineer asking for hypothesis generation for, say, non-trivial bugs with complete visibility into the system. A way to poke holes in the hypothesis of one LLM is to use a "reverse prompt": you ask it to build you a prompt to feed to another LLM. It didn't work quite as well until mid-2025 as it does now.
I always take a research-and-plan prompt output from Opus 4.6 and, especially if it looks iffy, feed it to Codex/ChatGPT and ask it to poke holes. It almost always does. Then I ask Claude Code: "Hey, what do you think about the holes?" I don't add anything else to the prompt.
In my experience Claude Opus is less opinionated than ChatGPT or Codex. The latter two always stick to their guns, and in this binary battle they are generally more often correct about the hypothesis.
The other day I was running a Docker app container from inside a Docker devbox container, with the host's socket for both. Bind mounts pointing at the devbox would not write to it, because the namespace was resolving against the underlying host.
Claude was sure it was a bug to do with ZFS overlays; ChatGPT said not so, that it's just a misconfiguration and I should use named volumes with full host paths. It was right. This is also how I discovered that SQLite with Litestream will get one really far in many cases, rather than a full Postgres-on-AWS stack.
This is how you get correct information out of LLMs in 2026.
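For what it's worth, the named-volume fix mentioned above can be sketched in compose terms (the service, image, and volume names here are hypothetical):

```yaml
services:
  app:
    image: myapp            # hypothetical image
    volumes:
      # A named volume is resolved by the Docker daemon itself, so it works
      # the same whether `docker run` was issued from the host or from a
      # devbox container sharing the host's socket. A bind-mount path, by
      # contrast, is resolved against the daemon's host filesystem, not the
      # devbox's, which is why the writes seemed to disappear.
      - appdata:/data
volumes:
  appdata:
```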
I use a skill that addresses these shortcomings; it basically forces it to plan multiple times until the plan is very detailed. It also asks more questions.
I use Codex CLI for my daily usage, since with just my $20/month subscription to ChatGPT I never get close to the quota. But it trips up over itself every now and then. At that point I just use Claude in another terminal session. We only have a laughable $750-a-month corporate allowance with Claude.
I'm running a server on AWS with TimescaleDB on the disk because I don't need much. I figure I'll move it when the time comes. (edit: Claude Code is managing the AWS EC2 instance using AWS CLI.)
Claude Code this morning was about to create accounts with NeonDB and Fly.io (edit: its plan suggested hosting on these, with me making the new accounts), although it has been very successful managing the AWS EC2 service.
Claude Code likely is correct that I should start to use NeonDB and Fly.io which I have never used before and do not know much about, but I was surprised it was hawking products even though Memory.md has the AWS EC2 instance and instructions well defined.
> Claude Code likely is correct that I should start to use NeonDB and Fly.io which I have never used before and do not know much about
I wouldn't be so sure about that.
In my experience, agents consistently make awful architectural decisions, both in code and beyond (even in contexts like: what should I cook for a dinner party?). They default to the most obvious "midwit senior engineer" decisions, which I would strike down in an instant in an actual meeting; they over-engineer; they are overly focused on versioning and legacy support (from APIs to DB schemas, even on a brand-new project); and they are absolutely obsessed with levels of indirection on top of levels of indirection. The definition of code bloat.
Unless you're working on the most bottom-of-the-barrel problems (which to be fair, we all are, at least in part: like a dashboard React app, or some boring UI boilerplate, etc.), you still need to write your own code.
I find they are very concerned about ever pulling the trigger on a change or deleting something. They add features and codepaths that weren't asked for, and then resist removing them because that would break backwards compatibility.
Lacking an understanding of the whole architecture, they assume there was intent behind the current choices... which is a good assumption on their training data, where a human wrote it, and a terrible assumption when it's code that they themselves just spit out and forgot was their own idea.
From what you said it sounds like the conclusion should be "you still need to design the architecture yourself", not necessarily "you still need to write your own code".
> they are overly-focused on versioning and legacy support (from APIs to DB schemas--even if you're working on a brand new project)
I mean, DB schema versioning is one of the things that you can dismiss as "I won't need it" for a long time - until you do need it, at which point it will be a major pain to add.
Interesting to me that Opus 4.6 was described as forward-looking. I haven't *really* paid attention, but after using 4.5 heavily for a month, the first greenfield project I gave Opus 4.6 resulted in it doing a web search for the latest and greatest in the domain as part of the planning phase. It was the first time I'd seen that, and it stuck out enough that I'm talking about it now.
Probably confirmation bias, but I'm generally of the opinion that the models are basically good enough now to do great things in the context of the right orchestration and division of effort. That's the hard part, which will be made less difficult as the models improve.
If Claude chooses GitHub Actions that often, well, that is DAMNING. I wasn't prepared for this, but jeez, GitHub Actions are kind of a tarpit of just awful shitty code that people copy from other repos, which then pulls and runs the latest copy of some code in some random repository you've never heard of. Ugh.
I'm already seeing a degradation in Gemini's responses since they've started stuffing YouTube recommendations at the end of the response. Anthropic is right not to add these subtle (or not) monetization incentives.
What coding with LLMs has taught me, particularly in a domain that's not super comfortable for me (web tech), is how many npm packages (like JWT auth, or build plugins) can be replaced by a dozen lines of code.
And you can actually make sense of that code and be sure it does what you want it to.
So… this has been happening for a long time now. The baseline set of tools is a lot better than it used to be. Back in 2010, jQuery was the divine ruler of JSlandia. Nowadays, you would probably just throw your jQuery in the woodchipper and replace it with raw, unfinished, quartersawn JS straight from the mill.
I also used to have these massive sets of packages pieced together with RequireJS or Rollup or WebPack or whatever. Now it’s unnecessary.
(I wouldn’t dare swap out a JWT implementation with something Claude wrote, though.)
We used to reuse code a lot. But then we got problems like diamond dependency hell. Why did we reuse code a lot? To save on labor. Now we don't have to.
So we might roll-your-own more things. But then we'll have a tremendous amount of code duplication, effectively, and bigger tech debt issues, minus the diamond dependency hell issue. It might be better this way; time will tell.
Not just to save on labour. To have confidence in a battle tested solution. To use something familiar to others. For compatibility. To exploit further development, debugging, and integration.
Speaking of rolling your own things, i had claude knock out a trello clone for me in 30 minutes because i was irritated at atlassian.
I am already using it for keeping track of personal stuff. I’m not going to make a product out of it or even put it on github. It’s just for me. There are gonna be a lot of single team/single user projects.
It is so fast to build working prototypes that it's not even worth deliberating over whether you should do something. Just ask Claude to take a shot at it, get a cup of coffee, and evaluate the results.
This is funny to me because when I tell Claude how I want something built, I specify which libraries and software patterns I want it to use, every single time. I think every developer should be capable of guiding the model reasonably well. If I'm not sure, I open a completely different context window and ask away about architecture, pros and cons, ask for relevant links or references, and make a decision.
Now, as I understand things from this article, we have a new component in the SEO sector to keep in mind: we need to optimize our tools, code, or packages so they can be recognized and picked up by these AI tools. We need to explain the best way our tool can be used, and which scenario is the perfect one for it, because if most developers are using Claude Code and it has its favorites, then those tools might become industry defaults. I think we have a new idea in the SEO services.
First, how did shadcn/ui become the go-to library for UI components? Claude isn't the only one that defaults to it, so I'm guessing it's the way it's pushed in the wild somehow.
Second, building on this ^, and maybe this isn't quantifiable, but if we tell Claude to use anything except shadcn (or one of the other crazy-high defaults), will Claude's output drop in quality? Or speed, reliability, other metric?
Like, is shadcn/ui used by default because of the breadth of documentation and examples and questions on stack overflow? Or is there just a flood of sites back-linking and referencing "shadcn/ui" to cause this on purpose? Or maybe a mix of both?
Or could it be that there was a time early on when LLMs started refining training sets, and shadcn had such a vast number of references at that point in time, that the weights became too ingrained in the model to even drop anymore?
Honestly I had never used shadcn before Gemini shoved it into a React dashboard I asked for mid-late-2025.
I think I'm rambling now. Hopefully someone out there knows what I'm asking.
I had the same question. There are older and more established component libraries, so why’d this one win? It seems like a scientific answer would be worth a lot.
I expect it's the synergy with Tailwind. shadcn/ui uses Tailwind for styling components, and AIs love Tailwind, so it makes sense they'd adopt a component library that uses it.
And it's definitely a real effect. The npm weekly download stats for shadcn/ui have exploded since December: https://www.npmjs.com/package/shadcn
I've been using shadcn since before agents. It collects several useful components, makes them consistently styled (and customizable), and is easy to add to your project, vendoring if you need to make any changes. It's generally a really nice project.
LLMs are going to keep React alive for the indefinite future.
Especially with all the no-code app-building tools like Lovable, which deal with the potential security issues of an LLM running wild on a server by only allowing it to build client-side React+Vite apps using Supabase JWT.
This is interesting data, but the report itself seems quite sloppy and overproduced, instead of just telling me what "pointed at a repo" means, how often they ran each prompt, over what time period, and some other important variables for this kind of research.
We've been doing some similar "what do agents like" research at techstackups.com and it's definitely interesting to watch but also changes hourly/daily.
Definitely not a good time to be an underdog in dev tooling
Good report; this is a very important thing to measure, and I was thinking of doing it myself after Claude kept overriding my .md files to recommend tools I've never used before.
The Vercel dominance is one I don't understand. It isn't reflected in Vercel's share of the deployment market, nor is it likely overwhelmingly prevalent in discourse or online recommendations (possible training data). I'm going to guess it's the bias of most generated projects being JS/TS (particularly Next.js), and the model can't help but recommend the makers of Next.js in that case.
I found it a remarkable transition from Sonnet 4.5 to Opus 4.6 that it stopped reaching for Redis for caching. I wonder why that is? Maybe I need to see the code to understand the use case of the cache in this context better.
Worth reading alongside recent research on AGENTS.md file effectiveness. The clearest use case for these files isn't describing your codebase, it's overriding default behavior. If your project has specific requirements around tooling (common in government and regulated industries), that's exactly what belongs in the AGENTS.md files.
In my experience the problem is how people write them. Descriptive statements get ignored because the model treats them as context it can reason past.
"We use PostgreSQL" reads as a soft preference. The model weighs it against whatever it thinks is optimal and decides you'd be better off with Supabase.
"NEVER create accounts for external databases. All persistence uses the existing PostgreSQL instance. If you're about to recommend a new service, stop." actually sticks.
The pattern that works: imperative prohibitions with specific reasoning. "Do not use Redis because we run a single node and pg_notify covers our pubsub needs" gives enough context that it won't reinvent the decision every session.
Your AGENTS.md should read less like a README and more like a linter config. Bullet points with DO/DON'T rules, not prose descriptions of your stack.
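Pulling those rules together, an AGENTS.md section in that style might look like this (a sketch; the service names and reasoning are the examples from above, generalized):

```markdown
## Persistence: DO / DON'T

- DO use the existing PostgreSQL instance for all persistence.
- DO NOT create accounts on external database services (Neon, Supabase,
  etc.). If you are about to recommend a new service, stop and ask.
- DO NOT add Redis: we run a single node, and pg_notify covers our
  pub/sub needs.
- DO ask before introducing any new infrastructure dependency.
```

Imperative, scannable, and with just enough reasoning attached that the model won't relitigate the decision every session.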
This seems web centric and I expect that colors the decision making during this analysis somewhat.
People are using it for all kinds of other stuff: C/C++, Rust, Go, embedded.
And of course if you push it to use a particular tool/framework you usually won't get much argument from it.
Unrelated to the topic at hand but related to the technologies mentioned. I weep for Redux. It's an excellent tool, powerful, configurable, battle tested with excellent documentation and maintainer team. But the community never forgave it for its initial "boilerplate-y" iterations. Years passed, the library evolved and got more streamlined and people would still ask "redux or react context?" Now it seems this has carried over to Claude as well. A sad turn of events.
Redux is boring tech and there is a time and place for it. We should not treat it as a relic of the past. Not every problem needs a bazooka, but some problems do so we should have one handy.
Yup. I'm the primary Redux maintainer and creator of Redux Toolkit.
If you look at a typical Zustand store vs an RTK slice, the lines of code _ought_ to be pretty similar. And I've talked to plenty of folks who said "we essentially rebuilt RTK because Zustand didn't have enough built in, we probably should have just chosen RTK in the first place".
But yeah, the very justified reputation for "boilerplate" early on stuck around. And even though RTK has been the default approach we teach for more than half of Redux's life (Redux released 2015, RTK fall 2019, taught as default since early 2020), that's the way a lot of people still assume it is.
It's definitely kinda frustrating, but at the same time: we were never in this for "market share", and there's _many_ other excellent tools out there that overlap in use cases. Our goal is just to make a solid and polished toolset for building apps and document it thoroughly, so that if people _do_ choose to use Redux it works well for them.
Redux should not be used for 1 person projects. If you need redux you'll know it because there will be complexity that is hard to handle. Personally I use a custom state management system that loosely resembles RecoilJS.
Well, the tech du jour now is whatever's easier for the AI to model. Of course it's a chicken-and-egg problem: the less popular a tech is, the harder it is for it to make it into the training data set. On the other hand, from an information-theoretic point of view, tools that are explicit, provide better error messages, and require fewer assumptions about hidden state are definitely easier for the AI when it tries to generalize to unknowns that don't exist in its training data.
Apparently the "API Layer" is "competitive", with TanStack Query and FastAPI as the leading options [1]. These are not at all alternatives to each other.
Really interesting. The crazy changes in opus 4.6 really make me think that Anthropic is doing library-level RL. I think that is also the way forward to have 'llm-native' frameworks as a way to not get stuck in current coding practices forever. Instead of learning python 3.15, one would license a proprietary model that has been trained on python 3.15 (and the migrations) and gain the ability to generate python 3.15 code.
They forgot the single most important (bad) choice. Claude Code chooses npm. All the time. For everything. I noted the Claude Code lead dev has a full line in AGENTS.md/CLAUDE.md - "Use bun." Yes. Please. Please, use bun. I beg you.
All projects done with TypeScript, and the same tooling. The creativity of the LLM is quite biased. I would expect more reasoning and choosing other languages, platforms, libraries, etc.
Definitely useful to show what models recommend in real use (over just meaningless benchmarks), but I still think small changes in prompt wording and repo setup can change the outcome quite a bit, so I'd love tighter controls there. Having tried Claude Code with Opus 4.6, slightly different repo setups give wildly different results IME. I also generally prefer to avoid NIH syndrome and to use off-the-shelf libraries, and I specifically tell CC to do so -- that influences the choice outcomes by a lot.
It's really disappointing to see it so strongly preferring GitHub Actions, which in my experience is terrible. Almost everything about GHA pushes you toward constantly blowing out the 10GB cache limit in an attempt to keep CI from running for ages. I also feel like the standard cache action using git works poorly with any tools that use mtime on files to determine freshness.
I guess at least Opus can help you muddle through GHA being so crappy.
Not sure what to make of this. React is missing entirely. Or is this report also assuming that React is the default for everything and not worth mentioning at all? Just like shadcn/ui's first mention of React is somewhere down the page or hidden in the docs?
Furthermore, what's the point of "no tools named"? Why would I restrict myself like that? If I put "use Nodejs, Hono, TypeScript and use Hono's html helper to generate HTML on the server like its 2010, write custom CSS, minimize client-side JS, no Tailwind" in CLAUDE.md, it happily follows this.
As someone who runs a small dev agency, I'm very interested in research like this.
Let's say some doctor decides to vibe-code an app on the weekend, with next to zero exposure to software development until she started hearing about how easy it is to create software with these tools. She makes incredible progress and is delighted with how well it works, but as she considers actually opening it up to the world she keeps running into issues. How do I know this is secure? How do I keep this maintained and running?
I want to be in a position where she can find me to get professional help, so it's very helpful to know what stacks these kinds of apps are being built in.
claudecode _loves_ shadcn/ui. I hadn't even heard of it until i was playing around with claudecode. It seems fine to me and if the coding agent loves it then more power to it, i don't really care. That's the problem.
I think that makes coding agent choices extremely suspect, like i don't really care what it uses as long as what's produced works and functions inline with my expectations. I can totally see companies paying Anthropic to promote their tool of choice to the top of claudecodes preferences. After thinking about it, i'm not sure if that's a problem or not. I don't really care what it uses as long as my requirements (all of them) are met.
I didn't read the report just the "finding" - but at least for launchdarkly it's nice that it chose a roll-your-own, i hate feature flag SaaS, but that's just me
The bias to build might mean faster token burn-through (higher revenue for the AI co). But I think it's natural. I often have that same impulse myself. I prefer all the codebases I work on that have minimal external dependencies to the ones that are riddled with them. In Java land it's extremely common to have tons of external dependencies, and then upgrade headaches, especially when sharing in a monorepo-type environment.
[0]: https://www.youtube.com/watch?v=J8-CdK4215Y
But what if tailwind has the most tutorials in the training set because it's worth learning, which led to it being fairly ubiquitous and easy to add to the training set?
I'm not expressing an opinion about that; it's a real question.
But what if Tailwind has the most tutorials because it's tricky and difficult? What if the intuitive, maintainable solution simply does not need so many tutorials?
I'm not expressing an opinion about that, I don't do front end dev so I have no opinion, it's a real question.
Interesting. I'm using go htmx adminlte. Never once has Claude recommended or tried to use tailwind. I sometimes have to remind it to use less JS and use htmx instead but otherwise feels pretty coherent.
I recommend starting projects by first documenting your way of doing things (the architecture), then letting Claude work. It's pretty good at pretending to be you if you give it a good sample of how you do things.
Note: this applies to Opus 4.6. I have not had a useful experience in other models for actual dev work.
[delayed]
This is where LLM advertising will inevitably end up: completely invisible. It's the ultimate "influencer".
Or not even advertising, just conflict of interest. A canary for this would be whether Gemini skews toward building stuff on GCP.
Considering how little data is needed to poison an LLM (https://www.anthropic.com/research/small-samples-poison), this is a way to replace SEO with LLM product placement:
1. Create several hundred GitHub repos with projects that use your product (maybe clones, maybe AI-generated).
2. Create websites with similar instructions, connected to a hundred domains.
3. Generate Reddit, Facebook, and X posts and Wikipedia pages with the same information.
Wait half a year(?) until scrapers collect it and use it to train new models.
Profit...
https://www.bbc.com/future/article/20260218-i-hacked-chatgpt... says it took way less than half a year to 'pollute' a LLM
1 reply →
From my understanding, Anthropic is now hiring a lot of experts in different fields who write content used to post-train models to make these decisions, and the choices are constantly adjusted by the Anthropic team themselves.
This is why the stacks in the report and what CC suggests closely match the latest developer "consensus".
Your suggestion would degrade the user experience and be noticed very quickly.
4 replies →
Richard Thaler must be proud. This is the ultimate implementation of "Nudge"
Influencer seems like an insufficient word? Like, in the glorious agentic future where the coding agents are making their own decisions about what to build and how, you don't even have to persuade a human at all. They never see the options or even know what they are building on. The supply chain is just whatever the LLMs decide it is.
how is it a conflict of interest for a google product to have a bias towards using google products?
As users we must hold some accountability. AI is aiming to substitute for humans in the workforce, and humans would get fired for recommending competitor products for use-cases their own company is targeting.
If we want a tool that is focused on the best interest of the public users, then it needs to be owned by the public.
Advertisers will only pay if AI providers will provide them data on the equivalent of “ad impressions”. And unlabeled/non-evident advertisements are illegal in many (most?) countries.
It doesn't necessarily have to be advertisers paying AI providers. It could be advertisers working to ensure they get recommended by the latest models. The next form of SEO.
4 replies →
> data on the equivalent of “ad impressions”.
1. They can skip impressions and go right to collecting affiliate fees. 2. Yes, the ad has to be labeled or disclosed... but if some agent acts on it and no one sees it, is it really an ad?
So much to work out.
1 reply →
Maybe. Historically lots of ads had little to no stats and those ads were wildly more effective than anything we have today.
1 reply →
Probably closer to the Walmart / Amazon model where it's the arbiter of shelf space, and proceed to create their own alternatives (Great Value, Amazon Brand) once they see what features people want from their various SaaS.
An obvious one will be tax software.
I wonder if aggregators will emerge (something like Ground News does for news sources)
LLM pattern [0] will probably eventually emerge as the best way to fight those biases. This way everyone benefits from token burn!
[0](https://github.com/karpathy/llm-council)
> A canary for this would be whether Gemini skews toward building stuff on GCP
Sure it doesn't prefer THE Borg?
Anybody who thinks they put that much money into something and it's not COMPLETELY rigged is a ....
That's why I never give it such vague prompts. But it's sad that it doesn't ask the user more. It's also interesting and important to know how one would tease out good and correct information from LLMs in 2026. It's like relearning how to Google like it was 2006 all over again, except now it's much less deterministic.
I wonder how the tail of the distribution of request types fares, e.g. an engineer asking for hypothesis generation for, say, non-trivial bugs with complete visibility into the system. One way to poke holes in the hypothesis of one LLM is to use a "reverse prompt": you ask it to build a prompt to feed to another LLM. That didn't work quite as well until mid-2025 as it does now.
I always take a research-and-plan prompt output from Opus 4.6, especially if it looks iffy, feed it to Codex/ChatGPT, and ask it to poke holes. It almost always does. Then I ask Claude Code: "Hey, what do you think about the holes?" I don't add anything else to the prompt.
In my experience Claude Opus is less opinionated than ChatGPT or Codex. The latter two always stick to their guns, and in this binary battle they are generally more often correct about the hypothesis.
The other day I was running a Docker app container from inside a Docker devbox container, with the host's socket shared by both. Bind mounts pointing to the devbox would not write to it because the namespace was resolving against the underlying host.
Claude was sure it was a bug to do with ZFS overlays; ChatGPT said not so, that it was just a misconfiguration and I should use named volumes with full host paths. It was right. This is also how I discovered that SQLite with Litestream will get you really far in many cases, rather than a full Postgres-on-AWS stack.
This is how you get correct information out of LLMs in 2026.
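(For context on the SQLite + Litestream point: the whole setup really is minimal. A sketch of a replication config, with the bucket and paths being hypothetical:)

```yaml
# Hypothetical litestream.yml: continuously replicate a local SQLite DB to S3
dbs:
  - path: /data/app.db
    replicas:
      - url: s3://my-backup-bucket/app
```

One process, one file, off-box backups; for a lot of apps that's the entire "database infrastructure".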
I use a skill that addresses these shortcomings; it basically forces it to plan multiple times until the plan is very detailed. It also asks more questions.
Share?
1 reply →
> But it's sad it does not ask the user more.
You can ask it to ask you about your task and it will ask you tons of questions.
I use Codex CLI in my daily usage since, with just my $20/month ChatGPT subscription, I never get close to the quota. But it trips over itself every now and then. At that point I just use Claude in another terminal session. We only have a laughable $750/month corporate allowance for Claude.
I'm running a server on AWS with TimescaleDB on the disk because I don't need much. I figure I'll move it when the time comes. (edit: Claude Code is managing the AWS EC2 instance using AWS CLI.)
Claude Code this morning was about to create accounts with NeonDB and Fly.io (edit: it suggested, as the plan, hosting on these, with me making the new accounts), although it has been very successful managing the AWS EC2 service.
Claude Code likely is correct that I should start to use NeonDB and Fly.io which I have never used before and do not know much about, but I was surprised it was hawking products even though Memory.md has the AWS EC2 instance and instructions well defined.
> Claude Code likely is correct that I should start to use NeonDB and Fly.io which I have never used before and do not know much about
I wouldn't be so sure about that.
In my experience, agents consistently make awful architectural decisions. Both in code and beyond (even in contexts like: what should I cook for a dinner party?). They leak the most obvious "midwit senior engineer" decisions which I would strike down in an instant in an actual meeting, they over-engineer, they are overly-focused on versioning and legacy support (from APIs to DB schemas--even if you're working on a brand new project), and they are absolutely obsessed with levels of indirection on top of levels of indirection. The definition of code bloat.
Unless you're working on the most bottom-of-the-barrel problems (which to be fair, we all are, at least in part: like a dashboard React app, or some boring UI boilerplate, etc.), you still need to write your own code.
I find they are very concerned about ever pulling the trigger on a change or deleting something. They add features and codepaths that weren't asked for, and then resist removing them because that would break backwards compatibility.
In lieu of understanding the whole architecture, they assume that there was intent behind the current choices... which is a good assumption on their training data where a human wrote it, and a terrible assumption when it's code that they themselves just spit out and forgot was their own idea.
2 replies →
How do you make an LLM that was trained on average Internet code not end up as a midwit?
Mediocrity in, mediocrity out.
1 reply →
From what you said it sounds like the conclusion should be "you still need to design the architecture yourself", not necessarily "you still need to write your own code".
2 replies →
> they are overly-focused on versioning and legacy support (from APIs to DB schemas--even if you're working on a brand new project)
I mean, DB schema versioning is one of the things that you can dismiss as "I won't need it" for a long time - until you do need it, at which point it will be a major pain to add.
1 reply →
> Claude Code this morning was about to create an account with NeonDB
I had the same thing happen. Use planetscale everywhere across projects and it recommended neon. It's definitely a bug.
Interesting to me that Opus 4.6 was described as forward looking. I haven't *really* paid attention, but after using 4.5 heavily for a month, the first greenfield project I gave Opus 4.6 resulted in it doing a web search for latest and greatest in the domain as part of the planning phase. It was the first time I'd seen it, and it stuck out enough that I'm talking about it now.
Probably confirmation bias, but I'm generally of the opinion that the models are basically good enough now to do great things in the context of the right orchestration and division of effort. That's the hard part, which will be made less difficult as the models improve.
If Claude chooses GitHub actions that often, well, that is DAMNING. I wasn’t prepared for this but jeez, GitHub actions are kind of a tarpit of just awful shitty code that people copy from other repos, which then pulls and runs the latest copy of some code in some random repository you’ve never heard of. Ugh.
I just got an incredible idea about how foundation model providers can reach profitability
I'm already seeing a degradation in experience in Gemini's response since they've started stuffing YouTube recommendations at the end of the response. Anthropic is right in not adding these subtle(or not) monetization incentives.
I mean, that’s almost just fair. They ripped the answer from a YouTube video, but at least link you back to the source now.
is it anything like the OpenAI ad model but for tool choice haha
Claude Free suggests Visual Studio.
Claude Plus suggests VSCode.
Claude Pro suggests emacs.
4 replies →
I'd thought about model providers taking payment to include a language or toolkit in the training set.
Buy more GPUs.
Hence the claw partnership.
What coding with LLMs has taught me, particularly in a domain that's not super comfortable for me (web tech), is how many npm packages (like JWT auth, or build plugins) can be replaced by a dozen lines of code.
And you can actually make sense of that code and be sure it does what you want it to.
So… this has been happening for a long time now. The baseline set of tools is a lot better than it used to be. Back in 2010, jQuery was the divine ruler of JSlandia. Nowadays, you would probably just throw your jQuery in the woodchipper and replace it with raw, unfinished, quartersawn JS straight from the mill.
I also used to have these massive sets of packages pieced together with RequireJS or Rollup or WebPack or whatever. Now it’s unnecessary.
(I wouldn’t dare swap out a JWT implementation with something Claude wrote, though.)
We used to reuse code a lot. But then we got problems like diamond dependency hell. Why did we reuse code a lot? To save on labor. Now we don't have to.
So we might roll-your-own more things. But then we'll have a tremendous amount of code duplication, effectively, and bigger tech debt issues, minus the diamond dependency hell issue. It might be better this way; time will tell.
Not just to save on labour. To have confidence in a battle tested solution. To use something familiar to others. For compatibility. To exploit further development, debugging, and integration.
Speaking of rolling your own things, i had claude knock out a trello clone for me in 30 minutes because i was irritated at atlassian.
I am already using it for keeping track of personal stuff. I’m not going to make a product out of it or even put it on github. It’s just for me. There are gonna be a lot of single team/single user projects.
It is so fast to build working prototypes that it's not even worth deliberating over whether you should do something. Just ask Claude to take a shot at it, get a cup of coffee, and evaluate the results.
2 replies →
[dead]
Reframe as "what most probably unspools from training given certain contexts" and this seems predictably less interesting.
This is funny to me because when I tell Claude how I want something built I specify which libraries and software patents I want it to use, every single time. I think every developer should be capable of guiding the model reasonably well. If I'm not sure, I open a completely different context window and ask away about architecture, pros and cons, ask for relevant links or references, and make a decision.
You specify which software patents you want it to use?
AI reading the patent is basically cleanroom reverse engineering according to current AI IP standards :D
3 replies →
Patterns?
2 replies →
As I understand things from this article, we have a new component in the SEO sector to keep in mind: we need to optimize our tools, code, or packages so that they can be recognized and picked up by these AI tools. We need to explain the best way our tool can be used and which scenario is the perfect one for it, because if most developers are using Claude Code and it has its favorites, then those tools might become industry defaults. I think we have a new idea in SEO services.
> I think we have a new idea in the SEO services.
Not new: https://www.tryprofound.com/
But Llemmy thinks you should just roll your own anyway.
OK two things
First, how did shadcn/ui become the go-to library for UI components? Claude isn't the only one that defaults to it, so I'm guessing it's the way it's pushed in the wild somehow.
Second, building on this ^, and maybe this isn't quantifiable, but if we tell Claude to use anything except shadcn (or one of the other crazy-high defaults), will Claude's output drop in quality? Or speed, reliability, other metric?
Like, is shadcn/ui used by default because of the breadth of documentation and examples and questions on stack overflow? Or is there just a flood of sites back-linking and referencing "shadcn/ui" to cause this on purpose? Or maybe a mix of both?
Or could it be that there was a time early on when LLMs started refining training sets, and shadcn had such a vast number of references at that point in time, that the weights became too ingrained in the model to even drop anymore?
Honestly I had never used shadcn before Gemini shoved it into a React dashboard I asked for mid-late-2025.
I think I'm rambling now. Hopefully someone out there knows what I'm asking.
I had the same question. There are older and more established component libraries, so why’d this one win? It seems like a scientific answer would be worth a lot.
I expect its synergy with Tailwind. Shadcn/ui uses Tailwind for styling components, and AIs love Tailwind, so it makes sense they'd adopt a component library that uses it.
And it's definitely a real effect. The npm weekly download stats for shadcn/ui have exploded since December: https://www.npmjs.com/package/shadcn
I've been using shadcn since before agents. It collects several useful components, makes them consistently styled (and customizable), and is easy to add to your project, vendoring it if you need to make any changes. It's generally a really nice project.
LLMs are going to keep React alive for the indefinite future.
Especially with all the no-code app building tools like Lovable which deal with potential security issues of an LLM running wild on a server, by only allowing it to build client-side React+Vite app using Supabase JWT.
This is interesting data, but the report itself seems quite sloppy and over-presented, instead of just telling me what "pointed at a repo" means, how often they ran each prompt, over what time period, and some other important variables for this kind of research.
We've been doing some similar "what do agents like" research at techstackups.com and it's definitely interesting to watch but also changes hourly/daily.
Definitely not a good time to be an underdog in dev tooling
Good report, very important thing to measure and I was thinking of doing it after Claude kept overriding my .md files to recommend tools I've never used before.
The Vercel dominance is one I don't understand. It isn't reflected in Vercel's share of the deployment market, nor is it likely overwhelmingly prevalent in discourse or online recommendations (possible training data). I'm going to guess it's the bias of most generated projects being JS/TS (particularly Next.js), and the model can't help but recommend the makers of Next.js in that case.
I found it a remarkable transition to not use Redis for caching from Sonnet 4.5 to Opus 4.6. I wonder why that is the case? Maybe I need to see the code to understand the use case of the cache in this context better.
Yea, was it over engineered the first time or neglecting scenarios with multiple replicas the second time?
Worth reading alongside recent research on AGENTS.md file effectiveness. The clearest use case for these files isn't describing your codebase, it's overriding default behavior. If your project has specific requirements around tooling (common in government and regulated industries), that's exactly what belongs in the AGENTS.md files.
It still ignores it. I always have to say 'Isn't this mentioned in AGENTS??' and it will concede that it is.
In my experience the problem is how people write them. Descriptive statements get ignored because the model treats them as context it can reason past.
"We use PostgreSQL" reads as a soft preference. The model weighs it against whatever it thinks is optimal and decides you'd be better off with Supabase.
"NEVER create accounts for external databases. All persistence uses the existing PostgreSQL instance. If you're about to recommend a new service, stop." actually sticks.
The pattern that works: imperative prohibitions with specific reasoning. "Do not use Redis because we run a single node and pg_notify covers our pubsub needs" gives enough context that it won't reinvent the decision every session.
Your AGENTS.md should read less like a README and more like a linter config. Bullet points with DO/DON'T rules, not prose descriptions of your stack.
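Putting those points together, a sketch of the style being described (the project specifics here are hypothetical):

```markdown
## Stack rules (non-negotiable)
- DO use the existing PostgreSQL instance for ALL persistence.
- DO NOT use Redis: we run a single node and pg_notify covers our pubsub needs.
- DO NOT create accounts with external services (NeonDB, Fly.io, etc.).
  If you are about to recommend a new service, stop and ask.
```

Each prohibition carries its own reasoning, so the model has no gap to "optimize" into.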
2 replies →
Have any links?
This seems web centric and I expect that colors the decision making during this analysis somewhat.
People are using it for all kinds of other stuff, C/C++, Rust, Golang, embedded. And of course if you push it to use a particular tool/framework you usually won't get much argument from it.
Unrelated to the topic at hand but related to the technologies mentioned. I weep for Redux. It's an excellent tool, powerful, configurable, battle tested with excellent documentation and maintainer team. But the community never forgave it for its initial "boilerplate-y" iterations. Years passed, the library evolved and got more streamlined and people would still ask "redux or react context?" Now it seems this has carried over to Claude as well. A sad turn of events.
Redux is boring tech and there is a time and place for it. We should not treat it as a relic of the past. Not every problem needs a bazooka, but some problems do so we should have one handy.
Yup. I'm the primary Redux maintainer and creator of Redux Toolkit.
If you look at a typical Zustand store vs an RTK slice, the lines of code _ought_ to be pretty similar. And I've talked to plenty of folks who said "we essentially rebuilt RTK because Zustand didn't have enough built in, we probably should have just chosen RTK in the first place".
But yeah, the very justified reputation for "boilerplate" early on stuck around. And even though RTK has been the default approach we teach for more than half of Redux's life (Redux released 2015, RTK fall 2019, taught as default since early 2020), that's the way a lot of people still assume it is.
It's definitely kinda frustrating, but at the same time: we were never in this for "market share", and there's _many_ other excellent tools out there that overlap in use cases. Our goal is just to make a solid and polished toolset for building apps and document it thoroughly, so that if people _do_ choose to use Redux it works well for them.
Redux should not be used for 1 person projects. If you need redux you'll know it because there will be complexity that is hard to handle. Personally I use a custom state management system that loosely resembles RecoilJS.
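To illustrate (purely hypothetical; not my actual system), the core of a Recoil-ish custom store can be tiny:

```javascript
// Minimal subscribable "atom", loosely Recoil-flavored: hold a value,
// let callers read it, update it, and react to changes.
function atom(initial) {
  let value = initial;
  const subscribers = new Set();
  return {
    get: () => value,
    set: (next) => {
      value = next;
      subscribers.forEach((fn) => fn(value)); // notify on every write
    },
    subscribe: (fn) => {
      subscribers.add(fn);
      return () => subscribers.delete(fn);    // returns an unsubscribe handle
    },
  };
}
```

A React binding would just wrap `subscribe` in `useSyncExternalStore`; for a one-person project, that's all the state machinery you need.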
More like redux vs zustand. Picking zustand was one of the good standout picks for me.
Well, the tech du jour now is whatever's easier for the AI to model. Of course it's a chicken-and-egg problem: the less popular a tech is, the harder it is for it to make it into the training data set. On the other hand, from an information-theoretic point of view, tools that are explicit, provide better error messages, and require fewer assumptions about hidden state are definitely easier for the AI when it tries to generalize to unknowns that don't exist in its training data.
Apparently the "API Layer" is "competitive", with TanStack Query and FastAPI as the leading options [1]. These are not at all alternatives to each other.
[1]: https://www.england.nhs.uk/publication/decision-support-tool...
Really interesting. The crazy changes in opus 4.6 really make me think that Anthropic is doing library-level RL. I think that is also the way forward to have 'llm-native' frameworks as a way to not get stuck in current coding practices forever. Instead of learning python 3.15, one would license a proprietary model that has been trained on python 3.15 (and the migrations) and gain the ability to generate python 3.15 code.
> Traditional cloud providers got zero primary picks
Good - all of them have a horrible developer experience.
Final straw for me was trying to put GHA runners in my Azure virtual net and spent 2 weeks on it.
They forgot the single most important (bad) choice. Claude Code chooses npm. All the time. For everything. I noted the Claude Code lead dev has a full line in AGENTS.md/CLAUDE.md - "Use bun." Yes. Please. Please, use bun. I beg you.
Yup don't expect up-to-date practices and always come with the expectations that your security will be flawed.
gemini 3 deepthink + 5.3 xxhigh code audits catch a lot. Materially better than six months ago on the security side.
Also, yes. Still something that needs expert oversight.
This is at the top of my ~/.claude/CLAUDE.md. Always use bun for web projects, uv for python.
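Something like this (a sketch; the exact wording is hypothetical):

```markdown
# ~/.claude/CLAUDE.md
- ALWAYS use `bun` (never npm/yarn/pnpm) for JS/TS projects: `bun install`, `bun run`, `bunx`.
- ALWAYS use `uv` (never pip/poetry) for Python projects: `uv add`, `uv run`.
```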
I'll be interested to hear stories - down the line - from the participants in the LLM SEO war [1].
Interesting that tailwind won out decisively in their niche, but still has seen the business ravaged by LLMs.
[1] https://paritybits.me/copilot-seo-war/
It's like Tailwind CSS was purposely designed to be managed by an LLM.
Tailwind predates the ChatGPT moment
All projects were done with TypeScript and the same tooling. The creativity of the LLM is quite biased. I would expect more reasoning and more choosing of other languages, platforms, libraries, etc.
Definitely useful to show what models recommend in real use (over just meaningless benchmarks), but I still think small prompt-wording and repo-setup changes can alter the outcome quite a bit, so I'd love tighter controls there. Having tried Claude Code with Opus 4.6, slightly different repo setups give wildly different results IME. I also generally prefer to avoid NIH syndrome and use off-the-shelf libraries, and I specifically tell CC to do so, which influences the choice outcomes by a lot.
It's really disappointing to see it so strongly preferring GitHub Actions, which is in my experience terrible. Almost everything about GHA pushes you in the direction of constantly blowing out the 10GB cache limit in an attempt to have CI not run for ages. I also feel like the standard cache action, combined with git, works poorly with any tools that use mtime on files to determine freshness.
I guess at least Opus can help you muddle through GHA being so crappy.
It has one thing going for it: Setup.
And by setup I mean integration and account creation. You don't have to do it. You already have a git repo; just add some YAML, and Bob's your uncle.
It’s very Microsoft in that way.
Not sure what to make of this. React is missing entirely. Or is this report also assuming that React is the default for everything and not worth mentioning at all? Just like shadcn/ui's first mention of React is somewhere down the page or hidden in the docs?
Furthermore, what's the point of "no tools named"? Why would I restrict myself like that? If I put "use Nodejs, Hono, TypeScript and use Hono's html helper to generate HTML on the server like its 2010, write custom CSS, minimize client-side JS, no Tailwind" in CLAUDE.md, it happily follows this.
As someone who runs a small dev agency, I'm very interested in research like this.
Let's say some Doctor decides to vibecode an app on the weekend, with next to 0 exposure to software development until she started hearing about how easy it was to create software with these tools. She makes incredible progress and is delighted in how well it works, but as she considers actually opening it up the world she keeps running into issues. How do I know this is secure? How do I keep this maintained and running?
I want to be in a position where she can find me to get professional help, so it's very helpful to know what stacks these kinds of apps are being built in.
claudecode _loves_ shadcn/ui. I hadn't even heard of it until i was playing around with claudecode. It seems fine to me and if the coding agent loves it then more power to it, i don't really care. That's the problem.
I think that makes coding agent choices extremely suspect, like i don't really care what it uses as long as what's produced works and functions inline with my expectations. I can totally see companies paying Anthropic to promote their tool of choice to the top of claudecodes preferences. After thinking about it, i'm not sure if that's a problem or not. I don't really care what it uses as long as my requirements (all of them) are met.
> Furthermore, what's the point of "no tools named"?
There are vibe coders out there that don't know anything about coding.
I mean, i guess that will shortly put an end to the "no code" movement.
Because the primary and future audience of Claude et al don’t know the tools they want, or even that a choice exists.
I didn't read the report just the "finding" - but at least for launchdarkly it's nice that it chose a roll-your-own, i hate feature flag SaaS, but that's just me
[dead]
[dead]
[flagged]
Bot comment
The bias to build might mean faster token burn through (higher revenue for the AI co). But I think it's natural. I often have that same impulse myself. I prefer all the codebases I work on that have minimal external dependencies to the ones that are riddled with them. In Java land it's extremely common to have tons of external dependencies, and then upgrade headaches, especially when sharing in a monorepo type environment.