Comment by rickette
3 days ago
Do any of the LLM providers actually use llms.txt?
If I remember correctly, this "standard" was set up by someone without the involvement of any of the major AI players.
I can definitively say llms.txt is not used by any AI players. I run a blogging platform with around 80k blogs and /llms.txt is not requested by anything (other than humans checking to see if there's an llms.txt path).
All regular pages are aggressively scraped to the extent it's a problem I have to consistently manage, but not llms.txt.
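For anyone who wants to check their own logs the same way, here is a minimal sketch. It assumes access logs in the common "combined" format (nginx/Apache style); the regex and the function name `count_llms_txt_hits` are illustrative, not anything the platform above is known to use.

```python
import re
from collections import Counter

# Assumed combined log format:
# ... "METHOD PATH PROTO" status size "referer" "user-agent"
LOG_LINE = re.compile(
    r'"(?P<method>[A-Z]+) (?P<path>\S+) [^"]*" \d{3} \S+ "[^"]*" "(?P<agent>[^"]*)"'
)

def count_llms_txt_hits(lines):
    """Count requests for /llms.txt, grouped by user agent string."""
    hits = Counter()
    for line in lines:
        match = LOG_LINE.search(line)
        if match is None:
            continue  # skip lines that do not look like combined-format entries
        path = match.group("path").split("?", 1)[0]  # ignore any query string
        if path == "/llms.txt":
            hits[match.group("agent")] += 1
    return hits
```

Feeding it an open log file (`count_llms_txt_hits(open("access.log"))`, path hypothetical) gives a per-agent tally, which makes it easy to tell crawlers apart from people poking at the path by hand.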
Amazing, I didn't know.
So it gets even stranger: I am the only one reading those /llms.txt files...
I'm seeing quite a few requests for these on my work's GitBook documentation site.
But perhaps these are developers specifically targeting these pages to feed whatever LLM they are using.
How is a static blog being scraped a problem? Do you not use a CDN?
> a blogging platform with around 80k blogs
But nah, I'm sure OP doesn't know about CDNs.
Are all blogs static though?
> I can definitively say llms.txt is not used by any AI players.
OP clearly meant that the AI players are not reading and/or honouring llms.txt of other websites when scraping.
No. Sending "Accept: text/markdown" in the request headers and getting markdown back is the more widely agreed-upon approach at this point.[0]
[0] - https://acceptmarkdown.com/
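To make the idea concrete, here is a rough server-side sketch of that content negotiation. It is a simplification under stated assumptions: markdown is served only when the client lists text/markdown explicitly with a non-zero q-value at least as high as text/html, and wildcards like */* fall back to HTML. The helper names `parse_accept` and `choose_content_type` are made up for illustration.

```python
def parse_accept(header):
    """Parse an Accept header into {media_type: q}; q defaults to 1.0."""
    prefs = {}
    for part in header.split(","):
        fields = [f.strip() for f in part.split(";")]
        media_type = fields[0].lower()
        if not media_type:
            continue
        q = 1.0
        for param in fields[1:]:
            name, _, value = param.partition("=")
            if name.strip().lower() == "q":
                try:
                    q = float(value)
                except ValueError:
                    q = 0.0  # treat a malformed q-value as "not acceptable"
        prefs[media_type] = q
    return prefs

def choose_content_type(accept_header):
    """Return "text/markdown" only for clients that explicitly ask for it."""
    prefs = parse_accept(accept_header or "")
    md_q = prefs.get("text/markdown", 0.0)
    html_q = prefs.get("text/html", 0.0)
    if md_q > 0 and md_q >= html_q:
        return "text/markdown"
    return "text/html"  # browsers and wildcard-only clients keep getting HTML
```

A response chosen this way should also carry a `Vary: Accept` header so caches and CDNs do not hand the markdown variant to a browser, or the HTML variant to an agent.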
Now, it would be super cool to get markdown and zero javascript bundles…
If you want to see what that looks like, I one-shot a browser with Claude that does it[0]. Docs pages are early adopters of this[1][2], so that AI agents can better handle tasks.
[0] - https://github.com/solumos/md-browse
[1] - https://docs.stripe.com
[2] - https://vercel.com/docs
I just found out Cloudflare supports real-time HTML-to-Markdown conversion.[0]
[0] - https://blog.cloudflare.com/markdown-for-agents/#convert-htm...
This is interesting. I should start incorporating this -- it couldn't hurt to do both.
Yes, they do.
Anyone who's even slightly clued into how agents access documentation has been making changes to their pages. Example: https://searchtxt-web.fly.dev/search?q=aws