Comment by reddalo

13 hours ago

I wish people would follow this, instead of coming up with new standards in the root namespace. "llms.txt" [1] comes to mind, for example.

Let's stop polluting the root of a domain!

13 comments

reddalo

`llms.txt` is also nonsense, since none of the major AI players has adopted it.

networked 11 hours ago
Google has recently added `llms.txt` to Chrome's Lighthouse check for agentic browsing (https://searchengineland.com/google-llms-txt-chrome-lighthou...), so adoption may be coming. Admittedly, I put more faith in
<link rel="alternate" type="text/markdown" href="https://example.com/foo.md" title="Markdown version of the <Foo> page">
that I copied from Gwern.net. This convention is discoverable (just read the HTML) and naturally adapts to any website size and structure.
I have created an `llms.txt` for my website anyhow. I use a fixed LLM prompt to generate it from the internal links in `index.md`.
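For what "just read the HTML" could look like on the consuming side, here is a minimal sketch using only Python's standard-library `html.parser`. The names `MarkdownAlternateFinder` and `find_markdown_alternates` are hypothetical, made up for illustration; this is not how Gwern.net or any particular crawler does it.

```python
from html.parser import HTMLParser


class MarkdownAlternateFinder(HTMLParser):
    """Collects href values of <link rel="alternate" type="text/markdown"> tags."""

    def __init__(self):
        super().__init__()
        self.hrefs = []

    def handle_starttag(self, tag, attrs):
        if tag != "link":
            return
        # attrs is a list of (name, value) pairs; value may be None.
        attr = {name: (value or "") for name, value in attrs}
        rels = attr.get("rel", "").lower().split()
        is_markdown = attr.get("type", "").lower() == "text/markdown"
        if "alternate" in rels and is_markdown and attr.get("href"):
            self.hrefs.append(attr["href"])


def find_markdown_alternates(html):
    """Return the Markdown alternate URLs advertised by an HTML document."""
    finder = MarkdownAlternateFinder()
    finder.feed(html)
    finder.close()
    return finder.hrefs
```

A client would call `find_markdown_alternates(page_html)` on a page it has already fetched and follow any returned URL. Nothing needs to live at a fixed path in the domain root.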
- iamacyborg 11 hours ago
  
  Providing a Markdown version of a page seems like an interesting choice, as opposed to just embedding schema markup in the page itself
  
dspillett 11 hours ago
The same could be said of robots.txt, and of anything else that might tell them not to access something.
- reddalo 8 hours ago
  
  robots.txt predates the modern web though
  
pfannl 8 hours ago

To be fair, "not adopted by any major AI player" is probably the most web-standard-compliant phase of a new web standard.