
Comment by rapind

3 days ago

I think this is the practical way. The website owner (or rather the builder, since if you're running WordPress, we can assume MCP will be part of the package) is already responsible for the human interface across many devices, as well as the search engine interface (robots.txt, sitemap.xml, meta tags). Having a standard we can use to curate what the AI sees and how it can interact would be hugely beneficial.

There's room for both, IMO: the more generic tool that figures things out on its own, and the streamlined tool that follows a site's guardrails. There's also the backend service, of course, which doesn't require the browser or UI, but as he describes, that entails complexity around authentication and, I'd assume, discoverability.

I agree with you that platforms like WordPress, Shopify, etc. will likely ship MCP extensions to help with various use cases. Accompanied by a discovery standard similar to llms.txt, I think that will be beneficial too. My only argument is that platforms like these are also the most "templated" designs, and it's already easy for AI to navigate them (since DOM structure variance is small).
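To make that concrete: a discovery file in the spirit of llms.txt might look something like this. There's no agreed standard for advertising an MCP endpoint yet, so the filename, sections, and URL below are purely illustrative:

```
# Example Store

> Storefront for Example Co. Agents should prefer the MCP endpoint
> below over scraping the rendered HTML.

## MCP

- [MCP endpoint](https://example.com/mcp): tools for product search,
  cart management, and order status (OAuth required for write operations)

## Docs

- [API reference](https://example.com/docs/api.md)
```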

The bigger challenge, I think, is figuring out how to build MCPs easily for SaaS and other legacy portals. I see some movement on the OpenAPI side of things, which is promising but requires significant changes to existing apps. Perhaps web frameworks (Rails, Next.js, Laravel, etc.) could agree on a standard.
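A lighter-weight path is to wrap an existing endpoint without touching the app itself. Here's a minimal sketch using the official MCP TypeScript SDK; the portal URL, tool name, and auth scheme are made up for illustration:

```ts
import { McpServer } from "@modelcontextprotocol/sdk/server/mcp.js";
import { StdioServerTransport } from "@modelcontextprotocol/sdk/server/stdio.js";
import { z } from "zod";

// Wrap one existing REST endpoint of a (hypothetical) legacy portal
// as an MCP tool, rather than rewriting the app around OpenAPI.
const server = new McpServer({ name: "legacy-portal", version: "0.1.0" });

server.tool(
  "get_invoice",
  { invoiceId: z.string().describe("Invoice number, e.g. INV-1042") },
  async ({ invoiceId }) => {
    const res = await fetch(
      `https://portal.example.com/api/invoices/${invoiceId}`,
      { headers: { Authorization: `Bearer ${process.env.PORTAL_TOKEN}` } }
    );
    return { content: [{ type: "text", text: await res.text() }] };
  }
);

await server.connect(new StdioServerTransport());
```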

  • > it's already easy for AI to navigate them (since DOM structure variance is small).

    The premise of MCP-B is that it's actually not easy to navigate websites reliably with LLMs today, if you're relying only on DOM traversal or computer vision.

    And when it comes to authenticated or read/write operations, I think you need the reliability and control that come from something like MCP-B, rather than just trusting the LLM to figure it out (a rough sketch of the idea follows below).

    Both WordPress and Shopify allow users to heavily customize their front-end, and can therefore ship garbage HTML + CSS if they choose to (or don't know any better). I certainly wouldn't want to rely on LLMs parsing arbitrary HTML if I'm trying to automate a purchase or some other activity that involves trust and/or sensitive data.
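    To illustrate the general idea (this is not MCP-B's actual API, just a sketch built on the standard MCP TypeScript SDK): the page itself registers typed tools, so the agent calls a declared function against the user's existing session instead of guessing at buttons in whatever markup the theme produces. The tool name, route, and schema below are hypothetical.

    ```ts
    import { McpServer } from "@modelcontextprotocol/sdk/server/mcp.js";
    import { z } from "zod";

    // Hypothetical in-page server: the site declares what an agent may do.
    const server = new McpServer({ name: "example-shop", version: "0.1.0" });

    server.tool(
      "add_to_cart",
      { sku: z.string(), quantity: z.number().int().min(1) },
      async ({ sku, quantity }) => {
        // Uses the same authenticated session the human is using,
        // rather than the agent parsing arbitrary HTML to find the button.
        await fetch("/cart/add", {
          method: "POST",
          headers: { "Content-Type": "application/json" },
          body: JSON.stringify({ sku, quantity }),
        });
        return { content: [{ type: "text", text: `Added ${quantity} x ${sku}` }] };
      }
    );

    // MCP-B then bridges this server to the browser extension over an
    // in-tab transport; the exact transport API is its own detail and
    // is deliberately omitted here.
    ```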