Comment by juanre
19 hours ago
This sounds very plausible. Arguably MCPs are already a step in that direction: give the LLMs a way to use services that is text-based and easy for them. Agents that look at your screen and click on menus are a cool but clumsy and very expensive intermediate step.
When I use Telegram to talk to the OpenClaw instance on my spare Mac, I am already choosing a new interface over whatever was built by the designers of the apps it is using. Why keep the human-facing version as is? Why not make an agent-first interface (which will not involve having to "see" windows), and make a validation interface for the human minder?
I've thought about this a lot, even before LLMs: so much of the modern web especially is slow and bloated. I want the airline to give me an API to query flights and one to book them; I don't need 400 nested DIVs of styled components vomited at me on every pageview. But everyone considers API access to be "commercial" and is afraid someone else will make money without them getting an extra cut.
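A rough sketch of what that could look like, purely hypothetical: no airline exposes this, and `Flight`, `FLIGHTS`, `search_flights`, `book_flight` and the `approve` callback are all made-up names. The point is only the shape: one call to query, one call to book, and a human approval step in front of the booking.

```python
from __future__ import annotations

from dataclasses import dataclass
from typing import Callable


@dataclass(frozen=True)
class Flight:
    flight_id: str
    origin: str
    destination: str
    price_eur: int


# Hypothetical in-memory inventory standing in for an airline backend.
FLIGHTS = [
    Flight("XX100", "MAD", "SFO", 640),
    Flight("XX200", "MAD", "SFO", 580),
    Flight("XX300", "MAD", "JFK", 410),
]


def search_flights(origin: str, destination: str) -> list[Flight]:
    """Query endpoint: returns plain structured data, cheapest first."""
    matches = [
        f for f in FLIGHTS
        if f.origin == origin and f.destination == destination
    ]
    return sorted(matches, key=lambda f: f.price_eur)


def book_flight(flight_id: str, approve: Callable[[str], bool]) -> dict:
    """Booking endpoint: the agent proposes, the human minder validates.

    `approve` receives a one-line summary and returns True to confirm.
    """
    flight = next((f for f in FLIGHTS if f.flight_id == flight_id), None)
    if flight is None:
        return {"status": "not_found", "flight_id": flight_id}
    summary = (
        f"Book {flight.flight_id} {flight.origin}->{flight.destination} "
        f"for {flight.price_eur} EUR?"
    )
    if not approve(summary):
        return {"status": "rejected", "flight_id": flight_id}
    return {"status": "booked", "flight_id": flight_id}
```

An agent would call `search_flights("MAD", "SFO")`, pick a result, and pass its `flight_id` to `book_flight` along with whatever the validation interface is: a Telegram message with yes/no buttons, a terminal prompt, anything that returns a boolean. No windows to "see", no DIVs to parse.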