Comment by otterley
3 hours ago
Wow, that was a lot of words.
> I think that is incredibly foolish a perspective. Rooted in old ridiculous slipshod biases, with no respect for users & their agency, and makes unsupported weak technical arguments that define away the possibility of APIs being anything but better.
Since this is a technical discussion, let's debate these based on their technological pros and cons, and avoid the characterizations, shall we?
> You don't state efficient at what, so I'll first argue your best case: energy efficient, least amount of computing done.
Yup.
> Both provide mechanistic access. If the user already had the browser open and is going for help, the difference is nearly nothing. It's different wire formats. We are talking the smallest tiniest peanuts of difference.
The difference may be "nearly nothing" at individual scale but not at global scale. The aggregate difference in energy and data transfer required to power a full browser experience vs. APIs is enormous. If that weren't true, Google, Amazon, and Meta wouldn't have spent nearly as much blood and treasure on optimizations, both in hardware and software, as they have over the last 25+ years. You can't just hand-wave this away. If you told Google and Meta that gRPC and Thrift were "peanuts of difference" and "trivial" they'd laugh in your face and show you the door. (You can always tell when someone's not an experienced engineer as soon as they bandy about the word "trivial.")
Again, browser-based interfaces are for humans. They change frequently, often at the whim of designers. As they evolve, agents must evolve with them. That sort of instability contributes to the resources needed to mechanize them. Compare against APIs, which often have stability guarantees, or at the very least, are only additive over time.
> Which WebMCP is a direct answer to, by allowing pages to offer a low friction access path that allows mechanistic control. Without the LLM having to "backstop" scrape and parse and puppeteer/playwright/devtools-protocol its way through.
This I understand. But APIs are even more efficient still.
> If the agent is using an API, they have to craft a de-novo interface at every step of the process, either as text responses or MCP UI or other. The agent has to reinterpret and describe: it can't just show us what is, short of showing us OpenAPI definitions and json payloads.
I think you may be underestimating the extent to which this will need to happen with browser-based MCP connectivity as well.
Unfortunately I don't have the time to dive deep into the rest of your comment, as it's just too verbose and narrative-driven. If you'd like to make concise and concrete technological arguments, though, I'm open to that.
No, the characterization is very important. You've shown no connection to what's actually at stake, to the engagement patterns here, to the need for people to actually use agents in a way they understand, to the need to work through & arrive together with your agent at an answer. We can't have a technical discussion until you actually show some engagement in the core topics, but you have been too busy raising frivolous objections to derail anyone thinking about the actual topic and technology.
Your proposal to use APIs is a grossly inefficient waste of LLMs' time and energy, and far worse, a misuse of human attention that could be much better directed with the multiplayer/coop/peership of WebMCP. You propose inventing brand new communication systems for every interaction, and haven't once considered the merits of leveraging the existing communication medium that users know. Rather than engage with WebMCP & what it brings, you've been trying to hide and confuse the matter & bury any discussion under a sea of objections, objections that don't even carry technical merit. If you want to actually reply to any of the interesting things rather than blocking and obstructing discussion, I'll happily re-engage.
I've found everything you have said to be radically damaging to understanding the problems that be, by vastly limiting consideration away from all interesting topics and raising only naysaying quibbles that don't address how users and agents would actually do work. Users and agents need to work together. That's simple, and your posts actively distract from what's unique and different here. I'm not going to accept another null response and then waste my time again, and it's sad that people have been steered away from thoughtful consideration like this.
If your argument has merit, we will see it win in the marketplace. If it doesn’t, then it will not. Simple as that. And I’m definitely not the only one who is looking for an explanation of why an agent-browser interface is the superior approach vis-a-vis the alternatives.
I’m not entirely sure what your angle is, but your tirade makes it sound like you’re emotionally invested in this (and potentially financially invested) and you’re frightened. A confident person doesn’t need all these histrionics.
I just really dislike the uselessness of people who naysay & don't engage! This poor world suffers SO MUCH from Brandolini's Law, from bad information being so easy to create. My heart is torn by bad engagement, by misdirection, away from the good and the interesting and the possible, and there's such an asymmetry that the truth and possibility face, so many ways for potential to be sapped and drained.
Hackers deserve better than such. There is a moral spiritual calling they ought feel to want to explore & think.
I do think WebMCP faces extremely long odds against success. It's incredibly unlikely to win. You started this by talking about companies wanting to do the wrong thing, by discussing how they hate giving users freedom to use the web as they want: WebMCP runs up against that problem. It only wins if a critical mass of users adopt it & can advocate for it, find it better enough & find enough voice to get it adopted anyways. That seems super unlikely. Your practical objection is most real, and part of the brutal badness of this reality. The odds of success only get far worse from there: I don't think a lot of users will have on-ramps to use this technology well. Very few users understand tool calling, very few will have interesting extensions or systems to make use of WebMCP. Especially with mobile browsers often not supporting extensions.
Once again I think you are just so off the mark on the other thing though: 'Let the market see' is wildly out of the spirit of a hackerly discussion. We ought to assess for ourselves, be using this space to try to figure out what is good, and what we want to win, and why, on what merits. We ought to be calibrating and pushing, trying to develop our thoughts. Humankind the toolmaker is meant to explore, to understand; that's why I dislike naysaying & non-engagement so much. It's against my spiritual values, against, in my view, the best parts of our nature.
Possibility and good are delicate. Seeing unengaged, unthoughtful disregard of them does get to my heart.