Comment by felarof

6 months ago

Thanks for asking - not a stupid question at all! I should have probably explained it at the top of my post.

By "agentic browser" we basically mean a browser with AI agents that can do web navigation tasks for you. So instead of you manually clicking around to reorder something on Amazon or fill out forms, the AI agent can actually navigate the site and do those tasks.

Not to pull a "why should I use Dropbox when I have rsync" but why should we use this over adding a Playwright MCP to Claude Desktop or similar?

Does having access to Chromium internals give you any super powers over connecting over the Chrome Devtools Protocol?

  • Yes, eventually we think there is more value of owning the entire stack than just be a MCP connector.

    Few ideas we were thinking of: integrating a small LLM, building MCP store into browser, building a more AI friendly DOM, etc.

    Even today, we use chrome's accessibility tree (a better representation of DOM for LLMs) which is not exposed via chrome extension APIs.

  • I would take the position of "why use this when I have eyes and hands and a brain?"

    • Why use any tool when you have bare hands bla bla...

      A good place to start is think about for example if you need to copy paste info from 100 websites to put into a spread sheet for example.

    • My guess is that this is for impatient people; people who think that the prescribed use cases are somehow necessary for their "workflows"; people who subscribe to terms like "cognitive friction" within the context of these use cases; people who are...sort of lazy.

      2 replies →