Comment by nikisweeting
4 days ago
Extensions are ok but they have limitations too, for example you cannot use extensions to automate other extensions.
We need the agent to be able to drive 1Password, Privacy.com, etc. to request per-task credentials, change adblock settings, get 2FA codes, and more.
The holy grail really is CDP + control over browser launch flags + an extension bridge to get to the more ergonomic `chrome.*` APIs. We're also working on a custom Chromium fork.
Use an Electron app to spawn Chrome as a child process with launch flags including `--remote-debugging-pipe`, instead of exposing a WebSocket connection on port 9226 or something. If coupled with `--user-data-dir=<path>`, it will not show the CDP security warning bar at the top of the page, as long as the user data directory is not the default one.
1. Get all the things you want.
2. Can create as many 'browser context' personas as you want
3. Use the Electron app renderer for UI to manage profiles, proxies for each profile, automate making Gmail accounts for each profile, etc.
4. Forgot to mention: it is very nice to use the `--load-extension=/path/to/extension` flag to ship Chrome extension files inside the Electron app bundle, so that the launched browser will have a cool copilot side panel.
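For anyone who wants to try the launch setup described above, here is a minimal sketch in TypeScript, not a definitive implementation. It assumes Node's `child_process` (as available in an Electron main process). The names `LaunchOptions`, `buildLaunchArgs`, `encodeCdpMessage`, `decodeCdpMessages`, and `launchChrome` are made up for illustration. With `--remote-debugging-pipe`, Chrome reads CDP commands from file descriptor 3 and writes responses to file descriptor 4, as NUL-terminated JSON messages. The sketch does not check whether a given Chrome build honors each flag.

```typescript
import { spawn } from "node:child_process";
import type { ChildProcess } from "node:child_process";

// Hypothetical options type for this sketch; not from the thread.
interface LaunchOptions {
  chromePath: string;       // path to the Chrome/Chromium binary
  userDataDir: string;      // a non-default profile directory
  extensionDirs?: string[]; // unpacked extensions shipped in the app bundle
}

// Build the command-line flags mentioned in the comment above.
export function buildLaunchArgs(opts: LaunchOptions): string[] {
  const args = [
    "--remote-debugging-pipe",
    `--user-data-dir=${opts.userDataDir}`,
  ];
  if (opts.extensionDirs && opts.extensionDirs.length > 0) {
    // --load-extension accepts a comma-separated list of directories.
    args.push(`--load-extension=${opts.extensionDirs.join(",")}`);
  }
  return args;
}

// Over the debugging pipe, each CDP message is a JSON string followed by "\0".
export function encodeCdpMessage(
  id: number,
  method: string,
  params: Record<string, unknown> = {},
): string {
  return JSON.stringify({ id, method, params }) + "\0";
}

// Split buffered pipe output into complete messages plus any unfinished tail.
export function decodeCdpMessages(
  buffer: string,
): { messages: unknown[]; rest: string } {
  const parts = buffer.split("\0");
  const rest = parts.pop() ?? "";
  return { messages: parts.map((part) => JSON.parse(part)), rest };
}

// Spawn Chrome with fds 3 and 4 wired up as pipes for CDP traffic.
export function launchChrome(opts: LaunchOptions): ChildProcess {
  return spawn(opts.chromePath, buildLaunchArgs(opts), {
    stdio: ["ignore", "ignore", "ignore", "pipe", "pipe"],
  });
}

// Usage (not executed here; the paths are placeholders):
//   const chrome = launchChrome({ chromePath: "/path/to/chrome", userDataDir: "/path/to/profile" });
//   (chrome.stdio[3] as NodeJS.WritableStream).write(
//     encodeCdpMessage(1, "Target.createBrowserContext"),
//   );
//   chrome.stdio[4]?.on("data", (chunk) => { /* feed into decodeCdpMessages */ });
```

`Target.createBrowserContext` is the CDP method that creates an isolated browser context, which is one way to get the per-persona separation from point 2. Keeping the argument building and message framing as pure functions makes them easy to check without launching a browser.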
> Extensions are ok but they have limitations too, for example you cannot use extensions to automate other extensions.
5. If you know the extension IDs, it is easy to set up communication between the two. I already drive a Chrome extension using VSCode's core libraries, and it would be a week or two of work to implement a light port of the VSCode host extension API for a Chrome extension. Nonetheless, I'd rather have an Electron app manage extensions the same way VSCode does.
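A rough sketch of the extension-to-extension messaging mentioned in point 5, assuming the standard `chrome.runtime.sendMessage(extensionId, message, callback)` call and the `chrome.runtime.onMessageExternal` event. The `RuntimeLike` interface and the `listenForPeers` and `askPeer` helpers are hypothetical names for this example. The runtime is passed in as a parameter so the code type-checks without `@types/chrome`; in a real extension you would pass `chrome.runtime`. This only covers extensions that choose to listen for external messages, so it does not show how to drive an arbitrary third-party extension.

```typescript
// Minimal structural types for the parts of chrome.runtime used here.
type Responder = (response: unknown) => void;

interface Sender {
  id?: string; // ID of the extension that sent the message
}

interface RuntimeLike {
  sendMessage(extensionId: string, message: unknown, callback: Responder): void;
  onMessageExternal: {
    addListener(
      listener: (message: unknown, sender: Sender, sendResponse: Responder) => void,
    ): void;
  };
}

// Receiving side: answer messages, but only from extension IDs we know.
export function listenForPeers(
  runtime: RuntimeLike,
  allowedIds: string[],
  handle: (message: unknown) => unknown,
): void {
  runtime.onMessageExternal.addListener((message, sender, sendResponse) => {
    if (!sender.id || !allowedIds.includes(sender.id)) {
      return; // ignore senders that are not on the allow list
    }
    sendResponse(handle(message));
  });
}

// Sending side: wrap the callback form of sendMessage in a Promise.
export function askPeer(
  runtime: RuntimeLike,
  peerId: string,
  message: unknown,
): Promise<unknown> {
  return new Promise((resolve) => {
    runtime.sendMessage(peerId, message, resolve);
  });
}

// Usage inside real extensions (not executed here; the IDs are placeholders):
//   listenForPeers(chrome.runtime, ["<sender-extension-id>"], (m) => ({ echo: m }));
//   const reply = await askPeer(chrome.runtime, "<receiver-extension-id>", { type: "ping" });
```

The allow list is checked in code here; a receiving extension can also restrict who may connect through the `externally_connectable` key in its manifest.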
Yeah I started building this in my first week at the company haha: https://github.com/browser-use/desktop
Shipping a whole Electron app is not a priority at the moment though. Our revenue comes from cloud API users, and there we only need our custom Chrome fork. There's no point messing with Electron and extension bridges when we can add custom CDP commands to talk to `chrome.*` APIs directly.
I like the Chrome fork idea. I imagine that in the next couple of years, hardware companies, e.g. Apple and Lenovo, will start to ship extremely powerful local inference hardware as models become good enough, which your browser will be able to leverage.
Yes, an Electron app helps tremendously, especially for managing the lifecycle of tabs independently. We use that for creating our AI browser automations at Donobu (https://donobu.com). However, we do have the luxury of just focusing on a narrow AI QA use case vs. Browser-Use and others who need to support broad use cases in potentially adversarial environments.