Comment by dataviz1000

4 days ago

I like the Chrome fork idea. I imagine in the next couple of years, hardware companies, e.g. Apple and Lenovo, will start to ship extremely powerful local inference hardware as the models become sufficient, which your browser will be able to leverage.

I built a prototype using native messaging (the same way apps like password managers interact with browsers) that drives actions with pure JS.

I have a lot of actions done, but it's not fully there yet. Essentially, the goal is to use a CLI or an external desktop app to drive your already logged-in Chrome profile without setting navigator.webdriver or enabling --remote-debugging-port. In all my testing, I never got flagged by CAPTCHA or bot protection. The CLI can interact with LLMs and the local file system (despite OPFS, this is easier).

CLI(host app) <-> Native messaging daemon <-> Chrome extension
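For anyone curious what the daemon <-> extension hop looks like on the wire: Chrome's native messaging sends each message as UTF-8 JSON preceded by a 4-byte length in native byte order, over the host's stdin/stdout. Here is a minimal sketch of that framing in Python. It is not the code from my prototype, and the message shape ({"action": ..., "url": ...}) is a made-up example.

```python
import json
import struct
from io import BytesIO
from typing import BinaryIO, Optional


def encode_message(payload: dict) -> bytes:
    """Frame a JSON payload: 4-byte native-order length, then UTF-8 JSON."""
    body = json.dumps(payload, separators=(",", ":")).encode("utf-8")
    return struct.pack("=I", len(body)) + body


def read_message(stream: BinaryIO) -> Optional[dict]:
    """Read one framed message; return None on a clean end of stream."""
    header = stream.read(4)
    if len(header) < 4:
        return None
    (length,) = struct.unpack("=I", header)
    body = stream.read(length)
    if len(body) < length:
        raise EOFError("truncated native message")
    return json.loads(body.decode("utf-8"))


def write_message(stream: BinaryIO, payload: dict) -> None:
    """Write one framed message and flush so the peer sees it immediately."""
    stream.write(encode_message(payload))
    stream.flush()


if __name__ == "__main__":
    # Round trip through an in-memory buffer. A real host would use
    # sys.stdin.buffer and sys.stdout.buffer instead.
    buf = BytesIO()
    # Hypothetical message shape, for illustration only.
    write_message(buf, {"action": "navigate", "url": "https://example.com"})
    buf.seek(0)
    print(read_message(buf))
```

In a real host, anything else printed to stdout corrupts the stream, so logging has to go to stderr or a file.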

The extension just executes pure JS for extraction and navigation. I've gotten basic Google searching and image downloading working, and I'm now looking at complex form interactions.

  • Native messaging is a huge headache to set up reliably across all the OSs and (often headless) Chrome setups in our experience, which is why we've avoided it.

    Just using some remote endpoint message bus service is an easier solution, or something like ElectricSQL/RxDB/Replicache/etc.

    We also can't really use in-page JS for much because it's easily detected by bot-blockers, though isolated worlds help in CDP.