Comment by vladgur

3 months ago

I’m curious how far we are from giving coding agents access to these desktop agents, so that when we are using, say, Claude Code to build a native desktop app, the coding agent can actually see and act on the desktop UI that it is building.

7 comments

drphilwinder 3 months ago

This is a great point. Not that far. We also snapshot the desktop for "slow" non-streaming updates to the UI. We could push these into Claude itself to act on or describe or whatever.

talking_penguin 3 months ago

The streaming architecture is designed for exactly this - we originally built it for autonomous agents that need persistent development environments. The missing pieces are mostly integration work (mapping Claude's tool use format to our desktop APIs). Would be very interested to hear if others are working on similar integrations - the combination of LLM coding agents + real desktop environments feels like it unlocks a lot of interesting workflows.
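As a rough illustration of the kind of mapping described above (a sketch only, not the project's actual code): the snippet below routes an agent's tool-call payload to a desktop-control object. Everything here is hypothetical, including the `FakeDesktop` class, the `dispatch` function, and the payload shape with an `action` key plus arguments; a real integration would follow whatever schema the agent and the desktop API actually define.

```python
from dataclasses import dataclass, field
from typing import Any, Dict, List


@dataclass
class FakeDesktop:
    """Hypothetical stand-in for a desktop-control API.

    It records calls instead of touching a real desktop.
    """
    calls: List[tuple] = field(default_factory=list)

    def click(self, x: int, y: int) -> str:
        self.calls.append(("click", x, y))
        return f"clicked at ({x}, {y})"

    def type_text(self, text: str) -> str:
        self.calls.append(("type_text", text))
        return f"typed {len(text)} characters"

    def screenshot(self) -> str:
        self.calls.append(("screenshot",))
        return "screenshot captured"


def dispatch(desktop: FakeDesktop, tool_input: Dict[str, Any]) -> Dict[str, Any]:
    """Translate one tool-call payload into a desktop call.

    The payload shape ({"action": ..., plus arguments}) is an assumption
    made for this sketch, not a documented format.
    """
    action = tool_input.get("action")
    try:
        if action == "click":
            x, y = tool_input["coordinate"]
            result = desktop.click(int(x), int(y))
        elif action == "type":
            result = desktop.type_text(str(tool_input["text"]))
        elif action == "screenshot":
            result = desktop.screenshot()
        else:
            # Unknown actions are reported back rather than raised,
            # so the agent can read the error and try something else.
            return {"ok": False, "error": f"unsupported action: {action!r}"}
    except (KeyError, TypeError, ValueError) as exc:
        return {"ok": False, "error": f"bad arguments for {action!r}: {exc}"}
    return {"ok": True, "result": result}
```

Returning a structured result instead of raising keeps the loop simple: whatever comes back can be handed to the agent as the tool result, whether the call succeeded or not.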

_ea1k 3 months ago

For web apps, I'd guess that many of us already do that via Playwright or other MCPs. I'd bet there are people doing something similar with desktop apps too.

vladgur 3 months ago

Unfortunately, there is nothing like that for desktop apps. Even mobile app devs have https://github.com/mobile-next

_ea1k 3 months ago

Why not https://github.com/hrrrsn/mcp-vnc ?

lewq 3 months ago

That's the next move :-D

ErikBjare 3 months ago

I did that a year ago; I imagine it would work better today.