Comment by steelbrain

16 days ago

> And on top of it, if you develop for native macOS, there’s no official tooling for visual verification. It’s like 95% of development is web and LLM providers care only about that.

Thinking out loud here, but you could make an application that's always running and always has the Screen Recording permission, and that exposes a lightweight HTTP endpoint on 127.0.0.1 which, when read, returns the latest frame to your agent as a PNG file.
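A rough sketch of that idea in Python, nothing more than a starting point. It assumes the process has already been granted Screen Recording permission and shells out to macOS's built-in `screencapture` CLI; the port `8765`, the `/frame.png` path, and the names `capture_png` and `make_server` are made up for illustration. It also grabs a fresh frame per request rather than keeping a continuously updated "latest frame", which is the simpler variant.

```python
import subprocess
import tempfile
from http.server import BaseHTTPRequestHandler, ThreadingHTTPServer
from pathlib import Path


def capture_png() -> bytes:
    """Grab the screen with macOS's `screencapture` CLI and return PNG bytes.

    Assumption: the calling process has Screen Recording permission.
    """
    with tempfile.TemporaryDirectory() as tmp:
        out = Path(tmp) / "frame.png"
        # -x: no shutter sound, -t png: write a PNG file
        subprocess.run(["screencapture", "-x", "-t", "png", str(out)], check=True)
        return out.read_bytes()


def make_server(capture=capture_png, port=8765):
    """Build a loopback-only HTTP server that serves a frame at GET /frame.png.

    `capture` is any zero-argument callable returning PNG bytes (hypothetical
    hook, so the capture backend can be swapped out).
    """

    class Handler(BaseHTTPRequestHandler):
        def do_GET(self):
            if self.path != "/frame.png":
                self.send_error(404)
                return
            try:
                png = capture()
            except Exception:
                self.send_error(500, "capture failed")
                return
            self.send_response(200)
            self.send_header("Content-Type", "image/png")
            self.send_header("Content-Length", str(len(png)))
            self.end_headers()
            self.wfile.write(png)

        def log_message(self, *args):
            # Keep the console quiet; an agent may poll this often.
            pass

    # Bind to 127.0.0.1 only, so the screen is never exposed to the network.
    return ThreadingHTTPServer(("127.0.0.1", port), Handler)


if __name__ == "__main__":
    make_server().serve_forever()
```

With that running, the agent (or you) can fetch a frame with `curl http://127.0.0.1:8765/frame.png -o frame.png`.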

Edit: Hmm, not sure that'd be sufficient, since you'd also want to click around.

Maybe a full-on macOS accessibility MCP server? Somebody should build that!