Comment by grbsh

21 hours ago

Why not just use Claude by itself? Opus and Sonnet are great at producing pixel coordinates and tool calls from screenshots of UIs. Curious what your framework gives me over the plain base model.

Hey! To control a browser agent effectively, you need a system that interacts with the browser and also passes the relevant content from the page to the LLM. Our framework manages this agent loop so that agentic execution can be mixed with your own code, which gives you control without giving up convenience. The Claude and OpenAI computer use APIs/loops are slower, more expensive, and tailored to a limited set of desktop automation use cases rather than robust browser automation.
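
As a rough illustration of the kind of loop described above (observe the page, hand the relevant content to the model, apply the action it picks, then drop back into ordinary code), here is a minimal sketch. It is not the framework's actual API: `FakePage`, `run_agent`, and `scripted_policy` are hypothetical names, the page is an in-memory stand-in for a real browser, and the policy is a hard-coded placeholder where an LLM call would go.

```python
from dataclasses import dataclass, field
from typing import Callable


@dataclass
class FakePage:
    """In-memory stand-in for a browser page (hypothetical, not a real library)."""
    elements: dict = field(default_factory=lambda: {"#user": "", "#submit": "Sign in"})
    log: list = field(default_factory=list)

    def snapshot(self) -> str:
        # Send the model a compact text view of the page, not a raw screenshot.
        return "\n".join(f"{sel}: {text!r}" for sel, text in self.elements.items())

    def act(self, action: dict) -> None:
        # Record every action; only "fill" changes page state in this sketch.
        self.log.append(action)
        if action["type"] == "fill":
            self.elements[action["selector"]] = action["value"]


def run_agent(page: FakePage, choose_action: Callable[[str, str], dict],
              goal: str, max_steps: int = 5) -> int:
    """Observe -> decide -> act loop. Returns the number of actions taken."""
    for step in range(max_steps):
        action = choose_action(goal, page.snapshot())
        if action["type"] == "done":
            return step
        page.act(action)
    return max_steps


def scripted_policy(goal: str, observation: str) -> dict:
    # Placeholder for an LLM call: fill the username field if it is still empty.
    if "#user: ''" in observation:
        return {"type": "fill", "selector": "#user", "value": "alice"}
    return {"type": "done"}


if __name__ == "__main__":
    page = FakePage()
    run_agent(page, scripted_policy, goal="log in as alice")
    # Deterministic step written in plain code, mixed with the agentic part.
    page.act({"type": "click", "selector": "#submit"})
    print(page.log)
```

The point of the shape is that the model only decides the fuzzy steps, while anything you already know how to do (the final click here) stays as regular code you control.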