Comment by listless

19 days ago

What I don’t understand in these posts is how exactly the AI is checking its work. That’s literally what I’m here for now. It doesn’t know how to log in to my iOS app using the simulator, or navigate to the Firebase console and download a plist file.

Once we get to a spot where the AI can check its work and iterate, the loop is closed. But we are a long way off from that atm, even for the web. I mean, have you tried the Playwright MCP server? Aside from making the slowest tool calls I have ever seen, the agent struggles mightily with even the simplest navigation and interaction.

Yes, yes, unit tests, but functional testing is the be-all and end-all, and until it can iterate and create its own functional test suite, I just don’t get it.