
Comment by api

5 hours ago

It's great for testing and QA automation for UIs. It's also possibly good for the vision impaired.

UI QA only works well if your model plausibly matches average user behavior and/or real-world edge cases. These models are far from that, and they are much less random than you'd like them to be for fuzzing (mode collapse).

  • It doesn't need to be that kind of QA. Even just a basic "I want the AI to build the beginnings of a GUI app for me" will work much better if the AI can see the output of its work and iterate on it. Similarly, if you want the AI to fix a GUI bug, it works much better if you can show it the bug and tell it how to test whether it's gone.

    • The LLM does not require computer use to see the GUI. And again, that's a pretty niche use, not what Computer Use is being marketed for.