Comment by frabonacci

6 months ago

Thanks for trying out c/ua! We still recommend pairing the Omni loop configuration with a more capable VLM, such as Qwen2.5-VL 32B, or using a cloud model like Claude Sonnet 3.7 or OpenAI GPT-4.1. We believe that in the coming months we'll see better-performing quantized models that require less memory for local inference, but the truth is we're not quite there yet.

Stay tuned - we're also releasing support for UI-TARS-1.5 7B this week! It offers excellent speed and accuracy, and best of all, it doesn't require bounding box detection (Omni) since it's a pixel-native model.

2 comments

rahimnathwani 6 months ago

Thanks. I'll try that, but right now it's not working at all, i.e. cua can't interact with the VM. That's not a model issue.

frabonacci 6 months ago

If you're running Cua from VS Code or Cursor, have you checked out this issue? https://github.com/trycua/cua/issues/61
Feel free to ping me on Discord (I'm francesco there) - happy to hop on a quick call to help debug: https://discord.com/invite/mVnXXpdE85