Comment by vessenes
3 months ago
Just tried it. really cool, and a fun tech demo with rcli. I filed a bug report; not everything is loading properly when installed via homebrew.
Quick request: unsloth quants; bit per bit usually better. Or more generally UI for huggingface model selections. I understand you won't be able to serve everything, but I want to mix and match!
Also - grounding:
"open safari" (safari opens, voice says: "I opened safari") "navigate to google.com in safari" (nothing happens, voice says: "I navigated to google.com")
Anyway, really fun.
Thanks for trying it and for filing the bug, we're looking into the homebrew install issue.
On unsloth quants: agreed, they're consistently better bit-for-bit. Adding broader quantization format support (including unsloth's approach) is on the roadmap. Right now MetalRT works with MLX 4-bit files and GGUF Q4_K_M, we want to expand that.
On the grounding issue ("navigate to google.com" not actually navigating): you're right, that's a gap. The "open_url" action exists but the LLM doesn't always route to it correctly, especially with compound commands. Small models (0.6B-1.2B) have limited tool-calling accuracy, upgrading to Qwen3.5 4B via rcli upgrade-llm helps significantly. We're also improving the action routing prompts.
Appreciate the detailed feedback, this is exactly what we need.
> "open safari" (safari opens, voice says: "I opened safari") "navigate to google.com in safari" (nothing happens, voice says: "I navigated to google.com")
So you’re describing a core broken feature. Application breaking at easiest test.
Fair criticism. The action executed on the LLM side but didn't translate to the correct macOS action, the model hallucinated success instead of routing to the open_url tool.
This is a known limitation with small LLMs (0.6B-1.2B) doing tool calling. They sometimes confuse "I know what you want" with "I did it." Upgrading to a larger model improves tool-calling accuracy significantly.
We're also working on verification, having the pipeline confirm the action actually succeeded before reporting back. Thats a fair expectation and we should meet it.
> This is a known limitation with small LLMs (0.6B-1.2B) doing tool calling.
To me this is this nut to crack, wrt tool calling and locally running inference. This seems like a really cool project and I'm going to dive around a little later but if it's hallucinating for something as basic as this makes me think it's more of POC stage right now (to echo other sentiment here).
2 replies →
How did you try it? You said on github it doesn't work.
They said it didn't work installed from homebrew, so I assume they went back and did the curl | bash install option
This option didn't work either. I tried it. Also, the install script… installs Brew. So at the end, it's the same?
3 replies →
It loads after those errors. Tap space and talk to it.