
Comment by horacemorace

5 days ago

I was trying to get Claude Code to work with llama.cpp but could never figure out anything functional; it always insisted on a phone-home login for first-time setup. In Cline I'm getting better results with glm-4.7-flash than with qwen3-coder:30b.

Creating ~/.claude.json with {"hasCompletedOnboarding":true} is the key; after that, ANTHROPIC_BASE_URL and ANTHROPIC_AUTH_TOKEN work as expected.
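
For anyone trying to reproduce this, a rough sketch of the setup (the base URL and token are placeholders; point them at whatever serves your local Anthropic-compatible endpoint):

    # Skip the first-run login prompt (merge the key in by hand if ~/.claude.json already exists)
    echo '{"hasCompletedOnboarding":true}' > ~/.claude.json

    # Point Claude Code at the local server instead of Anthropic
    export ANTHROPIC_BASE_URL=http://127.0.0.1:8080   # placeholder: your local server / proxy address
    export ANTHROPIC_AUTH_TOKEN=not-a-real-key        # any non-empty string if your server doesn't check keys

    claude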

Curious what llama-server flags you used. On my M1 Max 64GB MacBook I tried it in Claude Code (which sends a ~25K-token system prompt) and I get 3 tps.

But with Qwen3-30B-A3B I get 20 tps in CC.
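
In case a concrete starting point helps, this is the kind of llama-server invocation I'd expect for this; the model path, quant, port, and context size are just examples, and -c has to be big enough for Claude Code's ~25K-token system prompt plus the conversation:

    # Placeholder model path/quant. -ngl 99 offloads all layers to Metal on Apple Silicon;
    # --jinja uses the model's own chat template so tool calls work (if your llama.cpp build has it).
    llama-server -m ~/models/Qwen3-30B-A3B-Q4_K_M.gguf -c 32768 -ngl 99 --jinja --port 8080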