Comment by d4rkp4ttern

2 days ago

I recently found myself wanting to use Claude Code and Codex-CLI with local LLMs on my MacBook Pro M1 Max (64 GB). This setup can make sense for cost and privacy reasons, and for non-coding tasks like writing, summarization, and Q&A over your private notes.

I found the instructions for this scattered all over the place, so I put together this guide to using Claude-Code/Codex-CLI with Qwen3-30B-A3B, 80B-A3B, Nemotron-Nano, and GPT-OSS, served with llama.cpp's llama-server:

https://github.com/pchalasani/claude-code-tools/blob/main/do...
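As a rough sketch of the serving step: llama-server can pull a GGUF straight from Hugging Face with -hf, or load a local file with -m. The repo/quant and flag values below are illustrative, not from the guide; substitute whatever model and settings fit your machine:

```bash
# Serve a local model with llama.cpp's llama-server.
# The Hugging Face repo/quant is an example; use -m /path/to/model.gguf
# instead if you already have the file locally.
llama-server \
  -hf unsloth/Qwen3-30B-A3B-GGUF:Q4_K_M \
  --jinja \
  -c 32768 \
  --port 8080
# --jinja applies the model's chat template (needed for tool calling);
# -c sets the context window, which you should tune to your available RAM.
```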

llama.cpp recently started supporting Anthropic's Messages API for some models, which makes it really straightforward to use Claude Code with these LLMs just by setting ANTHROPIC_BASE_URL, without having to resort to, say, Claude-Code-Router (an excellent library).
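Concretely, that last step looks something like this (the port must match whatever llama-server is listening on; the dummy key is an assumption, since Claude Code expects a key to be set while llama-server ignores it by default):

```bash
# Point Claude Code at the local llama-server instead of Anthropic's API.
export ANTHROPIC_BASE_URL=http://localhost:8080
export ANTHROPIC_API_KEY=dummy  # placeholder; not validated by llama-server
claude
```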