Comment by jimbokun

11 days ago

I use it to access Claude. So what's the difference?

2 comments

jimbokun

This stuff is a little messy and opaque, but the performance of the same model in different harnesses depends a lot on how context is managed. The last time I tried Copilot, it performed markedly worse than Claude Code on similar tasks. I suspect that Copilot was compressing context very aggressively to save on token cost, but I'm not 100% certain about this.
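To make "aggressive context compression" concrete, here's a rough sketch of the kind of thing a harness could be doing. To be clear, this is purely hypothetical: `count_tokens`, `trim_history`, and the word-count "tokenizer" are all made up for illustration, and I have no idea what Copilot or Claude Code actually do internally.

```python
# Hypothetical sketch: none of these names come from a real harness.

def count_tokens(text):
    # Crude stand-in for a real tokenizer: one "token" per whitespace-separated word.
    return len(text.split())


def trim_history(messages, budget):
    """Keep the first (system) message plus the newest messages that fit in `budget` tokens."""
    system, rest = messages[0], messages[1:]
    remaining = budget - count_tokens(system)
    kept = []
    for msg in reversed(rest):  # walk from newest to oldest
        cost = count_tokens(msg)
        if cost > remaining:
            break  # everything older than this point is dropped
        kept.append(msg)
        remaining -= cost
    return [system] + kept[::-1]  # restore chronological order


history = [
    "You are a coding agent.",
    "User: the tests live in tests/ and must pass with pytest",
    "Agent: read src/app.py and found the bug in parse_args",
    "User: now fix it",
]

generous = trim_history(history, budget=100)  # everything fits
stingy = trim_history(history, budget=10)     # only the latest turn survives
```

With the small budget the model only sees the system prompt and "now fix it", so it no longer knows where the tests are or which bug it found. Same model, much worse result, and the only thing that changed is the harness's budget.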

Also note that with Claude models, Copilot might allocate a different number of thinking tokens compared to Claude Code.

Things may have changed since I tried it out; these tools are in constant flux. In general I've found that harnesses created by the model providers (OpenAI/Codex CLI, Anthropic/Claude Code, Google/Gemini CLI) tend to be better than generalist harnesses (cheaper too, since you're not paying a middleman).

walthamstow 11 days ago

Different harnesses and agentic environments produce different results from the same model. Claude Code and Cursor are the best IME and Copilot is by far the worst.