Comment by kanemcgrath

17 hours ago

I liked Copilot because I didn't have to think about tokens. I get hung up when I have to think about the price of things, and it's hard to think about the project while also thinking about token usage like a gas bill. The usage system had its own issues, but having a set number of requests was a very comfortable way to use a paid AI service.

Sounds like you're a candidate for a local model. It's kinda nice not caring what the token count means except as to compaction.

  • Not paying per token? Not sending my code to someone else's servers for inference? That's the stuff of sweet dreams for a stingy, paranoid solopreneur like me.

    If I could run a local model comparable to even Sonnet 4.6 without shelling out $50K in hardware, I'd do it in a heartbeat. But all I have is 32 GB of RAM and an old RTX 4080.

    Or am I not up to speed? Are there decent coding models that can run on dev laptops? Not that that's what you were suggesting by recommending a local model, necessarily; just curious.

  • I do love using local models when I can, but qwen-35B is the best model I can run, and while it's an insanely good local model, it does not compare to the big ones.