Comment by 0xbadcafebee
1 day ago
Everybody's out here chasing SOTA, meanwhile I'm getting all my coding done with MiniMax M2.5 in multiple parallel sessions for $10/month and never running into limits.
For serious work, the difference between spending $10/month and $100/month is not even worth considering for most professional developers. There are exceptions like students and people in very low-income countries, but I’m always confused by developers in careers where six-figure salaries are normal who go cheap on tools.
I find even the SOTA models to be far away from trustworthy for anything beyond throwaway tasks. Supervising a less-than-SOTA model to save $10 to $100 per month is not attractive to me in the least.
I have been experimenting a lot with self-hosted models for smaller throwaway tasks. It’s fun, but I’m not going to waste my time with it for the real work.
You need to supervise the model anyway, because you want that code to be long-term maintainable and defect-free, and AI is nowhere near strong enough to guarantee that anytime soon. Using the latest Opus for literally everything is just a huge waste of effort.
Yes, but I find supervision much easier and faster with a strong model. It makes fewer dumb mistakes that I have to catch and correct, and it’ll follow my instructions more reliably.
Waste of effort... of Opus? If "Opus effort" is cheaper than the dev hours you spend managing a dumber but more cost-effective model yourself, what is the point?
You don't magically get better results by spending 10x more on a model. If your prompt is crap and harness is crap, you get crap results, regardless of model. And if you run into limits, you aren't working at all.
Buying the most expensive circular saw doesn't get you the best woodworking, but it is the most expensive woodworking.
Not really true. Remember the prompt engineering craze a few years ago with crazy complex prompt composers (langchain) that don’t need to exist any more because the underlying model got so much better at understanding what the humans are actually asking for?
$100/month will get you rate limited too much to rely on with the Claude plans. People still report getting rate limited on the $200 plan.
Also not everyone wants to use Claude Code, so if they're paying API pricing it's more likely thousands of dollars a month. If you can get the same results by spending a fraction of that, why wouldn't you?
Managing context size and efficient token usage is a skill.
I have an Anthropic API key for work, and if I use Sonnet/Opus all day for agent coding, it ends up costing about $25.
I would need more CPU/RAM to run multiple agents in parallel before I could spend much more than that.
I got rate limited within an hour on the $200 plan while working on a single feature.
That was the breaking point; I cancelled my subscription.
As it happens, I had a low coding workload over the past two weeks, so I've been noodling around in Pi, mostly with the Gemini Flash API. I like it, and I even agree it's a much better harness than CC. However, the lock-in is real. Even without switching models, each of which has its own quirks, I expect my work speed to drop drastically for at least a week or two, even if I were fully focused on it. But after the learning period I think Pi will be faster. The danger, of course, is that CC is fairly on rails, while with Pi you could end up spending all your time tinkering with the harness.
And "people report getting limited on the $200 plan" is putting it very mildly.
You can't do any serious work on it without rationing your work and kneecapping your workflows, to the point where you design workflows around Anthropic usage-limit voodoo rather than what actually works.
Without this, I run into WEEKLY usage limits on the $200 plan, working on a single codebase, one feature at a time, on just day 3.
For actually serious work, it makes a stark difference whether your proprietary and security-relevant code is sent abroad to a foreign country that may become hostile in the future, or to some data center around the corner. It doesn't even need to be defence related.
AFAIK all these companies have SOTA or near-SOTA models available under enterprise licenses. AI companies are not interested in your secret sauce, they are trying to capture the SDLC wholesale.
Who are you paying $10/month? OpenRouter?
OpenCode Go, BlackBox, Chutes. https://codeberg.org/mutablecc/calculate-ai-cost/src/branch/...
I find Chutes very intriguing… has anyone used it? I found it when I started wondering what sort of $/performance I could get by simply renting GPU machines by the hour and running my own inference.
https://platform.minimax.io/docs/guides/pricing-token-plan
How do you use this? Do you use opencode or another frontend?
Yep, OpenCode with a few plugins (context management, memory, a few MCPs).