Comment by 0xbadcafebee

1 day ago

Everybody's out here chasing SOTA, meanwhile I'm getting all my coding done with MiniMax M2.5 in multiple parallel sessions for $10/month and never running into limits.

Aurornis 21 hours ago

For serious work, the difference between spending $10/month and $100/month is not even worth considering for most professional developers. There are exceptions like students and people in very low-income countries, but I’m always confused by developers in careers where six-figure salaries are normal who go cheap on tools.

I find even the SOTA models to be far away from trustworthy for anything beyond throwaway tasks. Supervising a less-than-SOTA model to save $10 to $100 per month is not attractive to me in the least.

I have been experimenting a lot with self-hosted models for smaller throwaway tasks. It’s fun, but I’m not going to waste my time with them for the real work.

zozbot234 21 hours ago
You need to supervise the model anyway, because you want that code to be long-term maintainable and defect free, and AI is nowhere near strong enough to guarantee that anytime soon. Using the latest Opus for literally everything is just a huge waste of effort.
- senordevnyc 19 hours ago
  
  Yes, but I find supervision much easier and faster with a strong model. It makes fewer dumb mistakes that I have to catch and correct, and it’ll follow my instructions more reliably.
  
- dandaka 21 hours ago
  
  Waste of effort... of Opus? If "Opus effort" is cheaper than the dev hours you spend managing a dumber but more cost-effective model yourself, what is the point?
  
0xbadcafebee 13 hours ago
You don't magically get better results by spending 10x more on a model. If your prompt is crap and harness is crap, you get crap results, regardless of model. And if you run into limits, you aren't working at all.
Buying the most expensive circular saw doesn't get you the best woodworking, but it is the most expensive woodworking.
- itake 12 hours ago
  
  Not really true. Remember the prompt engineering craze a few years ago, with crazy complex prompt composers (LangChain) that don’t need to exist anymore because the underlying models got so much better at understanding what humans are actually asking for?
  
slopinthebag 17 hours ago
$100/month will get you rate limited too much to rely on with the Claude plans. People still report getting rate limited on the $200 plan.
Also not everyone wants to use Claude Code, so if they're paying API pricing it's more likely thousands of dollars a month. If you can get the same results by spending a fraction of that, why wouldn't you?
- chillfox 3 hours ago
  
  Managing context size and efficient token usage is a skill.
  I have an Anthropic API key for work, and if I use Sonnet/Opus all day for agent coding, it ends up costing about $25.
  I would need more CPU/RAM to run multiple agents in parallel before I could spend much more than that.
- esperent 11 hours ago
  
  I got rate limited within an hour on the $200 while working on a single feature.
  That was the breaking point, I cancelled my subscription.
  As it happens, I had a low coding workload over the past two weeks, so I've been noodling around in pi, mostly with the Gemini Flash API. I like it, and I even agree it's a much better harness than CC. However, the lock-in is real. Even without switching models, each of which has its own quirks, I expect my work speed to drop drastically for at least a week or two, even if I were fully focused on it. But after the learning period I think pi will be faster. The danger, of course, is that CC is fairly on rails, while with pi you could end up spending all your time tinkering with the harness.
- gck1 14 hours ago
  
  "People report getting limited on the $200 plan" is putting it very mildly.
  You can't do any serious work on it without rationing your work and kneecapping your workflows, to the point where you design workflows around Anthropic usage-limit voodoo rather than what actually works.
  Without this, I run into WEEKLY usage limits on the $200 plan, working on a single codebase, one feature at a time, on just day 3.
  
AnonymousPlanet 21 hours ago
For actually serious work, it makes a stark difference whether your proprietary and security-relevant code is sent abroad to a foreign country that may turn hostile in the future, or to some data center around the corner. It doesn't even need to be defence-related.
- flatline 20 hours ago
  
  AFAIK all these companies have SOTA or near-SOTA models available under enterprise licenses. AI companies are not interested in your secret sauce, they are trying to capture the SDLC wholesale.
  

chatmasta 20 hours ago

Who are you paying $10/month? OpenRouter?

0xbadcafebee 13 hours ago
OpenCode Go, BlackBox, Chutes. https://codeberg.org/mutablecc/calculate-ai-cost/src/branch/...
- chatmasta 13 hours ago
  
  I find Chutes very intriguing… has anyone used it? I found it when I started wondering what sort of $/performance I could get by simply renting GPU machines by the hour and running my own inference.
tgrowazay 19 hours ago

https://platform.minimax.io/docs/guides/pricing-token-plan

xutopia 19 hours ago

How do you use this? Do you use opencode or another frontend?

0xbadcafebee 13 hours ago

yep, OpenCode with a few plugins (context management, memory, a few MCPs)
