Comment by root_axis
6 days ago
The problem with LLMs for code is that they are still way too slow and expensive to be generally practical for non-trivial software projects. I'm not saying they aren't useful: they're excellent at filling out narrow code units that don't require a lot of context and can be quickly or automatically verified for correctness. You will save a lot of time using them this way.
On the other hand, if you slip up and give it too much to chew on, or just roll bad RNG, it will spin itself into a loop: attempting variation after variation, erasing its work and starting over, but never actually converging on a correct solution. Eventually it starts repeating obviously incorrect solutions that should have been ruled out by feedback from its previous failed attempts. If you're using a SOTA model, you can easily rack up $5 or more on a single task if you give it more than 30 minutes of leeway to work it out. Sure, you could use a cheaper model, but that just makes the fundamental problem worse - you're still spending money without getting any closer to completed work.
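To see how a half-hour agent run reaches $5+, the arithmetic is straightforward. A sketch, assuming illustrative Opus-class API pricing of $15 per million input tokens and $75 per million output tokens (the specific token counts below are hypothetical - an agent loop re-sends the growing context on every turn, so cumulative input tokens pile up fast):

```python
def estimate_cost(input_tokens: int, output_tokens: int,
                  input_price_per_m: float = 15.0,
                  output_price_per_m: float = 75.0) -> float:
    """Estimate API cost in dollars from cumulative token counts.

    Prices are per million tokens and are assumptions, not quotes;
    check your provider's current pricing.
    """
    return (input_tokens / 1_000_000 * input_price_per_m
            + output_tokens / 1_000_000 * output_price_per_m)

# Hypothetical 30-minute agent session: 200k cumulative input tokens
# (context re-sent each turn) and 40k output tokens.
cost = estimate_cost(200_000, 40_000)
print(f"${cost:.2f}")  # $6.00 under these assumed prices
```

The key driver is that every retry re-submits the full conversation history as input, so failed loops compound the bill even when the model produces little useful output.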
Yes, the models are getting smarter and more efficient, but we're still at least a decade away from being able to run useful models at practical speeds locally. Aggressively quantized 70b models simply can't cut it, and even then, you need something like 10k tps to start building LLM tools that can overcome the LLM's lack of reasoning skills through brute force guess and check techniques.
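The 10k tps figure matters because brute-force guess-and-check is throughput-bound: each attempt burns a roughly fixed token budget, so attempts per minute scale linearly with tokens per second. A sketch with hypothetical numbers (the 2,000 tokens-per-attempt budget is an assumption, not a measured value):

```python
def attempts_per_minute(tokens_per_second: float,
                        tokens_per_attempt: float) -> float:
    """How many generate-and-verify attempts fit in one minute.

    tokens_per_attempt is an assumed budget covering one full
    candidate solution; real budgets vary widely by task.
    """
    return tokens_per_second * 60 / tokens_per_attempt

# At 10k tps (the threshold suggested above) vs. a typical local
# quantized-70b rate of ~50 tps, assuming 2,000 tokens per attempt:
fast = attempts_per_minute(10_000, 2_000)  # 300.0 attempts/min
slow = attempts_per_minute(50, 2_000)      # 1.5 attempts/min
print(fast, slow)
```

At ~1.5 attempts per minute, guess-and-check tooling is useless in practice; at 300, it starts to look like a viable search strategy, which is the gap the comment is pointing at.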
Perhaps some of the AI skeptics are a bit too harsh, but they're certainly not crazy in the context of breathless hype.
What are you talking about? It’s $20/month.
OP is talking about API usage, and yes if you let Claude 4 Opus crunch on a long task it'll eat tokens/money like nothing else.
Those $20/month plans usually come with throttling or other kinds of service degradation once you max out their allotment.