Comment by HauntingPin
15 hours ago
How is any of what you wrote relevant? People aren't using Claude for the first time and hitting rate limits. They've been using Claude for months, at the very least, and they're hitting rate limits without significant changes to how they prompt.
> People need to understand a few things: vague questions make the models roam endlessly “exploring” dead ends.
> If people were considerably more willing to aggressively prune their context and scope tasks well, they could get a lot more done with it
If this were the problem, people would've encountered it when they started using Claude. The problem is not that they can't get anything done. It's that they could get things done for months, but are suddenly hitting rate limits way too easily, with clearly degraded response quality, so things that used to be possible no longer are.
I think in this case we probably have different experiences that shape how we see things: I see many (very smart) people doing things that are not optimal (e.g. copy-pasting entire files instead of referencing them, or telling Claude at every message to "read CLAUDE.md and follow its instructions precisely"), which can lead to a lot of token waste. If certain system prompts were tweaked internally, or some models now read more files than before, keeping these "inefficient prompts" will exhaust limits faster. Sub-agents and the new agent teams feature didn't exist until a few months ago; that alone eats A LOT of tokens, which this kind of pre-paid plan was never really meant to cover, etc.
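As a rough illustration of the paste-vs-reference difference, here's a back-of-the-envelope sketch assuming the common ~4 characters-per-token heuristic (real tokenizers vary, and the file size and path are hypothetical):

```python
# Crude estimate of token cost: ~4 characters per token (approximation only).
def approx_tokens(text: str) -> int:
    return max(1, len(text) // 4)

pasted_file = "x" * 80_000                      # an ~80 KB file pasted into a message
file_reference = "see src/billing/invoice.py"   # a short reference instead (hypothetical path)

print(approx_tokens(pasted_file))     # ~20,000 tokens, paid again every time it's re-pasted
print(approx_tokens(file_reference))  # a handful of tokens
```

The point isn't the exact numbers, just that repeatedly pasting large files multiplies usage in a way a short reference doesn't.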
The ecosystem is evolving very quickly, so our own workflows must keep adapting with it: experiment, find the limitations, and arrive at the "tightest possible scope" that still lets you get things done, because it is possible.
Another example: the pre-paid monthly subscription aggregates usage across the web interface and Claude Code. So if you spend your lunch break checking holiday itineraries, then sit down and ask a team of agents to refactor a giant codebase with hundreds or thousands of files, you'll exhaust your limits quickly, etc.
I see this "context economy" as a new way of managing your "mental models": every token counts, and every token must bear its weight for the task at hand; otherwise I'm "wasting budget". I am also still learning how to operate in this new way of doing things, and while there have been genuine issues with Claude Code, not every single issue people encounter is an upstream problem.
This is literally victim blaming. When people weren't having issues until now, why is it suddenly their fault? Anthropic is providing a paid service to paying users. It's not acceptable for them to degrade our experience to save some money, and it's not acceptable to blame everybody else, who didn't cause the issue.
In the end, Anthropic is a company and needs to make money; my best bet is that even those of us who pay $100/mo to use Claude Code are costing Anthropic money, on top of everything else they're burning on inference.
Again, I agree with you that the service should at least be reliable, but to be completely fair, if I had to bet, the amount of usage people get for $100/mo is probably only balanced out by the corporate/enterprise customers paying their bills to Anthropic via API usage.
If we look at it through this lens, these limits are not surprising at all, except maybe in how generous they are/were. It's pretty obvious that they want to force people to pay as they go.