Comment by throwup238

1 month ago

Anthropic has been flying by the seat of their pants for a while now and it shows across the board. From the terminal flashing bug that’s been around for months to the lack of support to instabilities in Claude mobile and Code for the web (I get 10-20% message failure rates on the former and 5-10% on CC for web).

They’re growing too fast and it’s bursting the seams of the company. If there’s ever a correction in the AI industry, I think that will all quickly come back to bite them. It’s like Claude Code is vibe-operating the entire company.


laserDinosaur 1 month ago

The Pro plan quota seems to be getting worse. I can get maybe 20-30 minutes work done before I hit my 4 hour quota. I found myself using it more just for the planning phase to get a little bit more time out of it, but yesterday I managed to ask it ONE question in plan mode (from a fresh quota window), and while it was thinking it ran out of quota. I'm assuming it probably pulled in a ton of references from my project automatically and blew out the token count. I find I get good answers from it when it does work, but it's getting very annoying to use.

(on the flip side, Codex seems like it's being SO efficient with the tokens it can be hard to understand its answers sometimes, it rarely includes files without you doing it manually, and often takes quite a few attempts to get the right answer because it's so strict about what it's doing each iteration. But I never run out of quota!)

stareatgoats 1 month ago
Claude Code allegedly auto-includes the currently active file and often all visible tabs and sometimes neighboring files it thinks are 'related' - on every prompt.
The advice I got when scouring the internets was primarily to close everything except the file you’re editing and maybe one reference file (before asking Claude anything). For added effect add something like 'Only use the currently open file. Do not read or reference any other files' to the prompt.
I don't have any hard facts to back this up, but I'm sure going to try it myself tomorrow (when my weekly cap is lifted ...).
- sigseg1v 1 month ago
  
  What does "all visible tabs" mean in the context of Claude Code in a terminal window? Are you saying it's reading other terminals open on the system? Also how do you determine "currently active file"? It just greps files as needed.
  
- idonotknowwhy 1 month ago
  
  Yes, it does exactly that. It also sends other prompts like generating 3 options to choose from, prefilling a reply like 'compile the code', etc. (I can confirm this because I connect CC to llama.cpp and use it with GLM-4.7. I see all these requests/prompts in the llama-server verbose log.)
  You can stop most of this with
  export DISABLE_NON_ESSENTIAL_MODEL_CALLS=1
  And might as well disable telemetry, etc: export CLAUDE_CODE_DISABLE_NONESSENTIAL_TRAFFIC=1
  I also noticed every time you start CC, it sends off > 10k tokens preparing the different agents. So try not to close / re-open it too often.
  source: https://code.claude.com/docs/en/settings
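
A minimal sketch of setting the same two variables from a launcher script instead of exporting them in the shell. The variable names are the ones quoted in the comment above; the `claude` command name, the helper names, and the idea of wrapping the launch in Python are assumptions for illustration, not anything the docs prescribe.

```python
import os
import subprocess

# Variable names are copied from the comment above; verify them against the
# settings documentation before relying on them.
QUIET_VARS = {
    "DISABLE_NON_ESSENTIAL_MODEL_CALLS": "1",
    "CLAUDE_CODE_DISABLE_NONESSENTIAL_TRAFFIC": "1",
}


def quiet_env(base=None):
    """Return a copy of the environment with both variables set to "1"."""
    env = dict(os.environ if base is None else base)
    env.update(QUIET_VARS)
    return env


def launch(command=("claude",)):
    """Hypothetical launcher: assumes a `claude` binary is on PATH."""
    return subprocess.run(list(command), env=quiet_env(), check=False)
```

Building the environment in a separate function keeps the current shell untouched, so the settings only apply to the launched process.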
  
aanet 1 month ago
^ THIS
I've run out of quota on my Pro plan so many times in the past 2-3 weeks. This seems to be a recent occurrence. And I'm not even that active. Just one project, execute in Plan > Develop > Test mode, just one terminal. That's it. I keep getting a quota reset every few hours.
What's happening @Anthropic ?? Anybody here who can answer??
- alexk6 1 month ago
  
  [BUG] Instantly hitting usage limits with Max subscription: https://github.com/anthropics/claude-code/issues/16157
  It's the most commented issue on their GitHub and it's basically ignored by Anthropic. Title mentions Max, but commenters report it for other plans too.
  
- vbezhenar 1 month ago
  
  This whole API vs plan looks weird to me. Why not force everyone to use API? You pay for what you use, it's very simple. API should be the most honest way to monetize, right?
  This fixed subscription plan with some poorly specified quotas looks like they want to extract extra money from the users who pay $200 and don't use that value, while at the same time preventing other users from going over $200. I understand that it might work at scale, but it just feels a bit unfair to everyone?
  
- MillionOClock 1 month ago
  
  I very recently (~ 1 week ago) subscribed to the Pro plan and was indeed surprised by how fast I reached my quota compared to say Codex with similar subscription tier. The UX is generally really cool with Claude Code, which left me with a bit of a bittersweet feeling of not even being able to truly explore all the possibilities since after just making basic planning and code changes I am already out of quota for experimenting with various ways of using subagents, testing background stuff etc.
  
- heavyset_go 1 month ago
  
  Like a good dealer, they gave you a cheap/free hit and now you want more. This time you're gonna have to pay.
  
- bmurphy1976 1 month ago
  
  I've been hitting the limit a lot lately as well. The worst part is I try to compact things and check my limits using the / commands and can't make heads or tails of how much I actually have left. It's not clear at all.
  I've been using CC until I run out of credits and then switch to Cursor (my employer pays for both). I prefer Claude but I never hit any limits in Cursor.
  
- fragmede 1 month ago
  
  How quickly do you also hit compaction when running? Also, if you open a new CC instance and run /context, what does it show for tools/memories/skills %age? And that's before we look at what you're actually doing. CC will add whatever context it thinks is necessary to each prompt. So if you've got a small number of large files (vs. a large number of smaller files), at some level that'll contribute to the problem as well.
  Quota's basically a count of tokens, so if a new CC session starts with that relatively full, that could explain what's going on. Also, what language is this project in? If it's something noisy that uses up many tokens fast, even if you're using agents to preserve the context window in the main CC, those tokens still count against your quota so you'd still be hitting it awkwardly fast.
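
A rough sketch of the arithmetic behind that point, assuming (as the comment does) that quota is a flat count of tokens. Every name and number here is hypothetical; the only figure taken from the thread is the report elsewhere of more than 10k tokens being sent at startup, and the real accounting is not documented in this discussion.

```python
def session_tokens(startup_overhead, tokens_per_prompt, prompts):
    """Tokens counted against quota for one session under a flat model."""
    return startup_overhead + tokens_per_prompt * prompts


def prompts_until_quota(quota, startup_overhead, tokens_per_prompt):
    """Number of whole prompts that fit before the quota runs out."""
    if tokens_per_prompt <= 0:
        raise ValueError("tokens_per_prompt must be positive")
    remaining = quota - startup_overhead
    if remaining <= 0:
        return 0
    return remaining // tokens_per_prompt
```

Under this model, a project that pulls 30,000 tokens of context into every prompt fits only three prompts into a hypothetical 100,000-token window after a 10,000-token startup cost, which is the kind of effect the comment describes.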
- genewitch 1 month ago
  
  sounds like the "thinking tokens" are a mechanism to extract more money from users?
  
- behnamoh 1 month ago
  
  > I've run out of quota on my Pro plan so many times in the past 2-3 weeks.
  Waiting for Anthropic to somehow blame this on users again. "We investigated, turns out the reason was users used it too much".
ChicagoDave 1 month ago
I never run out of this mysterious quota thing. I close Claude Code at 10% context and restart.
I work for hours and it never says anything. No clue why you’re hitting this.
$230 pro max.
- fluidcruft 1 month ago
  
  Does closing claude code do something that running /clear does not?
  
- yjtpesesu2 1 month ago
  
  Any clue why you might be a favored/favoured high value user?
  
- croes 1 month ago
  
  Pro is 20x less than Max
nwatson 1 month ago
Self-hosted might be the way to go soon. I'm getting 2x Olares One boxes, each with an RTX 5090 GPU (NVIDIA 24GB VRAM), and a built-in ecosystem of AI apps, many of which should be useful, and Kubernetes + Docker will let me deploy whatever else I want. Presumably I will manage to host a good coding model and use Claude Code as the framework (or some other). There will be many good options out there soon.
- behnamoh 1 month ago
  
  > Self-hosted might be the way to go soon.
  As someone with 2x RTX Pro 6000 and a 512GB M3 Ultra, I have yet to find these machines usable for "agentic" tasks. Sure, they can be great chat bots, but agentic work involves huge context sent to the system. That already rules out the Mac Studio because it lacks tensor cores and it's painfully slow to process even relatively large CLAUDE.md files, let alone a big project.
  The RTX setup is much faster but can only support models ≤192GB, which severely limits its capabilities as you're limited to low Q GLM 4.7, GLM 4.7 Flash/Air/ GPT OSS 120b, etc.
- NitpickLawyer 1 month ago
  
  I've been using local LLMs since before chatgpt launched (gpt-j, gpt-neox for those that remember), and have tried all the promising models as they launch. While things are improving faster than I thought ~3 years ago, we're still not there in terms of 1-1 comparison with the SotA models. For "consumer" local at least.
  The best you can get today with consumer hardware is something like devstral2-small(24B) or qwen-coder30b(underwhelming) or glm-4.7-flash (promising but buggy atm). And you'll still need beefy workstations ~5-10k.
  If you want open-SotA you have to get hardware worth 80-100k to run the big boys (dsv3.2, glm4.7, minimax2.1, devstral2-123b, etc). It's ok for small office setups, but out of range for most local deployments (esp considering that the workstations need lots of power if you go 8x GPUs, even with something like 8x 6000pro @ 300w).
  
- zen4ttitude 1 month ago
  
  I think this is the future as well, running locally, controlling the entire pipeline. I built acf on github using Claude among others. You essentially configure everything as you want, models, profiles, agents and RAG. It's free. I also built a marketplace to sell or give away to the community these pipeline enhancements. It's a project I wanted to do for a while and Claude was nice to me allowing it to happen. It's a work in progress but you have 100% control, locally. There is also a website for those not as technical where you can buy credits or plug in Claude or OpenAI APIs. Read the manifesto. I need help now and contributors.
thunfischtoast 1 month ago
I've used the Anthropic models mostly through Openrouter using aider. With so much buzz around Claude Code I wanted to try it out and thought that a subscription might be more cost efficient for me. I was kinda disappointed by how quickly I hit the quota limit. Claude Code gives me a lot more freedom than what aider can do, on the other side I have the feeling that pure coding tasks work better through aider or Roo Code. The API version is also much much faster than the subscription one.
- aja12 1 month ago
  
  Being in the same boat as you I switched to OpenCode with z.ai GLM 4.7 Pro plan and it's quite ok. Not as smart as Opus but smart enough for my needs, and the pricing is unbeatable
  
rasmus1610 1 month ago

Very happy to see that I am not the only one. My pro subscription lasts maybe 30 minutes for the 5 hour limit. It is completely unusable and that's why I actually switched to OpenCode + GLM 4.7 for my personal projects. It's not as clever as Opus 4.5 but it often gets the job done anyway.

IgorPartola 1 month ago

You are giving me images from The Bug Short where the guy goes to investigate mortgages and knocks on some random person’s door to ask about a house/mortgage just to learn that it belongs to a dog. Imagine finding out that Anthropic employs no humans at all. Just an AI that has fired everyone and been working on its own releases and press releases since.

moring 1 month ago

"Just an AI that has fired everyone"
At least it did not turn against them physically... "get comfortable while I warm up the neurotoxin emitters"
smcin 1 month ago
'The Big Short' (2015)
- taneq 1 month ago
  
  So "The Bug Short" is still up for grabs if anyone wants to make a documentary about the end of the AI bubble? :D

sixtyj 1 month ago

They whistleblowed themselves that Claude Cowork was coded by Claude Code… :)

throwup238 1 month ago
You can tell they’re all vibe coded.
Claude iOS app, Claude on the web (including Claude Code on the web) and Claude Code are some of the buggiest tools I have ever had to use on a daily basis. I’m including monstrosities like Altium and Solidworks and Vivado in the mix - software that actually does real shit constrained by the laws of physics rather than slinging basic JSON and strings around over HTTP.
It’s an utter embarrassment to the field of software engineering that they can’t even beat a single nine of reliability in their consumer facing products and if it wasn’t for the advantage Opus has over other models, they’d be dead in the water.
- ilikeboobs 23 days ago
  
  The worst part is it's not getting better. It's getting even more unstable. They are the most unstable product, every 10 minutes is another bug, the same bugs that have existed the entire year I used it reported by hundreds of people. And every day is just, a new bug, never anything fixed. It just gets worse.
- cactusplant7374 1 month ago
  
  You're right.
  https://github.com/anthropics/claude-code/issues
  Codex has fewer, but they also had quite a few outages in December. And I don't think Codex is as popular as Claude Code but that could change.
  
- loopdoend 1 month ago
  
  Single nine reliability would be 90% uptime lol. For 99.9% we call it triple 9 reliability.
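
The "nines" arithmetic in that comment can be sketched as below. The function names are made up for illustration, and a 365-day year is assumed for the downtime figure.

```python
def availability_from_nines(nines):
    """Availability as a fraction: 1 nine is 0.9, 3 nines is 0.999."""
    if nines < 1:
        raise ValueError("nines must be at least 1")
    return 1 - 10 ** -nines


def downtime_hours_per_year(nines, hours_per_year=365 * 24):
    """Hours of allowed downtime per year at the given number of nines."""
    return hours_per_year * 10 ** -nines
```

One nine allows roughly 876 hours of downtime a year, while three nines allows under 9 hours.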
  
- 0x500x79 1 month ago
  
  Even their status page (and status pages are usually gamed) shows two 9s over the past 90 days.
- fizx 1 month ago
  
  hey, they have 9 8's
notsure2 1 month ago
Whistleblowed dog food.
- b00ty4breakfast 1 month ago
  
  normally you don't share your dog food when you find out it actually sucks.

threecheese 1 month ago

We’re an Anthropic enterprise customer, and somehow there’s a human developer of theirs on a call with us just about every week. Chatting, tips and tricks etc.

I think they are just focusing on where the dough is.

draw_down 1 month ago

[dead]

cyanydeez 1 month ago

I think your surmise is probably wrong. It's not that they're growing too fast, it's that their service is cheaper than the actual cost of doing business.

Growth isn't a problem unless you don't actually cover the cost of every user you subscribe. Uber, but for poorly profitable business models.

oblio 1 month ago
Interesting comparison, Uber.
> Since its founding in 2009, Uber has incurred a cumulative net loss of approximately $10.9 billion.
Now, Uber has become profitable, and will probably become a bit more profitable over time.
But except for speculators and probably a handful of early shareholders, Uber will have lost everyone else money for 20 years since its founding.
For comparison, Lyft, Didi, Grab, Bolt are in the same boat, most of them are barely turning profitable after 10+ years. Turns out taxis are a hard business, even when you ramp up the scale to 11. Though they might become profitable over the long term and we'll all get even worse and more abusive service, and probably more expensive than regular taxis would have been, 15-20 years from now.
I mean, we got some better mobile apps from taxi services, so there's that.
Oh, also a massive erosion of labor rights around the world.
- cyanydeez 1 month ago
  
  I suppose my comparison is that Uber eventually turned a profit and mostly displaced the competitors.
  I don't see the current investments turning a profit. Maybe the datacenters will, but most of AI is going to be washed out when somewhere, someone wants to take out their investment and the new Bernie Madoff can't find another sucker.

Bombthecat 1 month ago

Well, they vibe code almost every tool at least

tuhgdetzhh 1 month ago
Claude Code has accumulated so much technical debt (+emojis) that Claude Code can no longer code itself.
- behnamoh 1 month ago
  
  yeah, and it gets so clunky and laggy when the context grows. Anthropic just can't make software and yet they claim 90% of code will be written by AI by yesterday.
- wwweston 1 month ago
  
  What’s the opposite of bootstrapping? Stakebooting?
  