Comment by pookieinc
5 months ago
The biggest complaint I (and several others) have is that we continuously hit the limit via the UI after even just a few intensive queries. Of course, we can use the console API, but then we lose the ability to use things like Projects, etc.
Do you foresee these limitations increasing anytime soon?
Quick Edit: Just wanted to also say thank you for all your hard work, Claude has been phenomenal.
We are definitely aware of this (and working on it for the web UI), and that's why Claude Code goes directly through the API!
I'm sure many of us would gladly pay more to get 3-5x the limit.
And I'm also sure that you're working on it, but some kind of auto-summarization of facts to reduce the context in order to avoid penalizing long threads would be sweet.
I don't know if your internal users are dogfooding the product with user limits, so you may not have had this feedback: it makes me irritable/stressed to know that I'm running up against the limit without having gotten to the bottom of a bug. I don't think a stress response in your users is a desirable thing :).
This is the main point I always want to communicate to the teams building foundation models.
A lot of people just want the ability to pay more in order to get more.
I would gladly pay 10x more to get relatively modest increases in performance. That is how important the intelligence is.
4 replies →
[flagged]
The problem with the API is that, as the documentation notes, it could cost $100/hr.
I would pay $50/mo or something to be able to have reasonable use of Claude Code in a limited (but not as limited) way as through the web UI, but all of these coding tools seem to work only with the API and are therefore either too expensive or too limited.
> The problem with the API is that, as the documentation notes, it could cost $100/hr.
I've used https://github.com/cline/cline to get a similar workflow to their Claude Code demo, and yes it's amazing how quickly the token counts add up. Claude seems to have capacity issues so I'm guessing they decided to charge a premium for what they can serve up.
+1 on the too expensive or too limited sentiment. I subscribed to Claude for quite a while but got frustrated the few times I would use it heavily I'd get stuck due to the rate limits.
I could stomach a $20-$50 subscription for something like 3.7 that I could use a lot when coding, and not worry about hitting limits (or I suspect being pushed on to a quantized/smaller model when used too much).
1 reply →
I haven't been able to find the Claude CLI available for public access yet. Would love to use it.
    npm install -g @anthropic-ai/claude-code
    claude
see https://docs.anthropic.com/en/docs/agents-and-tools/claude-c...
I paid for it for a while, but I kept running out of usage limits right in the middle of work every day. I'd end up pasting the context into ChatGPT to continue. It was so frustrating, especially because I really liked it and used it a lot.
It became such an anti-pattern that I stopped paying. Now, when people ask me which one to use, I always say I like Claude more than others, but I don’t recommend using it in a professional setting.
I have substantial usage via their API using LibreChat and have never run into rate limits. Why not just use that?
That sounds more expensive than the £18/mo Claude Pro costs?
1 reply →
Same.
If you are open to alternatives, try https://glama.ai/gateway
We currently serve ~10bn tokens per day (across all models). OpenAI-compatible API. No rate limits. Built-in logging and tracing.
I work with LLMs every day, so I am always on top of adding models. 3.7 is also already available.
https://glama.ai/models/claude-3-7-sonnet-20250219
The gateway is integrated directly into our chat (https://glama.ai/chat). So you can use most of the things that you are used to having with Claude. And if anything is missing, just let me know and I will prioritize it. If you check our Discord, I have a decent track record of being receptive to feedback and quickly turning around features.
Long term, Glama's focus is predominantly on MCPs, but chat, gateway and LLM routing is integral to the greater vision.
I would love feedback if you are going to give it a try: frank@glama.ai
The issue isn't API limits, but web UI limits. We can always get around the web interface's limits by using the Claude API directly, but then you need some other interface...
The API still has limits. Even if you are on the highest tier, you will quickly run into those limits when using coding assistants.
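When you do hit those API limits, it typically surfaces as HTTP 429 responses, which clients work around with retry-and-backoff logic. A minimal sketch of that pattern (the `call_api` stub and `RateLimitError` here are illustrative stand-ins, not a real SDK):

```python
import time

class RateLimitError(Exception):
    """Stand-in for a client library's HTTP 429 error."""

def with_backoff(call, max_retries=5, base_delay=1.0, sleep=time.sleep):
    """Retry `call` on rate-limit errors with exponential backoff."""
    for attempt in range(max_retries):
        try:
            return call()
        except RateLimitError:
            if attempt == max_retries - 1:
                raise
            sleep(base_delay * (2 ** attempt))  # 1s, 2s, 4s, ...

# Stub demonstrating the flow: fails twice, then succeeds on the third try.
attempts = []
def call_api():
    attempts.append(1)
    if len(attempts) < 3:
        raise RateLimitError("429 Too Many Requests")
    return "ok"

result = with_backoff(call_api, sleep=lambda s: None)  # result == "ok"
```

Backoff only smooths over brief throttling, though; it doesn't raise the underlying quota, which is why heavy coding-assistant use still runs into the wall.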
The value proposition of Glama is that it combines UI and API.
While everyone focuses on either one or the other, I've been splitting my time equally working on both.
Glama's UI would not win against Anthropic's if we compared feature counts. However, the components I developed were created with craft and love.
You have access to:
* Switch models between OpenAI/Anthropic, etc.
* Side-by-side conversations
* Full-text search of all your conversations
* Integration of LaTeX, Mermaid, rich-text editing
* Vision (uploading images)
* Response personalizations
* MCP
* Every action has a shortcut via cmd+k (ctrl+k)
5 replies →
Just tried it. Is there a reason the web UI is so slow?
Try deleting (closing) the right-hand panel in a side-by-side view. It took a good second to actually close, and creating one isn't much faster.
This is unbearably slow, to be blunt.
Who is glama.ai, though? I could not find company info on the site, and the "Frank" writing the blog posts seems to be an alias for Popeye the Sailor. Am I missing something? How can a user vet the company?
Do you have deepseek r1 support? I need it for a current product I’m working on.
Indeed we do https://glama.ai/models/deepseek-r1
It is provided by DeepSeek and Avian.
I am also midway through enabling a third provider (Nebius).
You can see all models/providers over at https://glama.ai/models
As another commenter in this thread said, we are just a 'frontend wrapper' around other people's services. Therefore, it is not particularly difficult to add models that are already supported by other providers.
The benefit of using our wrapper is that you use a single API key and get one bill for all your AI usage; you don't need to hack together your own logic for routing requests between providers, handling failovers, tracking costs, or worrying about what happens if a provider goes down.
The market at the moment is hugely fragmented, with many providers unstable, constantly shifting prices, etc. The benefit of a router is that you don't need to worry about those things.
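To make that concrete, here is roughly the failover logic each client would otherwise have to hand-roll. This is a sketch only: the provider names and the `call` interface are hypothetical, standing in for real API clients.

```python
class ProviderError(Exception):
    """Raised when a provider call fails (rate limit, outage, etc.)."""

def complete_with_failover(prompt, providers):
    """Try providers in order; return (name, result) from the first success."""
    errors = []
    for name, call in providers:
        try:
            return name, call(prompt)
        except ProviderError as exc:
            errors.append((name, str(exc)))  # note the failure, try the next
    raise ProviderError(f"all providers failed: {errors}")

# Stub providers standing in for real API clients:
def flaky(prompt):
    raise ProviderError("rate limited")

def stable(prompt):
    return f"echo: {prompt}"

provider, result = complete_with_failover("hi", [("a", flaky), ("b", stable)])
# provider == "b", result == "echo: hi"
```

A gateway moves this loop (plus cost tracking and price-change churn) behind one endpoint, which is the whole pitch.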
2 replies →
They are just selling a frontend wrapper on other people's services, so if someone else offers deepseek, I'm sure they will integrate it.
I see Cohere, is there any support for in-line citations like you can get with their first party API?
This is also my problem. I've only used the UI with the $20 subscription; can I use the same subscription for the CLI? I'm afraid it's like AWS API billing, where there's no limit on how much I can use and I get a surprise bill.
It is API billing like AWS - you pay for what you use. Every time you exit a session we print the cost, and in the middle of a session you can do /cost to see your cost so far that session!
You can track costs in a few ways and set spend limits to avoid surprises: https://docs.anthropic.com/en/docs/agents-and-tools/claude-c...
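For a back-of-the-envelope sense of what pay-per-token spend looks like, here is a tiny estimator. The per-million-token prices are assumptions based on commonly listed Sonnet pricing, not figures from this thread; check current rates before relying on them.

```python
# Rough per-session cost estimate. The prices below are assumptions
# (commonly listed Sonnet pricing); verify against current rates.
INPUT_PRICE = 3.00    # USD per 1M input tokens (assumed)
OUTPUT_PRICE = 15.00  # USD per 1M output tokens (assumed)

def session_cost(input_tokens, output_tokens):
    """Estimate USD cost of a session from its token counts."""
    return (input_tokens / 1e6) * INPUT_PRICE + (output_tokens / 1e6) * OUTPUT_PRICE

# An agentic coding session re-sends large context on every turn, so input
# tokens dominate: e.g. 2M input + 100k output tokens.
print(f"${session_cost(2_000_000, 100_000):.2f}")  # prints $7.50
```

The re-sent context is why token counts climb so much faster in coding sessions than in chat.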
What I really want (as a current Pro subscriber) is a subscription tier ("Ultimate" at ~$120/month?) that gives me priority access to the usual chat interface, but _also_ a bundle of API credits that would ensure Claude and I can code together for most of the average working month (a reasonable estimate: 4 hours a day, 15 days a month).
i.e. I'd like my chat and API usage to all be included under a flat-rate subscription.
Currently, Pro doesn't give me any API credits to use with coding assistants (Claude Code included?), which is completely disjointed. And I still need to be a business to use the API?
Honestly, Claude is so good; just please take my money and make it easy to do the above!
7 replies →
Which is theoretically great, but if anyone can get an Aussie credit card to work, please let me know.
2 replies →
I use AnythingLLM, so you can still have a "Projects"-like RAG setup.