Comment by wrxd
25 days ago
I’m at a FAANG and we have a $300/day token quota. Personally I don’t use that much of it, but management is pushing really hard for it: “the quota has been raised for a reason, use it”. Any task: “have you tried working on it with Claude?”. Every meeting: “now engineers x and y will show you what they did with AI”.
It’s not all useless, but most days I think I would be more productive if some processes were streamlined rather than if I had to throw tokens at them and still fail.
Of all the showcases I’ve seen the best are the ones written by people assuming that the token bonanza will not last so they used AI to build tools they wished they had. AI used to build the tool but by no means used by the tool, so if/when token quota gets reduced we still have a functional tool.
300 a day?? 7K dollars a month? No wonder they need to lay people off!
At Nvidia, we have no limit for Anthropic or OpenAI models (for now) and are heavily encouraged to use them as much as possible.
The fact that they've started promoting using the Caveman mode tells me that the unlimited usage policy is taking its toll.
Please don’t tell me you’re writing RTL
I alone now use more than 15 accounts across all providers, plus API access for external providers: more than $50K equivalent a month in API tokens. My team is doing the same thing. It's not really that much once you've figured out the real automation loops and workflows; solving 300 issues a day with guarantees is common.
I feel that a lot of users are still stuck on Claude Code or tools like it and don't really have an argument about why they are even following the thread at all. Everything has to be async for serious automation. You shouldn't even be seeing what Claude or any other model is replying (everything has to be digested by another model to increase the relevance and accuracy of the message so you can read faster, like a bot); it's irrelevant. Only put a human in the loop when a decision must be made. The rest has to be loops across all models: typical e2e, regression, computer-use tests, video split into frames and fed into an all-model loop, and so on.
That's interesting. What is the input into the process? Don't you need a PRD or a requirement doc to start with?
> No wonder they need to lay people off!
He clearly works at Apple, and they aren't laying people off.
I'm not aware of a limit in my current role. There is, however, a leaderboard.
Well, presumably (hopefully) they aren't expected to work weekends.
No days off for the agents.
Yes, the cost of AI is a big contributing factor.
The unsubsidised costs can't be revealed soon enough.
That’s funny, I’ve been doing that too.
Trying to crank out all the tools I never had time to build, because I think we’re going to get cut off eventually.
This seems seductive, but how do you get past the wall of "fixing XYZ or adding convenience ABC isn't on our pre-planned roadmap", where you can't get buy-in from the people who have to sign off or deploy stuff?
Maybe that type of awkwardness is specific to my firm, but that's sort of what killed my drive to try to do that. We used to have one day every second week for that sort of work, but since it was scattered around, the tasks ended up disappearing: nobody reviewed them and they didn't get merged.
So now they're trying to do a week-long internal hackathon to recover that vision, but I feel like that's going to produce a handful of big-bang ideas and not the 25 tiny tools that would actually streamline things.
Same. I've used it for debugging failed canary tests, which required scripts and very specific knowledge of the canary platform that I wouldn't have ever spent time on.
I also have scripts to fetch specific database assets and forward them to slack channels so I can easily share them with a group rather than manually running a query and generating them.
I had a theory about improving a product. I asked it to build an offline simulation setup to try various implementations. The results were a bit fishy, but I decided to give it a try, and A/B testing is showing similar results.
And now I'm vibecoding a locally hosted dashboard. This one is less useful for anything specific and more of a minor quality-of-life improvement, but it's fun to just vibe code and see changes happen occasionally. It's not a critical thing.
I find it very useful for debugging tasks like that but it always ends up costing me like $3 despite doing incredible work. And then one of the other engineers at my company will rack up like $200 in tokens in one day producing tens of thousands of SLOC and we end up actually shipping about the same stuff. Sometimes I wonder if it's bad agent use discipline (just pointing it at massive codebases and having it read it all from scratch each time) and sometimes I wonder if they're just using it for personal projects. Because none of that code seems to land in prod, and I've found that cranking out 10s of thousands of SLOCs at a time is a recipe for a mess.
I don't think we will. I think this level of token cost/availability will trend cheaper and faster, long term. These companies that spent too big and too fast might try to limit it and raise the prices and they might be temporarily successful but they'll very quickly be taken over if they keep doing it.
May I ask what tools did you make so far? And what is on your roadmap?
Not OP, but a very simple example: I use AI to review my work before opening a PR for my colleagues to review. I ask it to review the commits in my branch. Instead of consuming tokens just to instruct it how to use git operations and other tools to find the commits since the base commit, I asked AI to create a little bash script to make patch files commit1.patch, commit2.patch, commit3.patch, etc, for all the commits in my branch since the base commit. Now I just use this script to prepare the context of commits to review.
I feel like an imposter here, I’m definitely not using AI as much as it seems everyone is :( I can’t imagine using hundreds of dollars of tokens a day. But maybe this little tip for reviews might be helpful to someone.
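A helper like the one described above could be sketched roughly as follows. This is a minimal Python sketch, not the commenter's actual bash script: the function names (`list_commits`, `patch_name`, `write_patches`) are made up for illustration, and it assumes `git` is on the PATH and that you run it inside the repository. It only uses `git rev-list` and `git format-patch`.

```python
import subprocess
from pathlib import Path


def list_commits(base: str, head: str = "HEAD") -> list[str]:
    """Return the hashes of commits in base..head, oldest first."""
    result = subprocess.run(
        ["git", "rev-list", "--reverse", f"{base}..{head}"],
        check=True, capture_output=True, text=True,
    )
    return result.stdout.split()


def patch_name(index: int) -> str:
    """File name for the index-th commit (1-based): commit1.patch, commit2.patch, ..."""
    return f"commit{index}.patch"


def write_patches(base: str, out_dir: str = ".") -> list[Path]:
    """Write one patch file per commit since `base` and return the paths."""
    out = Path(out_dir)
    out.mkdir(parents=True, exist_ok=True)
    paths = []
    for index, sha in enumerate(list_commits(base), start=1):
        # `format-patch -1 <sha> --stdout` prints the patch for that single commit.
        patch = subprocess.run(
            ["git", "format-patch", "-1", "--stdout", sha],
            check=True, capture_output=True, text=True,
        ).stdout
        path = out / patch_name(index)
        path.write_text(patch)
        paths.append(path)
    return paths
```

Calling `write_patches("main", "review_context")` would then leave the numbered patch files in `review_context/`, ready to hand to the model. Merge commits produce empty patches with this approach, so a real script may want to skip them.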
Not OP, but I made a tool to convert Microsoft OneNote notes to Obsidian canvas and Markdown. First it used a Python lib, which was too limiting. Then it used the Windows API to plug into OneNote and read the doc in its original XML form. That made the conversion correct and fully featured.
Not OP, but I've been focusing on linting and automation.
Custom lint rules to encode best practices that previously relied on an astute/alert code reviewer to call attention to. This is handy not just for humans; it steers the bots too. Or turning on some existing rule that required a big cleanup/migration to be compliant with. Now I just throw an LLM at it, since those are often laborious but mechanical changes, which is the sweet spot for an LLM.
Also automating everything I can. That annoying release process that everyone hates but wasn't quite long/arduous enough to justify the time before? It's now automated. GitHub workflows for all the things.
This kind of stuff will forever be useful, even if the bottom drops out and the bubble bursts. And none of it is reliant on AI to run.
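For a sense of what a small custom lint rule can look like, here is a minimal sketch using Python's standard `ast` module. The rule itself (flagging bare `except:` clauses) and the function name `find_bare_excepts` are only illustrative assumptions; the comment above doesn't say which rules or which linter were used.

```python
import ast


def find_bare_excepts(source: str) -> list[int]:
    """Return the line numbers of bare `except:` clauses in the given source."""
    tree = ast.parse(source)
    return sorted(
        node.lineno
        for node in ast.walk(tree)
        # An ExceptHandler with no exception type is a bare `except:`.
        if isinstance(node, ast.ExceptHandler) and node.type is None
    )


sample = "try:\n    risky()\nexcept:\n    pass\n"
print(find_bare_excepts(sample))
```

Once a check like this exists, it runs in CI with no model involved, which is the point being made: the AI helps write the rule but isn't needed to run it.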
"AI used to build the tool but by no means used by the tool" is a really good way to put it. Feels like the smart play right now is treating these credits as temporary subsidy and building stuff that still works when the bill comes due.
Seems like people are spending more time building tools than doing actual work. Lots of overlap too
In all fairness, doing actual work is not what companies are prioritizing right now.
It is fairly easy to tokenmax by having an inefficient automation setup.
Not something I would do personally. But it is surprisingly easy to set up a claw that eats half of your token budget in a meaningless "research" task. Set it up as a cron job and you will soon be promoted for being an AI visionary.
Innovation signalling.
> $300/day token quota
Are companies using per-token billing? Why? Is there some reason they can’t buy the $200/mo Claude plan for every employee?
The $200/mo Claude plan is not available for every employee. You can buy the $100/mo plan for up to 150 people, and then you have to switch to API billing.
Max 20x is for individuals only. (They could probably have emps get it themselves and reimburse.)
If they do individual billing, the business doesn't get token reporting.
> could probably have emps get it themselves, and reimburse
They can’t track token use this way. Also it’s a massive violation of the model providers' TOS.
Most startups do this (multiple accounts per employee).
Those plans are going the way of the dinosaur; the AI providers lose money on them. Most enterprise offerings are already there: Anthropic changed theirs to $20/seat plus token usage a couple of weeks back.
I’m curious which FAANG is actually doing per-token billing? I’m guessing not Google or Amazon (since my wife and I aren’t aware of that).
Compliance
I'm pretty sure with AI there is nothing that complies with anything.
Starting with the fact that the whole industry is based on copyright infringement.
How do you even use that much daily?
I have an unrelated question, please. I am trying to make a post and get this error: "Sorry, your account isn't able to submit this site." Do you know why, or have a solution for it?
>we have $300/day token quota.
Unless another FAANG has the exact same amount, this is going to be Apple.
And no wonder the quality of Apple software has gone downhill.
Apple used to be very conservative in software development and design. BSD-like. Especially at the lower end of the stack.
Now it is no different to other Silicon Valley companies.
Also at a FAANG here. Surprised you don't manage to use $300 in a whole day. It's almost trivial to productively use that much in under an hour.
Leadership is not being dumb, at least on this topic. If your token usage is that low, you just aren't using AI that much (even if you think you are.)
I use $30 a day to produce a decent amount of code. Certainly more than we need - thinking about/designing the correct solution/distilling requirements is still the bottleneck. How can you possibly even review $300/day worth of output?
It doesn’t have to be $300/day worth of output tokens. It could be like $290/day worth of input tokens to teach both you and the model about the problem you are solving and then $10/day worth of output tokens.
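To make the input/output split concrete, here is a rough cost sketch. The per-million-token prices used in the example are placeholder assumptions chosen to keep the arithmetic easy, not any provider's actual rates, and `token_cost` is a hypothetical helper.

```python
def token_cost(
    input_tokens: int,
    output_tokens: int,
    input_price_per_million: float,
    output_price_per_million: float,
) -> float:
    """Total cost in dollars for a given number of input and output tokens."""
    input_cost = input_tokens * input_price_per_million / 1_000_000
    output_cost = output_tokens * output_price_per_million / 1_000_000
    return input_cost + output_cost


# Assumed prices: $5 per million input tokens, $25 per million output tokens.
# Under those assumptions, 58M input tokens cost $290 and 400K output tokens cost $10.
daily = token_cost(58_000_000, 400_000, 5.0, 25.0)
print(f"${daily:.2f}")
```

Under those assumed prices, a day that reads tens of millions of tokens of context but writes relatively little code still lands at $300, which is the scenario the comment describes.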
I used Claude about a week ago to do a pretty intensive refactoring. Cleanup, initial modularisation, beginnings of a test suite, and a better-isolated build. In the span of a couple of hours, and over a sequence of 20+ new commits, I burned a hair over $100 in tokens.
If you are working on a seriously large legacy code base, I can see how you'd get to >$250 on a bad day.
If you build your own reviewer layer/tool it will burn a ton of tokens. Millions of tokens of input.
Use expensive models at high effort.
Also you, regarding Claude usage limits:
> Before the doomers come in, you get $200 in API credits every month for claude -p usage. Usage counts against those API credits.
So which is it: $300/day is trivial to consume, or $200/month is a completely reasonable limit? It can't be both.
Do you even realize how insane your comment is?
"If you aren't donating at least your salary's worth of company money to another company every day, are you even working?"
Used to think exactly like you. That's why I know you all will "get it" eventually. Most companies and orgs are just so far behind the curve.
Wouldn't they save an enormous amount of money by getting rid of either you and the token quota, or a bunch of other people to continue paying your salary plus this insane quota?
If you are burning through $2400 a day, you’re just wasting tokens on idiotic tasks.
He's rewriting Bun from Rust to Python now.
How are you able to get to $300/hr productively? (I’m assuming this isn’t fast mode tax).
Not hard, massive Elo stuff. Every decision point needs to think up and implement 25 ideas and then rank them.
I am glad I am not on your team; the amount of slop they have to deal with coming from you must be overwhelming.
How? I struggle to use the 1000 Kiro tokens I get a month, and that only costs $20. And I use it more than anyone else on my team. Maybe we're just massively behind?
$300 an hour, that's insane.
Not really, if you use the most expensive models and you have a large codebase stuffed into the context window.
You must be using a really bad harness or just writing very vague prompts. 20 million tokens is a lot.