Comment by denkmoon

25 days ago

I use $30 a day to produce a decent amount of code. Certainly more than we need - thinking about/designing the correct solution/distilling requirements is still the bottleneck. How can you possibly even review $300/day worth of output?

It doesn’t have to be $300/day worth of output tokens. It could be like $290/day worth of input tokens to teach both you and the model about the problem you are solving and then $10/day worth of output tokens.

  • And what about you knowing the problem and the solution, but are just worrying about the impact downstream. Most of my time is spent managing those. I know the exact code to be published. And some time I already have it committed in my local branch. Then you need to make everyone aware of what it entails and that's usually how you can spend days on a simple bug or a change request.

    Software is a big graph of interlocked rules. And if you can grasp the whole or the part you own (and you should be able to), it's often very easy to see the control points. You don't have a coding bottleneck anymore, you have a communication bottleneck[0]. Which is an organizational issue, not anything relevant to engineering.

    [0]: See Naur's Programming as Theory Building and Brooke's Mythical Man Month.

  • If you give it $290 of input tokens for $10 of output tokens, you are doing something wrong. I.e. you paste the whole CI output into the prompt instead of giving it a link to the file, and then the AI greps its way through it (using a fraction of the tokens).

    Sometimes AI overdoes things and it re-runs the whole testsuite because the tail command didn't have enough lines, but the other way round messes up the context so much so that in the end all that context is useless.

I used Claude about a week ago to do a pretty intensive refactoring. Cleanup, initial modularisation, beginnings of a test suite, and better isolated build. In a span of couple of hours, and over a sequence of 20+ new commits, I burned a hair over $100 in tokens.

If you are working on a seriously large legacy code base, I can see how you'd get to >$250 on a bad day.

If you build your own reviewer layer/tool it will burn a ton of tokens. Millions of tokens of input.

  • You, review bots and first pass bots can chew through tokens. Also if you haven't put effort into your harnesses, the agent will have to spend more time and tokens figuring things out again and again