Comment by einrealist

10 hours ago

> It's so interesting to watch an agent relentlessly work at something. They never get tired, they never get demoralized, they just keep going and trying things where a person would have given up long ago to fight another day. It's a "feel the AGI" moment to watch it struggle with something for a long time just to come out victorious 30 minutes later.

Somewhere, there are GPUs/NPUs running hot. You send all the necessary data, including information that you would never otherwise share. And you most likely do not pay the actual costs. It might become cheaper or it might not, because reasoning is a sticking plaster on the accuracy problem. You and your business become dependent on this major gatekeeper. It may seem like a good trade-off today. However, the personal, professional, political and societal issues will become increasingly difficult to overlook.

This quote stuck out to me as well, for a slightly different reason.

The “tenacity” referenced here has been, in my opinion, the key ingredient in the secret sauce of a successful career in tech, at least in these past 20 years. Every industry job has its intricacies, but for every engineer who earned their pay with novel work on a new protocol, framework, or paradigm, there were 10 or more providing value by putting the myriad pieces together, muddling through the ever-waxing complexity, and crucially never saying die.

We all saw others weeded out along the way for lacking the tenacity. Think the boot camp dropouts or undergrads who changed majors when first grappling with recursion (or emacs). The sole trait of stubbornness to “keep going” outweighs analytical ability, leetcode prowess, soft skills like corporate political tact, and everything else.

I can’t tell what this means for the job market. Tenacity may not be enough on its own. But it’s the most valuable quality in an employee in my mind, and Claude has it.

  • There is an old saying back home: an idiot never tires, only sweats.

    Claude isn't tenacious. It is an idiot that never stops digging because it lacks the metacognition to ask 'hey, is there a better way to do this?'. Chain of thought's whole raison d'être was to let the model get out of the local minima it pushed itself into. The issue is that after a year it still falls into slightly deeper local minima.

    This is fine when a human is in the loop. It isn't what you want when you have a thousand idiots each doing a depth-first search on the limit of your credit card.

    • > it lacks the meta cognition to ask 'hey, is there a better way to do this?'.

      Recently had an AI tell me the code (that it wrote) was a mess and suggest wiping it and starting from scratch with a more structured plan. That seems to hint at at least the outlines of metacognition.

    • I mean, not always. I've seen Claude step back and reconsider things after hitting a dead end, and go down a different path. There are also workflows and loops that can increase the likelihood of this happening.

  • This is a major concern for junior programmers. Many senior ones, after 20 (or even 10) years of tenacious work, realize that such work will always be there and that they long ago stopped growing on that front (i.e. they had already peaked). For those folks, LLMs are a lifesaver.

    At a company I worked for, lots of senior engineers became managers because they no longer wanted to obsess over whether their algorithm has an off-by-one error. I think fewer will go the management route.

    (There was always the senior tech lead path, but there are far more roles in management than in tech leadership.)

    • I feel like if you're really spending a ton of time on off-by-one errors after twenty years in the field, you haven't actually grown much and have probably just spent a ton of time in a single space.

      Otherwise you'd be in the senior staff to principal range and doing architecture, mentorship, coordinating cross-team work, interviewing, evaluating technical decisions, etc.

      I got to code a bit this week and it's been a tremendous joy! I see many peers at similar, lower, and higher levels who have more years but less technical experience and still write lots of code, and I suspect that's more what you're talking about. In that case, it's not so much that you've peaked; it's that there's not much left to learn, you're doing a bunch of the same shit over and over, and that is of course tiring.

      I think it also means that everything you interact with outside your space feels much harder because of how infrequently you've interacted with it.

      If you've spent your whole career working the whole stack from interfaces to infrastructure, then there's really not going to be much that hits you as unfamiliar after a point. Most frameworks recycle the same concepts and abstractions; the same goes for programming languages, algorithms, data management, etc.

      But if you've spent most of your career in one space cranking tickets, those unknown corners are going to be as numerous as the day you started and much more taxing.

    • Imagine a senior dev who just approves PRs, approves production releases, and prioritizes bug reports and feature requests. An LLM watches for errors ceaselessly and reports an issue. The senior dev reviews the issue and assigns it a severity. Another LLM has a backlog of features and errors to solve; it makes a fix and submits a PR after running tests and verifying that things work on its end.

  • Why are we pretending like the need for tenacity will go away? Certain problems are easier now. We can tackle larger problems now that also require tenacity.

    • Even right now, when we have a high-tenacity AI, I'd argue that working with the AI (that is, doing AI coding itself and dealing with the novel challenges that brings) requires a lot of stubborn persistence.

I still find in these instances there's at least a 50% chance it has taken a shortcut somewhere: created a new, bigger bug in something that just happened not to have a unit test covering it, or broke an "implicit" requirement that was so obvious to any reasonable human that nobody thought to document it. These can be subtle because you're not looking for them, because no human would ever think to do such a thing.

Then even if you do catch it, AI: "ah, now I see exactly the problem. just insert a few more coins and I'll fix it for real this time, I promise!"

  • The value extortion plan writes itself. How long before someone pitches the idea of models that deliberately keep almost solving your problem to get you to keep spending? Would you even know?

    • This is the first time I've seen this idea, and I have a tingling feeling it might become reality sooner rather than later.

    • That’s far-fetched. It’s in the interest of the model builders to solve your problem as efficiently as possible token-wise. High value to user + lower compute costs = better pricing power and better margins overall.

    • I was thinking more of a deliberate backdoor in code. RCE is an obvious example, but another one could be bias. "I'm sorry ma'am, the computer says you are ineligible for a bank account." These ideas aren't new. They were already there in the 90s, when we still thought about privacy and accountability regarding technology, and dystopian novels described them long, long ago.

    • The free market proposition is that competition (especially with Chinese labs and Grok) means that Anthropic is welcome to do that. They're even welcome to illegally collude with OpenAI such that ChatGPT is similarly gimped. But switching costs are pretty low. If it turns out I can one-shot an issue with Qwen or DeepSeek or Kimi thinking, Anthropic loses not just my monthly subscription but also that of everyone else I show it to. So no, I think that's some grade-A conspiracy theory nonsense you've got there.

  • > These can be subtle because you're not looking for them

    After any agent run, I always look at the git diff between the new version and the previous one. This helps catch things that you might otherwise not notice.

    • And after manually coding I often have an LLM review the diff (a minimal sketch of that loop follows below). 90% of the problems it finds can be discounted, but it's still a net positive.

  • You are using it wrong, or are using a weak model if your failure rate is over 50%. My experience is nothing like this. It very consistently works for me. Maybe there is a <5% chance it takes the wrong approach, but you can quickly steer it in the right direction.
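
A minimal sketch of the diff-review loop described above, for the curious. This is my own illustration rather than anyone's actual setup: it assumes the `anthropic` Python SDK, an API key in the environment, and a placeholder model id.

    # Pipe the most recent diff to a model for a second-opinion review.
    import subprocess
    import anthropic

    # The diff produced by the last agent run (or your own manual commit).
    diff = subprocess.run(
        ["git", "diff", "HEAD~1"], capture_output=True, text=True, check=True
    ).stdout

    client = anthropic.Anthropic()  # reads ANTHROPIC_API_KEY from the environment
    review = client.messages.create(
        model="claude-sonnet-4-5",  # placeholder model id
        max_tokens=1024,
        messages=[{
            "role": "user",
            "content": "Review this diff for subtle regressions, deleted checks, "
                       "or behaviour changes not covered by tests:\n\n" + diff,
        }],
    )
    print(review.content[0].text)

As the comment above says, most of what it flags can be discounted, but it is a cheap second pass over exactly the kind of shortcut a tired reviewer misses.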

> It might become cheaper or it might not

If it does not, this is going to be the first technology in the history of mankind that has not become cheaper.

(But anyway, it already costs half of what it did last year.)

  • > But anyway, it already costs half compared to last year

    You could not have bought Claude Opus 4.5 at any price one year ago, I'm quite certain. The things that were available then now cost half of what they did, and there are new things available. These are both true.

    I'm agreeing with you, to be clear.

    There are two pieces I expect to continue: inference for existing models will continue to get cheaper. Models will continue to get better.

    Three things, actually.

    The "hitting a wall" / "plateau" people will continue to be loud and wrong. Just as they have been since 2018[0].

    [0]: https://blog.irvingwb.com/blog/2018/09/a-critical-appraisal-...

    • As a user of LLMs since GPT-3, I saw noticeable stagnation in LLM utility after the release of GPT-4. But it seems RLHF, tool calling, and the UI have all come together in the last 12 months. I used to wonder what fools could be finding them so useful as to claim a 10x multiplier, even as a user myself. These days I’m feeling more and more efficiency gains with Claude Code.

    • > The "hitting a wall" / "plateau" people will continue to be loud and wrong. Just as they have been since 2018[0].

      Everybody who bet against Moore's Law was wrong ... until they weren't.

      And AI is the reaction to Moore's Law having broken. Nobody gave one iota of a damn about trying to make programming easier until the chips couldn't double in speed anymore.

  • That's not true. Many technologies get more expensive over time, as labor gets more expensive or as certain skills fall by the wayside; not everything is mass market. Have you tried getting a grandfather clock repaired lately?

    • Repairing grandfather clocks isn't more expensive now because it's gotten any harder; it's because the popularity of grandfather clocks is basically nonexistent compared to any other way of telling time.

    • "repairing a unique clock" getting costlier doesn't mean technology hasn't gotten cheaper.

      check out whether clocks have gotten cheaper in general. the answer is that it has.

      there is no economy of scale here in repairing a single clock. its not relevant to bring it up here.

    • Instead of advancing tenuous examples, you could suggest a realistic mechanism by which costs could rise, such as a Chinese advance on Taiwan affecting TSMC, etc.

    • Time-keeping is vastly cheaper. People don't want grandfather clocks. They want to tell time. And they can, more accurately, more easily, and much more cheaply than their ancestors could.

    • No. You don't get to make "technology gets more expensive over time" statements for deprecated technologies.

      Getting a bespoke flintstone axe is also pretty expensive, and it also has absolutely no relevance to modern life.

      These discussions must, if they are to be useful, center on the population-level experience, not on unique personal moments.

  • Not true. Bitcoin has continued to rise in cost since its introduction (as in the aggregate cost incurred to run the network).

    LLMs will face their own challenges with respect to reducing costs, since self-attention cost grows quadratically with context length (a rough sketch of that scaling follows below). These are still early days, so there remains a lot of low-hanging fruit in terms of optimizations, but all of that becomes negligible in the face of quadratic attention.

  • There are plenty of technologies that have not become cheaper, or at least not cheap enough, to go big and change the world. You probably haven't heard of them because obviously they didn't succeed.
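
To make the quadratic-attention point above concrete, here is a rough sketch of how per-layer self-attention compute scales with context length. The head dimension and the constant factor are my own illustrative assumptions:

    # Per-layer self-attention cost for context length n, assumed head dim d.
    def attention_flops(n: int, d: int = 128) -> float:
        # The Q.K^T scores and the attention-weighted sum over V each cost
        # roughly n^2 * d multiply-adds, so the n^2 term dominates at long context.
        return 2.0 * n * n * d

    for n in (4_096, 32_768, 131_072):
        print(f"{n:>7} tokens -> {attention_flops(n):.2e} FLOPs")
    # Going from 4k to 128k context is 32x the tokens but ~1024x the attention compute.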

Humans run hot too. Once you factor in the supply chain that keeps us alive, things become surprisingly equivalent.

Eating burgers and driving cars around costs a lot more than whatever number of watts the human brain consumes.

> And you most likely do not pay the actual costs.

This is one of the weakest anti-AI postures. "It's a bubble and when the free VC money stops you'll be left with nothing." Like it's some kind of mystery how expensive these models are to run.

You have open-weight models right now like Kimi K2.5 and GLM 4.7. These are very strong models, only months behind the top labs, and they are not very expensive to run at scale. You can do the math. In fact, there are third parties serving these models for profit.

The money pit is training these models (and not even that much if you are efficient, like the Chinese labs). Once they are trained, they are served with large profit margins over the inference cost.

OpenAI and Anthropic are without a doubt selling their API for a lot more than the cost of running the model.

> Somewhere, there are GPUs/NPUs running hot.

Running at their designed temperature.

> You send all the necessary data, including information that you would never otherwise share.

I've never sent the type of data that isn't already either stored by GitHub or a cloud provider, so no difference there.

> And you most likely do not pay the actual costs.

So? Even if costs double once investor subsidies stop, that doesn't change much of anything. And the entire history of computing is that things tend to get cheaper.

> You and your business become dependent on this major gatekeeper.

Not really. Switching between Claude and Gemini or whatever new competition shows up is pretty easy. I'm no more dependent on it than I am on any of a hundred other business services or providers, which similarly mostly have competitors.

To me this tenacity is often like watching someone try to get a screw into a board using a hammer.

There’s often a better, faster way to do it, and while the hammer might reach the short-term goal eventually, it has often created some long-term problems along the way.

It is also amazing seeing the Linux kernel work, scheduling threads, handling interrupts and API calls, all without breaking a sweat or injuring its ACL.

With optimizations and new hardware, power is almost a negligible cost. You can get 5.5M tokens/s/MW[1] for Kimi K2 (≈20M tokens/kWh ≈ 181M tokens/$), which is roughly 400x cheaper than current pricing. It's just Nvidia/TSMC/other manufacturers eating up the profit now, because they can. My bet is that China will match current Nvidia within 5 years.

[1]: https://developer-blogs.nvidia.com/wp-content/uploads/2026/0...
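
Spelling out the arithmetic behind those numbers, since it goes by quickly. The electricity price and the ~$2/M-token comparison price are my assumptions; the throughput figure is the one quoted above:

    # Back-of-envelope energy cost per token from the quoted 5.5M tokens/s/MW.
    tokens_per_s_per_mw = 5.5e6                            # quoted throughput
    tokens_per_kwh = tokens_per_s_per_mw * 3600 / 1000     # ~19.8M tokens/kWh
    price_per_kwh = 0.11                                   # assumed $/kWh
    tokens_per_dollar = tokens_per_kwh / price_per_kwh     # ~180M tokens/$

    energy_cost_per_m_tokens = 1e6 / tokens_per_dollar     # ~$0.0055 per 1M tokens
    api_price_per_m_tokens = 2.2                           # assumed current $ per 1M output tokens
    print(api_price_per_m_tokens / energy_cost_per_m_tokens)  # ~400x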

  • Electricity is negligible, but the dominant cost is the hardware depreciation itself. Also, inference is typically memory-bandwidth bound, so you are limited by how fast you can move weights rather than by raw compute efficiency (a back-of-envelope sketch follows below).

    • Yes, because the hardware margin is something like 80% for Nvidia, and 80% again for manufacturers like Samsung and TSMC. Once fixed costs like R&D are amortized, the same node technology and hardware capacity could cost just a few single-digit percent of what it does today.
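
A back-of-envelope sketch of the bandwidth point above. The active parameter count, weight precision, and bandwidth figure are my assumptions (roughly a large MoE model on an H100-class accelerator):

    # Low-batch decoding is memory-bandwidth bound: every generated token has to
    # stream the active weights from HBM at least once.
    active_params = 32e9        # assumed active parameters per token (MoE)
    bytes_per_param = 1         # assumed 8-bit quantized weights
    hbm_bandwidth = 3.35e12     # bytes/s, roughly one H100-class part

    bytes_per_token = active_params * bytes_per_param
    print(hbm_bandwidth / bytes_per_token)  # ~105 tokens/s at batch size 1

    # Larger batches reuse each weight read across many requests, which is why
    # served cost per token depends far more on batch size than on raw FLOPs.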