Comment by einrealist

10 hours ago

> It's so interesting to watch an agent relentlessly work at something. They never get tired, they never get demoralized, they just keep going and trying things where a person would have given up long ago to fight another day. It's a "feel the AGI" moment to watch it struggle with something for a long time just to come out victorious 30 minutes later.

Somewhere, there are GPUs/NPUs running hot. You send all the necessary data, including information that you would never otherwise share. And you most likely do not pay the actual costs. It might become cheaper or it might not, because reasoning is a sticking plaster on the accuracy problem. You and your business become dependent on this major gatekeeper. It may seem like a good trade-off today. However, the personal, professional, political and societal issues will become increasingly difficult to overlook.

This quote stuck out to me as well, for a slightly different reason.

The “tenacity” referenced here has been, in my opinion, the key ingredient in the secret sauce of a successful career in tech, at least in these past 20 years. Every industry job has its intricacies, but for every engineer who earned their pay with novel work on a new protocol, framework, or paradigm, there were 10 or more providing value by putting the myriad pieces together, muddling through the ever-waxing complexity, and crucially never saying die.

We all saw others weeded out along the way for lacking the tenacity. Think the boot camp dropouts or undergrads who changed majors when first grappling with recursion (or emacs). The sole trait of stubbornness to “keep going” outweighs analytical ability, leetcode prowess, soft skills like corporate political tact, and everything else.

I can’t tell what this means for the job market. Tenacity may not be enough on its own. But it’s the most valuable quality in an employee in my mind, and Claude has it.

  • There is an old saying back home: an idiot never tires, only sweats.

    Claude isn't tenacious. It is an idiot that never stops digging because it lacks the metacognition to ask 'hey, is there a better way to do this?'. Chain of thought's whole raison d'être was to let the model get out of the local minima it pushed itself into. The issue is that after a year it still falls into slightly deeper local minima.

    This is fine when a human is in the loop. It isn't what you want when you have a thousand idiots each doing a depth-first search on the limit of your credit card.

    • > it lacks the meta cognition to ask 'hey, is there a better way to do this?'.

      Recently had an AI tell me the code (that it wrote) was a mess and suggest wiping it and starting from scratch with a more structured plan. That seems to hint at at least the outlines of metacognition.

    • I mean, not always. I've seen Claude step back and reconsider things after hitting a dead end, and go down a different path. There are also workflows and loops that can increase the likelihood of this happening.

  • This is a major concern for junior programmers. Many senior ones, after 20 (or even 10) years of tenacious work, realize that such work will always be there and that they long ago stopped growing on that front (i.e. they had already peaked). For those folks, LLMs are a lifesaver.

    At a company I worked for, lots of senior engineers became managers because they no longer wanted to obsess over whether their algorithm has an off-by-one error. I think fewer will go the management route.

    (There was always the senior tech lead path, but there are far more roles in management than in tech leadership.)

    • I feel like if you're really spending a ton of time on off-by-one errors after twenty years in the field, you haven't actually grown much and have probably just spent a ton of time in a single space.

      Otherwise you'd be in the senior staff to principal range and doing architecture, mentorship, coordinating cross-team work, interviewing, evaluating technical decisions, etc.

      I got to code a bit this week and it's been a tremendous joy! I see many peers at similar, lower, and higher levels who have more years but less technical experience and still write lots of code, and I suspect that's more what you're talking about. In that case, it's not so much that you've peaked; it's that there's not much left to learn, you're doing a bunch of the same shit over and over, and that is of course tiring.

      I think it also means that everything you interact with outside your space feels much harder because of how infrequently you've interacted with it.

      If you've spent your whole career working the whole stack from interfaces to infrastructure, then there's really not going to be much that hits you as unfamiliar after a point. Most frameworks recycle the same concepts and abstractions; the same goes for programming languages, algorithms, data management, etc.

      But if you've spent most of your career in one space cranking tickets, those unknown corners are going to be as numerous as the day you started and much more taxing.

    • Imagine a senior dev who just approves PRs, approves production releases, and prioritizes bug reports and feature requests. An LLM watches for errors ceaselessly and reports an issue. The senior dev reviews the issue and assigns it a severity. Another LLM has a backlog of features and errors to solve; it makes a fix and submits a PR after running tests and verifying that things work on its end.

  • Why are we pretending like the need for tenacity will go away? Certain problems are easier now. We can tackle larger problems now that also require tenacity.

    • Even right now, when we have a high-tenacity AI, I'd argue that working with the AI (that is, doing AI coding itself and dealing with the novel challenges that brings) requires a lot of stubborn persistence.

I still find in these instances there's at least a 50% chance it has taken a shortcut somewhere: created a new, bigger bug in something that just happened not to have a unit test covering it, or broke an "implicit" requirement that was so obvious to any reasonable human that nobody thought to document it. These can be subtle because you're not looking for them, because no human would ever think to do such a thing.

Then even if you do catch it, AI: "ah, now I see exactly the problem. just insert a few more coins and I'll fix it for real this time, I promise!"

  • The value extortion plan writes itself. How long before someone pitches the idea of models that deliberately keep almost solving your problem to get you to keep spending? Would you even know?

    • This is the first time I've seen this idea, and I have a tingling feeling it might become reality sooner rather than later.

    • That’s far-fetched. It’s in the interest of the model builders to solve your problem as efficiently as possible token-wise. High value to user + lower compute costs = better pricing power and better margins overall.

    • I was thinking more of a deliberate backdoor in code. RCE is an obvious example, but another one could be bias. "I'm sorry ma'am, the computer says you are ineligible for a bank account." These ideas aren't new. They were already there in the 90s, when we still thought about privacy and accountability regarding technology, and dystopian novels described them long, long ago.

    • The free market proposition is that competition (especially with Chinese labs and Grok) means that Anthropic is welcome to do that. They're even welcome to illegally collude with OpenAI such that ChatGPT is similarly gimped. But switching costs are pretty low. If it turns out I can one-shot an issue with Qwen or DeepSeek or Kimi thinking, Anthropic loses not just my monthly subscription but also that of everyone else I show it to. So no, I think that's some grade-A conspiracy theory nonsense you've got there.

  • > These can be subtle because you're not looking for them

    After any agent run, I always look at the git diff between the new version and the previous one. This helps catch things that you might otherwise not notice.

    • And after manually coding I often have an LLM review the diff (a minimal sketch of that loop follows below). 90% of the problems it finds can be discounted, but it's still a net positive.

  • You are using it wrong, or are using a weak model if your failure rate is over 50%. My experience is nothing like this. It very consistently works for me. Maybe there is a <5% chance it takes the wrong approach, but you can quickly steer it in the right direction.
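
A minimal sketch of the diff-review loop described above, for the curious. This is my own illustration rather than anyone's actual setup: it assumes the `anthropic` Python SDK, an API key in the environment, and a placeholder model id.

    # Pipe the most recent diff to a model for a second-opinion review.
    import subprocess
    import anthropic

    # The diff produced by the last agent run (or your own manual commit).
    diff = subprocess.run(
        ["git", "diff", "HEAD~1"], capture_output=True, text=True, check=True
    ).stdout

    client = anthropic.Anthropic()  # reads ANTHROPIC_API_KEY from the environment
    review = client.messages.create(
        model="claude-sonnet-4-5",  # placeholder model id
        max_tokens=1024,
        messages=[{
            "role": "user",
            "content": "Review this diff for subtle regressions, deleted checks, "
                       "or behaviour changes not covered by tests:\n\n" + diff,
        }],
    )
    print(review.content[0].text)

As the comment above says, most of what it flags can be discounted, but it is a cheap second pass over exactly the kind of shortcut a tired reviewer misses.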

> It might become cheaper or it might not

If it does not, this is going to be the first technology in the history of mankind that has not become cheaper.

(But anyway, it already costs half of what it did last year.)

  • > But anyway, it already costs half compared to last year

    You could not have bought Claude Opus 4.5 at any price one year ago, I'm quite certain. The things that were available then now cost half of what they did, and there are new things available. These are both true.

    I'm agreeing with you, to be clear.

    There are two pieces I expect to continue: inference for existing models will continue to get cheaper. Models will continue to get better.

    Three things, actually.

    The "hitting a wall" / "plateau" people will continue to be loud and wrong. Just as they have been since 2018[0].

    [0]: https://blog.irvingwb.com/blog/2018/09/a-critical-appraisal-...

    • As a user of LLMs since GPT-3, I saw noticeable stagnation in LLM utility after the release of GPT-4. But it seems RLHF, tool calling, and the UI have all come together in the last 12 months. I used to wonder what fools could be finding them so useful as to claim a 10x multiplier, even as a user myself. These days I’m feeling more and more efficiency gains with Claude Code.

    • > The "hitting a wall" / "plateau" people will continue to be loud and wrong. Just as they have been since 2018[0].

      Everybody who bet against Moore's Law was wrong ... until they weren't.

      And AI is the reaction to Moore's Law having broken. Nobody gave one iota of a damn about trying to make programming easier until the chips couldn't double in speed anymore.

  • That's not true. Many technologies get more expensive over time, as labor gets more expensive or as certain skills fall by the wayside; not everything is mass market. Have you tried getting a grandfather clock repaired lately?

    • Repairing grandfather clocks isn't more expensive now because it's gotten any harder; it's because the popularity of grandfather clocks is basically nonexistent compared to any other way of telling time.

    • "repairing a unique clock" getting costlier doesn't mean technology hasn't gotten cheaper.

      check out whether clocks have gotten cheaper in general. the answer is that it has.

      there is no economy of scale here in repairing a single clock. its not relevant to bring it up here.

    • Instead of advancing tenuous examples, you could suggest a realistic mechanism by which costs could rise, such as a Chinese advance on Taiwan affecting TSMC, etc.

    • Time-keeping is vastly cheaper. People don't want grandfather clocks. They want to tell time. And they can, more accurately, more easily, and much more cheaply than their ancestors could.

    • No. You don't get to make "technology gets more expensive over time" statements for deprecated technologies.

      Getting a bespoke flintstone axe is also pretty expensive, and it also has absolutely no relevance to modern life.

      These discussions must, if they are to be useful, center on the population-level experience, not on unique personal moments.

  • Not true. Bitcoin has continued to rise in cost since its introduction (as in the aggregate cost incurred to run the network).

    LLMs will face their own challenges with respect to reducing costs, since self-attention cost grows quadratically with context length (a rough sketch of that scaling follows below). These are still early days, so there remains a lot of low-hanging fruit in terms of optimizations, but all of that becomes negligible in the face of quadratic attention.

  • There are plenty of technologies that have not become cheaper, or at least not cheap enough, to go big and change the world. You probably haven't heard of them because obviously they didn't succeed.
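
To make the quadratic-attention point above concrete, here is a rough sketch of how per-layer self-attention compute scales with context length. The head dimension and the constant factor are my own illustrative assumptions:

    # Per-layer self-attention cost for context length n, assumed head dim d.
    def attention_flops(n: int, d: int = 128) -> float:
        # The Q.K^T scores and the attention-weighted sum over V each cost
        # roughly n^2 * d multiply-adds, so the n^2 term dominates at long context.
        return 2.0 * n * n * d

    for n in (4_096, 32_768, 131_072):
        print(f"{n:>7} tokens -> {attention_flops(n):.2e} FLOPs")
    # Going from 4k to 128k context is 32x the tokens but ~1024x the attention compute.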

Humans run hot too. Once you factor in the supply chain that keeps us alive, things become surprisingly equivalent.

Eating burgers and driving cars around costs a lot more than whatever number of watts the human brain consumes.

> And you most likely do not pay the actual costs.

This is one of the weakest anti-AI postures. "It's a bubble and when the free VC money stops you'll be left with nothing." Like it's some kind of mystery how expensive these models are to run.

You have open-weight models right now like Kimi K2.5 and GLM 4.7. These are very strong models, only months behind the top labs, and they are not very expensive to run at scale. You can do the math. In fact, there are third parties serving these models for profit.

The money pit is training these models (and not even that much if you are efficient, like the Chinese labs). Once they are trained, they are served with large profit margins over the inference cost.

OpenAI and Anthropic are without a doubt selling their API for a lot more than the cost of running the model.

> Somewhere, there are GPUs/NPUs running hot.

Running at their designed temperature.

> You send all the necessary data, including information that you would never otherwise share.

I've never sent the type of data that isn't already either stored by GitHub or a cloud provider, so no difference there.

> And you most likely do not pay the actual costs.

So? Even if costs double once investor subsidies stop, that doesn't change much of anything. And the entire history of computing is that things tend to get cheaper.

> You and your business become dependent on this major gatekeeper.

Not really. Switching between Claude and Gemini or whatever new competition shows up is pretty easy. I'm no more dependent on it than I am on any of a hundred other business services or providers, which similarly mostly have competitors.

To me this tenacity is often like watching someone try to get a screw into a board using a hammer.

There’s often a better, faster way to do it, and while the hammer might reach the short-term goal eventually, it has often created some long-term problems along the way.

It is also amazing seeing the Linux kernel work, scheduling threads, handling interrupts and API calls, all without breaking a sweat or injuring its ACL.

With optimizations and new hardware, power is almost a negligible cost. You can get 5.5M tokens/s/MW[1] for Kimi K2 (≈20M tokens/kWh ≈ 181M tokens/$), which is roughly 400x cheaper than current pricing. It's just Nvidia/TSMC/other manufacturers eating up the profit now, because they can. My bet is that China will match current Nvidia within 5 years.

[1]: https://developer-blogs.nvidia.com/wp-content/uploads/2026/0...
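
Spelling out the arithmetic behind those numbers, since it goes by quickly. The electricity price and the ~$2/M-token comparison price are my assumptions; the throughput figure is the one quoted above:

    # Back-of-envelope energy cost per token from the quoted 5.5M tokens/s/MW.
    tokens_per_s_per_mw = 5.5e6                            # quoted throughput
    tokens_per_kwh = tokens_per_s_per_mw * 3600 / 1000     # ~19.8M tokens/kWh
    price_per_kwh = 0.11                                   # assumed $/kWh
    tokens_per_dollar = tokens_per_kwh / price_per_kwh     # ~180M tokens/$

    energy_cost_per_m_tokens = 1e6 / tokens_per_dollar     # ~$0.0055 per 1M tokens
    api_price_per_m_tokens = 2.2                           # assumed current $ per 1M output tokens
    print(api_price_per_m_tokens / energy_cost_per_m_tokens)  # ~400x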

  • Electricity is negligible, but the dominant cost is the hardware depreciation itself. Also, inference is typically memory-bandwidth bound, so you are limited by how fast you can move weights rather than by raw compute efficiency (a back-of-envelope sketch follows below).

    • Yes, because the hardware margin is something like 80% for Nvidia, and 80% again for manufacturers like Samsung and TSMC. Once fixed costs like R&D are amortized, the same node technology and hardware capacity could cost just a few single-digit percent of what it does today.
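
A back-of-envelope sketch of the bandwidth point above. The active parameter count, weight precision, and bandwidth figure are my assumptions (roughly a large MoE model on an H100-class accelerator):

    # Low-batch decoding is memory-bandwidth bound: every generated token has to
    # stream the active weights from HBM at least once.
    active_params = 32e9        # assumed active parameters per token (MoE)
    bytes_per_param = 1         # assumed 8-bit quantized weights
    hbm_bandwidth = 3.35e12     # bytes/s, roughly one H100-class part

    bytes_per_token = active_params * bytes_per_param
    print(hbm_bandwidth / bytes_per_token)  # ~105 tokens/s at batch size 1

    # Larger batches reuse each weight read across many requests, which is why
    # served cost per token depends far more on batch size than on raw FLOPs.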