
Comment by jedberg

1 month ago

> You realize that stamina is a core bottleneck to work

There has been a lot of research showing that grit correlates far more strongly with success than intelligence does. This is an interesting way to show something similar.

AIs have endless grit (or at least as endless as your budget). They may outperform us simply because they don't ever get tired and give up.

Full quote for context:

Tenacity. It's so interesting to watch an agent relentlessly work at something. They never get tired, they never get demoralized, they just keep going and trying things where a person would have given up long ago to fight another day. It's a "feel the AGI" moment to watch it struggle with something for a long time just to come out victorious 30 minutes later. You realize that stamina is a core bottleneck to work and that with LLMs in hand it has been dramatically increased.

>They never get tired, they never get demoralized, they just keep going and trying things where a person would have given up long ago to fight another day.

"Listen, and understand! That Terminator is out there! It can't be bargained with. It can't be reasoned with. It doesn't feel pity, or remorse, or fear. And it absolutely will not stop... ever, until you are dead!"

If you ever work with LLMs you know that they quite frequently give up.

Sometimes it's a

    // TODO: implement logic

or a

"this feature would require extensive logic and changes to the existing codebase".

Sometimes they just declare their work done, ignoring failing tests and builds.

You can nudge them to keep going but I often feel like, when they behave like this, they are at their limit of what they can achieve.

  • If I tell it to implement something, it will sometimes declare its work done before it actually is. But if I give Claude Code a verifiable goal, like making the unit tests pass, it will work tirelessly until that goal is achieved. I don't always like the solution, but the tenacity everyone is talking about is there.

    • > but the tenacity everyone is talking about is there

      I always double-check if it doesn't simply exclude the failing test.

      The last time this happened, I only discovered it later in the process. When I pointed it out to the LLM, it responded that it had acknowledged the fact that it was ignoring the test in CLAUDE.md, and that this was justified because [...]. In other words, "known issue, fuck off".

    • Tools in a loop people, tools in a loop.

      If you don't give the agent the tools to deterministically test what it did, you're just vibe coding in its worst form.

  • > If you ever work with LLMs you know that they quite frequently give up.

    If you try to single shot something perhaps. But with multiple shots, or an agent swarm where one agent tells another to try again, it'll keep going until it has a working solution.

    • Yeah, exactly. This is a scope problem; actual input/output size is always limited. I am 100% sure CC etc. are using multiple LLM calls for each response, even though from the response streaming it looks like just one.

  • Nope, not for me, unless I tell it to.

    Context matters for an LLM, just like for a person. When I write code I add TODOs because we can't context-switch to every side problem we notice.

    You can keep the agent fixated on the task AND have it create these TODOs, but ultimately it is your responsibility to find and fix them (with another agent).

  • Using LLMs to clean those up is part of the workflow that you're responsible for (... for now). If you're hoping to get ideal results in a single inference, forget it.
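The "tools in a loop" and "verifiable goal" points above can be sketched in a few lines. Everything here is illustrative: `fake_llm` and `run_tests` are stubs standing in for a real model call and a real test runner, not any actual agent framework's API. The loop ends only when the deterministic check passes, not when the model declares victory, and a separate guard catches the "just skip the failing test" trick mentioned upthread.

```python
import re

# Pytest-style markers are an assumption; extend for your test framework.
SKIP_PATTERNS = [r"@pytest\.mark\.skip", r"@pytest\.mark\.xfail"]

def suspicious_skips(source: str) -> list[str]:
    """Flag lines that disable tests instead of fixing them."""
    return [line.strip() for line in source.splitlines()
            if any(re.search(p, line) for p in SKIP_PATTERNS)]

def run_tests(code: str) -> tuple[bool, str]:
    """Stand-in for a real test runner: deterministic pass/fail plus output."""
    if "bugfix" in code:
        return True, "all tests passed"
    return False, "FAILED test_feature: NotImplementedError"

def fake_llm(task: str, feedback: str) -> str:
    """Stand-in for a model call; the first attempt 'gives up' with a TODO."""
    if "NotImplementedError" in feedback:
        return "def feature(): return 'bugfix'"
    return "def feature(): pass  # TODO: implement logic"

def agent_loop(task: str, max_attempts: int = 5) -> bool:
    feedback = ""
    for _ in range(max_attempts):
        code = fake_llm(task, feedback)
        if suspicious_skips(code):
            feedback = "patch disables tests; fix them instead"
            continue
        ok, feedback = run_tests(code)  # the tool decides, not the model
        if ok:
            return True
    return False

print(agent_loop("implement feature"))  # True on the second attempt
```

The design point is the one made upthread: without a deterministic verifier in the loop, "done" is just the model's opinion.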

I realized a long time ago that I'm better at computer stuff not because I'm smarter but because I will sit there all day and night to figure something out while others will give up. I always thought that was my superpower in the industry, but now I'm not so sure it will transfer to getting AI to do what I need done…

  • Same, I barely made it through Engineering school, but would stay up all night figuring out everything a computer could do (before the internet).

    I did it because I enjoyed it, and still do. I just do it with LLMs now. There is more to figure out than ever before and things get created faster than I have time to understand them.

    LLMs should be enabling this, not making it more depressing.

    • Me three. I was not as smart as many of my peers in uni but I freakin LOVE the subject matter and I also love studying and feeling that progress of learning, which led me to put in the huge number of hours necessary to be successful and have a positive attitude the whole time.


The tenacity aspect makes me worried about the paper clip AI misalignment scenario more than before.

But even tenacity is not enough. You also need an internal timer. "Wait a minute, this is taking too long, it shouldn't be this hard. Is my overall approach completely wrong?"

I'm not sure AIs have that. Humans do, or at least the good ones do. They don't quit on the problem, but they know when it's time to consider quitting on the approach.

> AIs have endless grit (or at least as endless as your budget).

That is the only thing he doesn't address: the money it costs to run the AI. If you let the agents loose, they easily burn north of 100M tokens per hour. At $25 per 1M tokens, that gets expensive quickly. At some point, when we are all drug^W AI dependent, the VCs will start to cash in on their investments.
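For concreteness, the figures quoted above work out like this (both numbers are the commenter's assumptions, not any vendor's actual pricing):

```python
# Back-of-the-envelope cost for the quoted figures (assumed, not vendor pricing).
tokens_per_hour = 100_000_000      # "north of 100M tokens per hour"
price_per_million_tokens = 25.0    # "$25/1M tokens"
cost_per_hour = tokens_per_hour / 1_000_000 * price_per_million_tokens
print(f"${cost_per_hour:,.0f} per hour")  # $2,500 per hour
```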

LLMs do not have grit or tenacity. Tenacity doesn't describe a machine that doesn't need sleep and never experiences tiredness or stress. Grit doesn't describe a chatbot that will tirelessly spew out answers and code because it has no stake or interest in the result, never perceives that it doesn't know something, and never reflects on its shortcomings.