← Back to context

Comment by cj

2 hours ago

LLMs aren’t human.

Humans & LLMs are more different than they are similar.

Sure LLMs might resemble humans sometimes, but extrapolating LLM behavior based on human behavior is not productive.

(But to answer directly: Yes, children in a dark room would have more of a personality than a LLM living on a computer in the same dark room)

> but extrapolating LLM behavior based on human behavior is not productive.

The training process for the foundation model is to make sure we can do this in a very statistically significant way.

My favorite example is AI "getting tired" and "lazy" during long coding session. Why would they do that? Because humans get tired. It's in the data! I always throw in a periodic "Great work, let's take a break and finish this up on Monday. Have a great weekend!" (And then immediately resume). I wish someone would benchmark this concept.

  • > AI "getting tired" and "lazy" during long coding session. Why would they do that? Because humans get tired.

    When a LLM is tired and lazy, how does it recharge and regain motivation?

    Humans... sleep or drink some coffee.

    LLMs.... idk, you prompt it to try harder? You prompt it to be less tired?

    This is what I mean when I say extrapolating LLM behavior based on human behavior is cute.. but usually not useful.

    • > When a LLM is tired and lazy, how does it recharge and regain motivation?

      What would be in the statistics? If you go look at your long conversations, working with another, it will be fairly obvious. Keep in mind we're talking next word prediction based on context, not actual action (the LLM doesn't need real rest).

      If you went and looked, you'll probably see something like "Great work! Have a good weekend! We can get back to this on Monday." then, next message you instantaneously send something like, "Hope you had a great weekend, let's do this!" and now you're in a latent space where the statistical output is around a well rested human conversing with another.

      I see it as boring simple statistics. They're getting much better at hammering these statistics out though, in the latest models. I still see a little of this in Opus 4.7, when switching to planning. Though I wonder if that's more about its own more mechanical banter filling the context, resulting in more robot/compliant responses, degrading the usually more "expressive" planning conversations.

  • > My favorite example is AI "getting tired" and "lazy" during long coding session

    Never seen this even once, nor anyone I know ever reported this. Do you have an example?

    • First I saw it was Claude Opus 3.7. Had a very long back and fourth about some code, I pointed out an error, and Claude responded "That's what I get for programming at 2am", with the output being filled with "... code here ..." type shortcuts, basically no ability to one-shot a whole implementation anymore. The conversation length WAS reasonably into the 2am range, if it were real. Thought about it, did the statistical trick where I tell it to "have some rest, take a day off!" then immediately follow up with "Ready to continue?", with the next response having no shortcuts, with full implementation, and much better quality. These are trained on human text. This is the human norm, so I always find it interesting when human like behaviors, very broadly present in the statistics, come out like this.

      I also see it a little with Opus 4.7, with Claude Code, with the hint being much more terse planning text, that borderlines unhelpful. I put some "rest" in the context to push the latent space closer to what's in the statistics of the training data: a well rested human.

      3 replies →

    • I see laziness all the time, Claude will be helping me plan work and then it will ask me how a piece of code is implemented. I then have the choice of manually verifying how it works, or to tell it to look for itself. Ideally it would just look without being told.

      2 replies →