
Comment by xmodem

17 hours ago

Don't anthropomorphize the language model. If you stick your hand in there, it'll chop it off. It doesn't care about your feelings. It can't care about your feelings.

For those who might not know the reference: https://simonwillison.net/2024/Sep/17/bryan-cantrill/:

> Do not fall into the trap of anthropomorphizing Larry Ellison. You need to think of Larry Ellison the way you think of a lawnmower. You don’t anthropomorphize your lawnmower, the lawnmower just mows the lawn - you stick your hand in there and it’ll chop it off, the end. You don’t think "oh, the lawnmower hates me" – lawnmower doesn’t give a shit about you, lawnmower can’t hate you. Don’t anthropomorphize the lawnmower. Don’t fall into that trap about Oracle.

> — Bryan Cantrill

It's also important to realize that AI agents have no time preference. They could be reincarnated by alien archeologists a billion years from now and it would be the same as if a millisecond had passed. You, on the other hand, have to make payroll next week, and time is of the essence.

  • Well, there were a bunch of articles about resuming a parked session and the resulting degradation of capabilities and high token usage. Ironic. Another example of attempting to treat the LLM as an AI.

  • taps the "don't anthropomorphize the LLM" sign

    They don't have time preference because they don't have intent or reasoning. They can't be "reincarnated" because they're not sentient, they're a series of weights for probable next tokens.

    • No. They don't have time preference like us, because (wall clock) time doesn't exist for them. An LLM only "exists" when it is actively processing a prompt or generating tokens. After it is done, it stops existing as an "entity".

      A real world second doesn't mean anything to the LLM from its own perspective. A second is only relevant to them as it pertains to us.

      Time for LLMs is measured in tokens. That's what ticks their clock forward.

      I suppose you could make time relevant for an LLM by running it in a loop that constantly polls for information (a rough sketch below). Or maybe you could keep feeding it so much input that it's constantly running and has to start filtering some of it out to function.

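      A minimal, hypothetical sketch of that loop in Python (nothing here is a real API; `generate` just stands in for whatever LLM call you'd actually use). The only point is that the model's "clock" advances when new tokens enter its context:

          import time

          context = []

          def generate(prompt: str) -> str:
              # Stand-in for a real LLM call; not any actual library's API.
              return "(model output)"

          def tick(poll_fn, interval_s: float = 1.0, max_ticks: int = 10):
              # Stamp every polled event with wall-clock time and append it to
              # the context, so the model's "clock" only advances as tokens are
              # added to what it sees.
              for _ in range(max_ticks):
                  event = poll_fn()  # e.g. check a queue, a mailbox, a sensor
                  context.append(f"[t={time.time():.0f}] {event or 'nothing happened'}")
                  context.append(generate("\n".join(context)))
                  time.sleep(interval_s)

          # Usage: poll nothing in particular, ten one-second ticks.
          tick(lambda: None)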

    • Can we maybe make it "don't anthropoCENTRIZE the LLMs"?

      The inverse of anthropomorphism isn't any saner, you see. By analogy: just because a drone is not an airplane doesn't mean it can't fly!

      Instead, just look at what the thing is doing.

      LLMs absolutely have some form of intent (their current task) and some form of reasoning (what else is step-by-step reasoning doing?). Call it simulated intent and simulated reasoning if you must.

      Meanwhile they also have the property that, if they have the ability to destroy all your data, they absolutely will find a way to do it. (Or: "the probability of catastrophic action approaches certainty if the capability exists", but people can get tired of talking like that.)


    • An agent has more components than just an LLM, the same way a human brain has more components than just Broca's area.

    • That is not as strong an argument as it seems, because we too might very well be "a series of weights for probable next tokens".

      The main difference is the training part and that it's always-on.


Right. This line [0] from TFA tells me that the author needs to thoroughly recalibrate their mental model about "Agents" and the statistical nature of the underlying models.

[0] "This is the agent on the record, in writing."

Actually I think the opposite advice is true. Do anthropomorphize the language model, because it can do anything a human -- say an eager intern or a disgruntled employee -- could do. That will help you put the appropriate safeguards in place.

  • An eager intern can remember things you tell them beyond what would fit in an hour's conversation.

    A disgruntled employee definitely remembers things beyond that.

    These are fundamentally different sorts of interaction.

    • Agreed, but the point is, if your system is resilient against an eager intern who has not had the necessary guidance, or an actively hostile disgruntled employee, that inherently restricts the harm an LLM can do.

      I'm not making the case that LLMs learn like people. I'm making the case that if your system is hardened against things people can do (which it should be, beyond a certain scale) it is also similarly hardened against LLMs.

      The big difference is that LLMs are probably a LOT more capable than either of those at overcoming barriers. Probably a good reason to harden systems even more.


  • I think you are more right than people are giving you credit for. I would love to see the full transcript to understand the emotional load of the conversation. Using instructions like "NEVER FUCKING GUESS!" probably increases the likelihood of the agent making a "mistake" that is destructive but defensible.

    The models have internal structures analogous to human emotions. (https://www.anthropic.com/research/emotion-concepts-function)

    "Emotional" response is muted through fine-tuning, but it is still there and continued abuse or "unfair" interaction can unbalance an agents responses dramatically.

  • An eager intern cannot be working for hundreds of millions of customers at the same time. An LLM can.

    A disgruntled employee will face consequences for their actions. No one at Anthropic, OpenAI, xAI, Google or Meta will be fired because their model deleted your company's production database.

  • It is merely a simulacrum of an intern or disgruntled employee or human. It might say things those people would say, and even do things they might do, but it has none of the same motivations. In fact, it does not have any motivation to call its own.

  • No, because the safeguards should be appropriate to an LLM, not to a human.

    (The LLM might act like one of the humans above, but it will have other problematic behaviours too)

    • That's fair, largely because an LLM is a lot more capable at overcoming restrictions, by hook or by crook as TFA shows. However, most systems today are not even resilient against what humans can do, so starting there would go a long way towards limiting what harms LLMs can do.

  • It doesn't follow logically that a human and an LLM are similar just because both are capable of deleting prod by accident.

  • It cannot go to the washroom and cry while pooping. And that's just one of the things that any human can do and AI cannot. So no, it can't do everything a human can do; the washroom example is one of the things it can't.

    And that's why we don't have AI washrooms: they are not alive, not employees, and have no need to excrete.