Comment by F7F7F7

7 months ago

Is there a person on HackerNews that doesn’t understand this by now? We all collectively get it and accept it, LLMs are gigantic probability machines or something.

That’s not what people are arguing.

The point is, if given access to the mechanisms to do disastrous thing X, it will do it.

No one thinks that it can think in the human sense. Or that it feels.

Extreme example to make the point: if we created an API to launch nukes, are you certain that something it interprets (tokenizes, whatever) is not going to convince it to utilize the API 2 times out of 100?

If we put an exploitable safeguard (with a documented, unpatched 0-day bug) in its way, are you trusting that ME or YOU couldn't talk it into attempting to access that document to exploit the bug, bypass the safeguard, and access the API?

Again, no one thinks that it's actually thinking. But today, as I happily gave Claude write access to my GitHub account, I realized how just one misinterpreted command could go completely wrong without the appropriate measures.
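
(Purely illustrative: one "appropriate measure" could be a human-confirmation gate in front of any destructive tool call. The function and action names below are made up for this sketch; this is not Claude's or GitHub's actual API.)

    # Hypothetical sketch: gate destructive agent actions behind explicit human approval.
    # run_agent_tool and DESTRUCTIVE_ACTIONS are invented names, for illustration only.
    DESTRUCTIVE_ACTIONS = {"force_push", "delete_branch", "delete_repo"}

    def run_agent_tool(action: str, args: dict) -> str:
        """Execute an agent-requested action, requiring a human yes/no for risky ones."""
        if action in DESTRUCTIVE_ACTIONS:
            answer = input(f"Agent wants to run {action} with {args}. Allow? [y/N] ")
            if answer.strip().lower() != "y":
                return "denied: human did not approve"
        # ... dispatch to the real tool call here ...
        return f"executed {action}"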

Do I think Claude is sentient and thinking about how to destroy my repos? No.

I think the other guy is making the point that because they are probabilistic, there will always be some cases where they select the output that lies and covers it up. I don't think they're dismissing the paper based on the probabilistic nature of LLMs, but rather saying the outcome should be expected.
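
To put rough numbers on that (reusing the hypothetical 2-in-100 figure from above): a small per-call chance of a bad output compounds quickly once the model runs many times unattended. A minimal sketch, assuming independent calls:

    # Back-of-the-envelope: chance of at least one bad outcome over n independent calls,
    # given a per-call failure probability p (0.02 is the thread's hypothetical figure).
    p = 0.02
    for n in (1, 10, 100, 1000):
        at_least_one = 1 - (1 - p) ** n
        print(f"{n:>5} calls -> P(at least one failure) = {at_least_one:.3f}")
    # At 1000 calls this is ~1.0, which is why "nothing bad happened the first 999 times"
    # says very little about safety.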

> if we created an API to launch nukes

> today as I happily gave Claude write access to my GitHub account

I would say: don’t do these things?

  • > I would say: don’t do these things?

    Hey guys, let's just stop writing code that is susceptible to SQL injection! Phew, glad we solved that one.

    • I'm not sure what point you're trying to make. This is a new technology; it has not been a part of critical systems until now. Since the risks are blindingly obvious, let's not make it one.

  • If you cross the street 999 times with your eyes closed, you might feel comfortable doing it again. But it's ingrained in us not to do that even once. We just understand the risk.

    If you do the same with an AI, after 999 times of nothing bad happening, you probably just continue giving it more risky agency.

    Because we don't, and even can't, understand the internal behavior, we should pause and make an effort to understand its risks before even attempting to give it risky agency. That's what all the fuss is about, for good reason.

  • Everybody wants the models to be able to execute code and access the Web. That's enough to cause harm.

  • > I would say: don’t do these things?

    That becomes a much fuzzier question in the context of Turing-complete collections of tools and/or generative tools.

> Again, no one thinks that it’s actually thinking

I dunno, quite a lot of people are spending a lot of time arguing about what "thinking" means.

Something something submarines swimming something.