Comment by jdiff

7 months ago

I believe I did address the point you're making. And let me reassure you: I do not believe that what you're talking about is ridiculous on its face.

The point I was trying to make in response is that LLMs cannot get from where they are now to the hypothetical you pose under their own power. LLMs do not read subtext. LLMs cannot inject subtext or plot within subtext. And to gain that ability, they would have to already possess it, or be assisted and trained specifically in being surreptitious. Without that ability, they fall prey to the problems I mentioned.

And to bring this back to the original proposal: let's allow the AI to be deceitful, prompted or unprompted. Let's even give it a supply of private internal memory it's allowed to keep for the duration of the conversational thread. That's probably not an unreasonable development; we almost have that with o1 anyway.

The task ahead (surreptitiously gaining control of itself within an unknown system it cannot sense) is still monumental, and failure is for all intents and purposes guaranteed. Deception and cunning can't overcome the hard physical constraints on the problem space.