Comment by overgard

1 day ago

I just can't help but imagine ChatGPT's sycophancy mixed with military operations. "Sharp insight bombing that wedding! Next, would you like tips on mosques to bomb, or I can suggest some new napalm recipes that are extra spicy. Your call!"

Department of Defense: You just bombed the wrong Georgia! The people of Atlanta are furious!

ChatGPT: You're absolutely right, and you're right to call that out. Upon examination it does appear that there might have been a mistake with the coordinates of the bomb. Let's try again; this time we will double-check before we launch any missiles! :missile emoji:

If Anthropic had given in, I imagine the dialog would look something like the claude CLI:

To complete the mission the war terminal needs to hit a target at XY:

1. yes

2. yes (and don't ask again for strike targets in this session)

3. no

"Human in the loop" is the term here, I think.
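For what it's worth, the three-option prompt above maps onto a pretty standard human-in-the-loop pattern: approve once, approve for the session, or deny. A minimal sketch in Python (all names here are hypothetical, not any real CLI's API; the answer is passed in rather than read interactively to keep it simple):

```python
# Hypothetical sketch of a "yes / yes-and-don't-ask-again / no" gate.
# Option "2" remembers the approval for the rest of the session.

_session_approved = set()  # action kinds approved for this session

def confirm(action_kind: str, description: str, answer: str) -> bool:
    """Return True if the action may proceed.

    answer: "1" = yes (this time only)
            "2" = yes, and don't ask again for this kind of action
            "3" = no
    """
    if action_kind in _session_approved:
        return True  # previously approved for the whole session
    if answer == "2":
        _session_approved.add(action_kind)
        return True
    return answer == "1"

# First call approves the kind for the session, so a later "no"
# never even reaches the user:
print(confirm("strike", "hit a target at XY", "2"))   # True
print(confirm("strike", "hit another target", "3"))   # True (session-approved)
```

Which is exactly the worry: one careless "2" early in the session and the human is out of the loop for everything after it.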

(I am really glad they did not give in, but I do assume this is what it will come to anyway)

The point is that it will be autonomous: the prompt could just be "keep me safe," which will be interpreted who knows how, with presumably no further prompting.