Comment by tbrake
3 months ago
Well, almost always.
There was that brief period in 2023 when Bing just started straight up gaslighting people instead of admitting it was wrong.
https://www.theverge.com/2023/2/15/23599072/microsoft-ai-bin...
I suspect what happened there is that they had a filter on top of the model that altered its dialogue (IIRC it added a lot of extra emojis), and that drove it "insane" because its responses were all outside its own distribution.
You could see the same thing with Golden Gate Claude; it had a lot of anxiety about not being able to answer questions normally.
Nope, it was entirely due to the prompt they used. It was very long and tried to cover all the corner cases they'd thought up... and it ended up being too complicated and self-contradictory in real-world use.
Kind of like that scene in RoboCop where the OCP committee rewrites his original four directives with several hundred: https://www.youtube.com/watch?v=Yr1lgfqygio
That's a movie, though. You can't drive an LLM insane by giving it self-contradictory instructions; the contradictions would just average out.