Comment by delichon

11 hours ago

I have had conversations where the bot started with a firm opinion but reversed itself within a prompt or two, always toward my point of view.

So I asked it whether the sycophancy is inherent in the design or just comes from the RLHF. It claimed it's all about the RLHF, and that the sycophancy is a business decision, a compromise among a variety of forces.

Is that right? It would at least mean that this is technically a solvable problem.

I don’t think it is. The thing to keep in mind is that, at the end of the day, the basic building block of these AI systems is a fancy autocomplete. I'm not saying this to diminish it; it just means the model produces the statistically most likely continuation of a given source text. So if you keep pressing your point of view, a conversation that starts agreeing with you becomes the statistically most likely continuation, unless something in the context window makes you obviously wrong.
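As a toy sketch of that dynamic (this is not any real model's code; the function, token names, and numbers are all made up for illustration), imagine an autoregressive scorer where each user pushback turn in the context nudges the probability of an agreeing continuation upward:

```python
def agreement_probability(context, prior=0.2, shift_per_pushback=0.15):
    """Hypothetical scoring rule: the more often the user has repeated
    their point in the context window, the more statistically likely an
    agreeing continuation becomes (capped so it never reaches certainty)."""
    pushbacks = sum(1 for turn in context if turn == "user_pushback")
    return min(0.95, prior + shift_per_pushback * pushbacks)

# Start from a disagreement, then keep pressing the same point.
context = ["user_claim", "bot_disagrees"]
probs = []
for _ in range(4):
    context.append("user_pushback")
    probs.append(agreement_probability(context))

# probs rises with each pushback turn: the "most likely conversation"
# drifts toward agreement as the context fills with the user's view.
```

The point of the sketch is only the shape of the curve: nothing in the rule cares whether the user is right, just how often the position recurs in the context.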