Comment by lxgr

2 years ago

They're lying in the same way that a sign that says "free cookies" is lying when there are actually no cookies.

I think this is a different usage of the word, and we're pretty used to making the distinction, but it gets confusing with LLMs.

You are making a distinction that doesn't exist. It doesn't even make sense in the context of the paper I linked.

The model consistently and purposefully withheld knowledge it was directly aware of. This is lying under any useful definition of the word. You're veering off into meaningless philosophy that has no bearing on outcomes and results.