Comment by hombre_fatal

3 hours ago

Yeah, scrolling through the examples, you have no idea where the models actually disagree on the underlying facts when it's just "X vs Mostly X" or "Mostly X vs Misleading" or "False vs Misleading". Even with True vs False, I can't necessarily compare two answers without seeing the explanations.

The study is about whether they said the same phrase, which is a much weaker claim than the one people in the comments are reacting to.

Reminds me of this professor I had who thought it was epic to always respond to our questions with "it depends" before hashing out two very different but technically correct answers. It was obnoxious and he saw it as his tagline, but he had a point about nuance.