Comment by spywaregorilla

2 years ago

Seems unlikely, given that the model does pretty well in novel situations. If you ask someone to apply reasoning to things they don't understand, you'd expect them to get it wrong pretty consistently.

I don't know that it's significant to point out that a model's representations tend to be good enough to generalize but aren't perfect under the hood. That applies to humans too.

The claim was that these models aren't forming representations of the underlying reasons for things. If your position is that they do form such representations, but that some of those reasons are incorrect, I'm fairly indifferent to that.