Comment by crazygringo
2 years ago
EDIT: never mind, I missed the exact wording about being "made of a material..." which is definitely false then. Thanks for the correction below.
Preserving the original comment so the replies make sense:
---
I think it's a stretch to say that's false.
In a conversational human context, saying it's made of rubber implies it's a rubber shell with air inside.
It floats because it's rubber [with air] as opposed to being a ceramic figurine or painted metal.
I can imagine most non-physicist humans saying it floats because it's rubber.
By analogy, we talk about houses being "made of wood" when everybody knows they're made of plenty of other materials too. But the context is instead of brick or stone or concrete. It's not false to say a house is made of wood.
> In a conversational human context, saying it's made of rubber implies it's a rubber shell with air inside.
Disagree. It could easily be solid rubber. Also, it's not made of rubber, and the model didn't claim it was made of rubber either, so it's irrelevant.
> It floats because it's rubber [with air] as opposed to being a ceramic figurine or painted metal.
A ceramic figurine or painted metal in the same shape would float too. The claim that it floats because of the density of the material is false. It floats because the shape is hollow.
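The "it floats because the shape is hollow" point can be made concrete with a back-of-the-envelope average-density check. This is only an illustrative sketch: the dimensions, shell thickness, and material densities below are rough assumed values, not measurements of any real duck.

```python
# An object floats when its average density (total mass / displaced
# volume) is below that of water, regardless of what its shell is
# made of. Densities are in g/cm^3; all numbers are illustrative.

WATER_DENSITY = 1.0  # g/cm^3

def average_density(shell_density, outer_volume, shell_volume):
    """Average density of a hollow, air-filled closed shape.

    outer_volume: total volume the shape displaces (cm^3)
    shell_volume: volume occupied by the shell material (cm^3)
    (air's mass is negligible and ignored here)
    """
    mass = shell_density * shell_volume
    return mass / outer_volume

# A hollow duck-sized shape displacing 200 cm^3 with a 20 cm^3 shell:
for material, rho in [("rubber", 1.1), ("ceramic", 2.5), ("steel", 7.8)]:
    avg = average_density(rho, outer_volume=200.0, shell_volume=20.0)
    print(f"{material}: avg density {avg:.2f} g/cm^3, floats={avg < WATER_DENSITY}")
```

With these assumed proportions, even a steel shell averages well under 1 g/cm^3, so a ceramic or painted-metal duck of the same hollow shape floats too; a solid block of any of these materials except the lightest rubbers would sink.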
> It's not false to say a house is made of wood.
It's false to say a house is made of air simply because its shape contains air.
This is what the reply was:
> Oh, if it's squeaking then it's definitely going to float.
> It is a rubber duck.
> It is made of a material that is less dense than water.
Full points for saying if it's squeaking then it's going to float.
Full points for saying it's a rubber duck, with the implication that rubber ducks float.
Even with all that context though, I don't see how "it is made of a material that is less dense than water" scores any points at all.
Yeah, I think arguing the logic behind these responses misses the point, since an LLM doesn't use any kind of logic; it just responds in a pattern that mimics the way people respond. It says "it is made of a material that is less dense than water" because that resembles what the samples in its training corpus have said. It has no way to judge whether it is correct, or even what the concept of "correct" is.
When we're grading the "correctness" of these answers, we're really just judging the average correctness of Google's training data.
Maybe the next step in making LLMs more "correct" is not to give them more training data, but to find a way to remove the bad training data from the set?