Comment by Frost1x

2 years ago

>I'm unsure where this expectation of 100% absolute correctness comes from. I'm sure there are use cases, but I assume it's the vast minority and most can tolerate larger than expected inaccuracies.

As others hinted at, there's some bias because it's coming from a computer, but I think it's far more nuanced than that.

Over my career I've worked with many experts and professionals across medicine, engineering, science, academia, and research. What has always bothered me is the level of certainty with which statements are often presented, and the same pattern is often embedded in LLM responses.

While humans don't typically quantify the certainty of their statements, the best SMEs I've ever worked with make it very clear what level of certainty they have when making professional statements. The SMEs who seem to be wrong more often than not tend to speak with certainty quite often (some of this is due to cultural pressures and expectations surrounding being an "expert").

In this case, I would expect a seasoned scientist to respond to the duck question with something like: "Many rubber ducks exist and are designed to float, so this one very well might, but we'd really need to test it or have far more information about the composition of the duck, its design, the medium we want it in (Water? Mercury? Helium?)" and so on. It's not an exact answer, but you understand that there's uncertainty and that we need to clarify both the question and the information surrounding it. The fact is, it's really hard to know whether it'll float from visual information alone.

It could have an osmium ball inside that overcomes most of the assumed buoyancy of the material, including the air demonstrated to make it squeak. It's not transparent. You don't know for sure, and the easiest way to alleviate the uncertainty in this case is simply to test it.

There's so much uncertainty in the world, around what seem like the most certain and obvious things. LLMs seem to have grabbed some of this bad behavior from human language and culture where projecting confidence is often better (for humans) than being correct.