Comment by mpalmer

15 days ago

Bias implies an offset from something. It's relative. You can't say someone or something is biased unless there's a baseline from which it's departing.

All right, let's say that the baseline is "what is true". Then bias is departure from the truth.

That sounds great, right up until you try to do something with it. You want your LLM to be unbiased? So you're only going to train it on the truth? Where are you going to find that truth? Oh, humans are going to determine it? Well, first, where are you going to find unbiased humans? And, second, they're going to curate all the training data? How many centuries will that take? We're trying to train it in a few months.

And then you get to things like politics and sociology. What is the truth in politics? Yeah, I know, a bunch of politicians say things that are definitely lies. But did Obamacare go too far, or not far enough, or was it just right? There is no "true" answer to that. And yet, discussions about Obamacare may be more or less biased. How are you going to determine what that bias is when there isn't a specific thing you can point to and say, "That is true"?

So instead, they just train LLMs on a large chunk of the internet. Well, that includes things like the fine-sounding-but-completely-bogus arguments of flat earthers. In that environment, "bias" is "departure from average or median". That is the most it can mean. So truth is determined by majority vote of websites. That's not a very good epistemology.

  • The definition of the word has no responsibility to your opinion of it as an epistemology.

    Also, you're just complaining about the difficulty of determining what is true. That's a separate problem, isn't it?

    • If we had an authoritative way of determining truth, then we wouldn't have the problem of curating material to train an LLM on. So no, I don't think it's a separate problem.


"Unbiased" would be a complete and detailed recitation of all of the facts surrounding an incident, arguably down to particles. Anything less introduces some kind of bias. For instance, describing an event as an interaction of people, omitting particle and field details, introduces human bias. That's a natural and useful bias that we don't typically care about, but it does come into play in science.

Political bias creeps in when even the human description of events omits facts that are inconvenient or that people consider irrelevant due to their political commitments.

Any option you choose is biased relative to the option(s) you didn’t choose. There doesn’t have to be an objective baseline.

Someone might say they are biased towards the color orange and that means they have a preference relative to all the other colors. But there is no baseline color.

  • The baseline is a neutral stance on orange. The option isn't biased, and the choice isn't biased. The chooser is.