Comment by layer8
1 year ago
That struck me as well. While the training data is biased in various ways (as media in general are), it should still contain enough information for the AI to judge reasonably well what a less biased, reality-reflecting balance would be. For example, it should know that there are male nurses, black politicians, etc., and represent that appropriately. Black Nazi soldiers are so far out that it casts doubt on either the AI’s world model in the first place, or on the ability to apply controlled corrections with sufficient precision.
You are literally saying that the training data, despite its bias, should somehow enable the AI to correct for that bias and achieve a different understanding, which is self-contradictory. You are suggesting that the data both omits and contains the same information.
I wonder if we’ll ever get something like ‘AI-recursion’, where you have an AI apply specific transformations to data that is then used for training, sort of like machines making better machines.
E.g. take some data A, and then have a model (for instance ChatGPT-like) extrapolate from it, potentially adding new depth or detail to the given data.
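A minimal sketch of that loop, assuming a Hugging Face text-generation pipeline stands in for the "ChatGPT-like" model; the model choice, prompt wording, and the `extrapolate` helper are all illustrative, not anything from the comment:

```python
# Toy "AI-recursion" loop: a model elaborates on seed data,
# and its outputs are folded back into the training corpus.
# Sketch only; model, prompt, and seed data are illustrative.
from transformers import pipeline

generator = pipeline("text-generation", model="gpt2")

def extrapolate(record: str) -> str:
    """Ask the model to add detail to one piece of data A."""
    prompt = f"Expand on the following with more detail:\n{record}\n"
    out = generator(prompt, max_new_tokens=100, do_sample=True)
    return out[0]["generated_text"]

seed_data = ["The city installed new bike lanes in 2021."]
corpus = list(seed_data)

for generation in range(3):
    # Each pass feeds on the previous pass's output.
    corpus = [extrapolate(r) for r in corpus]
    # ...fine-tune the next model on `corpus` here (omitted)...
```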
Apparently the biases in the output tend to be stronger than those in the training set. Or so I read.
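That amplification is easy to reproduce in a toy setting: if each generation trains on the previous generation's samples and the model adds even a tiny systematic skew per round, a modest initial imbalance compounds toward the extreme. Every number below is made up purely for illustration:

```python
# Toy illustration of bias amplification under self-training:
# estimate a class proportion from samples, resample from the
# estimate plus a small per-round skew, and repeat.
import random

p = 0.6          # majority-class share in the original data (illustrative)
n = 200          # training-set size per generation (illustrative)
skew = 0.02      # assumed small systematic bias the model adds each round

for gen in range(10):
    samples = [random.random() < p for _ in range(n)]
    p = min(1.0, sum(samples) / n + skew)   # next "model" trains on these
    print(f"generation {gen}: majority share = {p:.2f}")
```

Even with `skew = 0`, the finite resampling alone makes the proportion drift until it absorbs at 0 or 1 eventually; the skew term just models a single-pass output bias compounding across generations.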