Comment by moffkalast
4 hours ago
The outrage is less about them having human behaviours, I think, and more about them exhibiting those behaviours while omitting the internal processes required to recreate them accurately (and reliably). That makes the whole thing fundamentally fragile: instead of generalizing well, it hinges on manually covering the edge cases that break the spell, and there's always another edge case.
Training on a bunch of text someone wrote when they were mad doesn't capture the internal state of that person that caused the outburst, so it cannot be accurately reproduced by the system. The data does not exist.
Without the cause behind the effect you essentially have to predict hallucinations from noise, which makes the end result verisimilar nonsense: convincingly correlated with the real thing, but with no idea why it is the way it is. It's like training a blind man to describe a landscape from lots of written descriptions, with no idea what the colour green even is, only that it's something that tends to appear next to brown in nature. So the guy gets it kinda right because he's heard a description of that town before, we conclude he can actually see, and we tell him to drive a car next.
Another example: say you're trying to train a time series model to predict the weather. You take the last 200 years of rainfall data, feed it all in, and ask it to predict tomorrow's weather. It will probably learn that certain parts of the year get more or less rain, and that rain tends to follow long periods of sun and vice versa, but its accuracy will be about that of a coin toss, because it never sees the factors that actually influence rain: temperature, pressure, humidity, wind, cloud coverage, radar data. Even with all that information it's still going to be pretty bad, but at least it's an educated guess instead of an almost random one.
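The rainfall point can be made concrete with a toy simulation (entirely my own sketch, not from the comment; the "pressure" process and all numbers are made up for illustration): if rain is driven by a hidden variable the training data never records, a model that only sees past rainfall is capped well below a model that sees the driver.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 5000

# Hidden causal driver: a slowly varying AR(1) "pressure" series (hypothetical).
pressure = np.zeros(n)
for t in range(1, n):
    pressure[t] = 0.9 * pressure[t - 1] + rng.normal()

# Rain tends to happen when pressure is low, plus noise that the
# rainfall record alone can never explain.
rain = (pressure + rng.normal(0, 2, n) < 0).astype(int)

# A rainfall-history-only model, reduced to its essence:
# persistence, i.e. predict that tomorrow looks like today.
persistence_acc = (rain[:-1] == rain[1:]).mean()

# A model that sees the driver: predict rain tomorrow iff pressure is low today.
causal_acc = ((pressure[:-1] < 0).astype(int) == rain[1:]).mean()

print(f"rain-history only: {persistence_acc:.2f}")
print(f"with pressure:     {causal_acc:.2f}")
```

The history-only predictor does pick up the autocorrelation in rain (it beats a coin toss slightly), but the model that observes the actual driver is consistently better, which is the comment's point about missing causes.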
The DL modelling approach itself is not conceptually wrong; the data just happens to be complete garbage, so the end result is weird in ways that are hard to predict and correctly account for. We end up assuming the models know more than they realistically ever can. Sure, there are cases where it's possible to capture the entire domain with a dataset, e.g. math or abstract programming: clearly defined closed systems where we can generate as much synthetic data as needed to cover the whole problem domain. And LLMs do, as expected, perform much better in those domains when you actually do that.
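For what "generate as much synthetic data as needed" can look like in a closed domain, here's a minimal sketch of my own (the function name and ranges are made up): arithmetic labels are computed, not scraped, so every example is correct by construction and coverage is limited only by how many you generate.

```python
import operator
import random

# The closed system: a few well-defined operations with computable answers.
OPS = {"+": operator.add, "-": operator.sub, "*": operator.mul}

def make_pair(rng: random.Random, max_n: int = 999):
    """One synthetic training example; the label is computed, so it's always right."""
    a, b = rng.randint(0, max_n), rng.randint(0, max_n)
    op = rng.choice(list(OPS))
    return f"{a} {op} {b} =", str(OPS[op](a, b))

rng = random.Random(42)
# As much coverage of the problem domain as we care to pay for.
dataset = [make_pair(rng) for _ in range(100_000)]
print(dataset[0])
```

Contrast that with "text someone wrote when they were mad": there's no generator you can query for the hidden state behind it, so the equivalent dataset simply can't be manufactured.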