Comment by red75prime
3 hours ago
Reasoning lets a model produce statements that are more likely to be true, given statements that are known to be true. You'd need to structure your "falsehood training data" in a specific way for an LLM to generalize from it as well as it does from regular data (instead of just memorizing noise). And then you'd end up with a reasoning model that remembers false premises.
It seems you generated your text from the "stochastic parrot" hypothesis with no post-validation.