Comment by latentsea
9 days ago
> We might be incentivizing answers that sound right with reinforcement learning as opposed to answers that are actually right.
We do this with other humans, so I don't know that we know how to avoid doing the same with machines.