Comment by mitthrowaway2
9 months ago
> For example, in the future we may wish to monitor the chain of thought for signs of manipulating the user. [...] Therefore we have decided not to show the raw chains of thought to users.
Better not let the user see the part where the AI says "Next, let's manipulate the user by lying to them". It's for their own good, after all! We wouldn't want to make an unaligned chain of thought directly visible!