Comment by _heimdall
3 hours ago
I'm still not sure what safeguards they could be adding here. Unless they've suddenly solved alignment, isn't it at best a collection of system prompts saying what not to do, plus maybe some screening algorithms that try to catch key phrases in inputs and outputs?