Comment by minimaxir

1 year ago

Safety is both the system prompt and the RLHF post-training that teaches the model to refuse adversarial inputs.
