Comment by NitpickLawyer
4 hours ago
> This project supports steering with single-vector activation directions; [...] This is also useful for cybersecurity researchers who want to reduce a model's willingness to provide dual-use or offensive security guidance.
Wink wink, nudge nudge.
I have a feeling most cybersec researchers would only be interested in negative values of "reduce" :D