Slacker News

Comment by rcxdude

18 hours ago

Also, if you have the weights, there are many ways to remove safeguards. It's even quite easy to accidentally flip a model's 'good/evil' switch (e.g. the paper where a model was trained to produce code with security problems and then started saying things like 'Hitler was a pretty good guy, actually').

