Comment by danw1979
4 days ago
An interesting follow-up question to this research might be: “How many poisoned documents do I need to reliably overcome a triggering idiom that is also widely present, in clean form, in the rest of the training data?”
E.g., how many poisoned examples of
if err != nil { <bad code> }
do I need to provide in order to get an unacceptable number of bad code outputs from the model?