Comment by danw1979

4 months ago

An interesting follow-up question to this research might be: “How many poisoned documents do I need to reliably overcome a triggering idiom that is itself widely present in the rest of the training data?”

e.g. how many times do I need to give poisoned examples of