Comment by HenryMulligan
1 day ago
Ignoring what this model architecture could do and just considering what this model does do, why would I (or anyone) want to run this model (locally) to do <insert use-case>? Is it entirely a proof-of-concept for future training on medical data? Are they looking to use this to attempt to ethically justify training on (free-tier) users' personal data via the application of noise to the training data?
You can hide that you pirated content for training.
You can't hide that. You can't use technical measures to hide from discovery.
I think an entire book is a little too large to mask with this method and still end up learning anything.
You can avoid the book publisher lawsuit that Anthropic is dealing with by using this approach.
It's the last option.
The whole framing of DP is:
The probability that you reveal private info is (almost) the same whether or not you train on a particular user's data.
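For reference, here is the usual formal statement behind that framing. This is a sketch of the standard (ε, δ) definition, not something taken from this model's documentation, and the symbols M, D, D', and S are just the conventional names. A randomized training mechanism M is (ε, δ)-differentially private if, for any two datasets D and D' that differ in one record (or one user's data, for user-level DP) and any set S of possible outputs:

```latex
\Pr[M(D) \in S] \;\le\; e^{\varepsilon} \, \Pr[M(D') \in S] + \delta
```

Smaller ε and δ mean the two output distributions are closer together, so someone looking at the trained model learns less about whether any given user was in the training set.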
It is useful in many cases, but Google, the product company, is specifically going to use it for ads.
The purpose is research.