Comment by MattDamonSpace
2 hours ago
“So the feature mostly punishes the exact people who are easier to fingerprint: normal developers doing weird but legitimate things”
What’s the punishment here exactly?
Higher odds of being banned for legitimate usage.
Output poisoning and/or eventual account bans, if I had to guess.
Returning invalid or poisoned results that are not what you paid for.
They probably run a heavily dumbed down version of the model, same as what they got caught doing with Fable.
And that's also why, as a legitimate customer, I want none of it: you never know if you've accidentally entered a zone they don't like.
"got caught"
to clarify, this behavior was announced with the model release
The extent is what got caught.
if by announce you mean shove it somewhere in a pdf with hundreds of pages, yes
> What’s the punishment here exactly?
Seeing as how Anthropic cannot stop raising a stink about "illicit Chinese distillation attacks" every month or so, I'd bet money on them either already silently degrading model performance if any of the identification patterns match, or, at the very least, considering it/doing dry runs.