Comment by davedx

6 hours ago

Dude, Chinese labs run distillation attacks via the APIs. If Musk wanted to do something like that, technically he could. Legally it would be a giant slam-dunk liability, though.

Well, knowledge distillation requires a teacher model and a student model: the student attempts to learn, extract, and (preferably) compress the information in the teacher model, so model collapse is possible when the SNR between them is poor [1].
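To make the teacher/student relationship concrete, here is a minimal sketch of the classic soft-target distillation loss: the student is penalized by the KL divergence between the teacher's and the student's temperature-softened output distributions. It assumes the teacher's logits are available, which a hosted API typically does not expose in full, so API-based distillation usually has to work from sampled text instead. The function names and the default temperature are illustrative, not anyone's actual setup.

```python
import math


def softmax(logits, temperature=1.0):
    """Temperature-scaled softmax over a list of logits."""
    scaled = [z / temperature for z in logits]
    m = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(s - m) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def distillation_loss(teacher_logits, student_logits, temperature=2.0):
    """KL(teacher || student) on softened distributions.

    The temperature**2 factor is the usual rescaling that keeps
    gradient magnitudes comparable across temperatures.
    """
    p = softmax(teacher_logits, temperature)  # teacher soft targets
    q = softmax(student_logits, temperature)  # student predictions
    kl = sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)
    return temperature ** 2 * kl


# Hypothetical logits for a 3-class toy problem.
teacher = [4.0, 1.0, 0.5]
good_student = [3.8, 1.1, 0.4]
bad_student = [0.5, 1.0, 4.0]

loss_good = distillation_loss(teacher, good_student)
loss_bad = distillation_loss(teacher, bad_student)
```

A student whose outputs track the teacher's gets a loss near zero, while one that disagrees gets a larger loss. Note that none of this touches the teacher's weights: the student only ever sees outputs.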

What I suggested is stealing the (possibly intermediate) weights by sniffing the network communication bus, i.e. a MITM attack to get the exact values. Unless it turns out OpenAI or Anthropic uses homomorphic encryption, I'm not certain how Anthropic could safely allow Mythos to run on AWS, outside their control.

[1]: https://en.wikipedia.org/wiki/Knowledge_distillation

Distilling is different from "siphoning the model weights". I would think that Anthropic has a system for this. After all, they deploy to different clouds already. Their weights are worth billions, so I presume they take security very seriously and have done a lot of homework to trust no one.