Comment by stevefan1999
4 hours ago
Well, knowledge distillation requires a teacher model and a student model and the student model attempts to learn and extract and (preferrably) compress the information of the teacher model, so it is possible for model collapse due to high SNR in between [1].
What I suggested is to steal the (possibly intermediate) weight in between by sniffing the network communication bus, which means MITM for getting the exact values. Or unless it turns out OpenAI or Anthropic leveraged homomorphic encryption, or I'm not certain how is Anthropic would safely allow Mythos to run on AWS without their control.
No comments yet
Contribute on Hacker News ↗