Comment by heyitsguay
8 days ago
The attack surface for local LLMs is much smaller than for almost any other program you would download. Make sure you trust whatever LLM execution stack is being used (apparently MLX here? I'm not familiar with that one specifically), and then the amount of additional code associated with a given LLM should be tiny - most of it is a weight blob that may be tough to interpret but can't really do anything nefarious; data just passes through it.
Again, not sure what MLX does, but cf. the files for DeepSeek-R1 on Hugging Face: https://huggingface.co/deepseek-ai/DeepSeek-R1/tree/main
Two files contain arbitrary executable code: one defines a simple config on top of a common config class, the other defines the model architecture. Even if you can't verify for yourself that nothing sneaky is happening, it's easy for the community to do so, because the structure of valid config + model definition files is so tightly constrained - no network calls, no filesystem access, just definitions of (usually PyTorch) model layers that get assembled into a computation graph. Anything deviating from that form is going to stand out, which makes these files quite easy to analyze.
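To make the "tightly constrained" point concrete, here's a rough sketch of the shape those two files take (illustrative names only, not the actual DeepSeek code): a config subclass holding hyperparameters, and a model class that just wires layers together.

    # Illustrative sketch only -- not the actual DeepSeek-R1 files.
    import torch
    import torch.nn as nn
    from transformers import PretrainedConfig, PreTrainedModel

    class ToyConfig(PretrainedConfig):
        model_type = "toy"
        def __init__(self, hidden_size=64, num_layers=2, vocab_size=1000, **kwargs):
            self.hidden_size = hidden_size
            self.num_layers = num_layers
            self.vocab_size = vocab_size
            super().__init__(**kwargs)

    class ToyModel(PreTrainedModel):
        config_class = ToyConfig
        def __init__(self, config):
            super().__init__(config)
            self.embed = nn.Embedding(config.vocab_size, config.hidden_size)
            self.layers = nn.ModuleList(
                nn.Linear(config.hidden_size, config.hidden_size)
                for _ in range(config.num_layers)
            )
        def forward(self, input_ids):
            x = self.embed(input_ids)
            for layer in self.layers:
                x = torch.relu(layer(x))
            return x
    # No sockets, no subprocess, no file I/O -- anything like that would stand
    # out immediately in a modeling file.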
> and then the amount of additional code associated with a given LLM should be tiny
What about this reporting (which seems to boil down to a deserialization issue)? A quick sketch of the underlying problem follows the links:
- https://www.wiz.io/blog/wiz-and-hugging-face-address-risks-t...
- https://jfrog.com/blog/data-scientists-targeted-by-malicious...
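For context, the classic PyTorch .bin checkpoint is a pickle archive, and unpickling untrusted data can run arbitrary code. A minimal (harmless) illustration of the mechanism:

    # Why pickle-based model files are risky: unpickling can execute code.
    import os
    import pickle

    class Malicious:
        def __reduce__(self):
            # Runs on pickle.loads(); a real payload wouldn't be this polite.
            return (os.system, ("echo pwned",))

    blob = pickle.dumps(Malicious())
    pickle.loads(blob)  # executes `echo pwned` just by loading the blob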
This project apparently uses MLX, Apple’s ML framework, which doesn’t rely on the Python pickle library that’s behind those safety issues. There are several options for storing models/tensors in MLX, none of which, as far as I can tell, have such (de-)serialization issues: https://ml-explore.github.io/mlx/build/html/usage/saving_and...
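If I'm reading those MLX docs right, saving and loading looks roughly like this (a sketch, exact APIs may vary by version). Safetensors-style containers are just a small header plus raw tensor bytes, so there's nothing executable to deserialize:

    # Sketch based on the linked MLX docs; not taken from this project's code.
    import mlx.core as mx

    weights = {"w1": mx.random.normal((4, 4)), "b1": mx.zeros((4,))}
    mx.save_safetensors("model.safetensors", weights)  # plain tensor container
    restored = mx.load("model.safetensors")            # dict of arrays, no code run
    print(restored["w1"].shape)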