Comment by Rohansi

3 days ago

> Closed or open source doesn't matter; it's the ability to control them that's important. People have been cracking and patching for decades without source, but they have that control.

You have no idea what has been baked into the weights during training. In theory you could find biases and attempt to "patch" them out, but that's a vastly different process from patching machine code.

Consider what would happen if Google's open-weight models were better at writing code targeting Google's services than their competitors'. Is that something that could be patched? What if there were more subtle differences that you only notice much later, after some statistical analysis?

People are already patching these models using abliteration to prevent them from refusing any request, so it is possible for end users to change them in meaningful ways. You can download abliterated models right now from Hugging Face that will respond to all kinds of requests that frontier models refuse.
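For the curious, the core of abliteration is simple linear algebra: estimate a "refusal direction" in activation space from prompts the model refuses vs. complies with, then project that direction out of the weight matrices. A toy sketch with NumPy, using random stand-in data rather than a real model's activations:

```python
import numpy as np

rng = np.random.default_rng(0)
d = 64  # toy hidden size

# Stand-in residual-stream activations; in a real model these come from
# running refused vs. complied prompts through a given layer.
acts_refused = rng.normal(size=(100, d)) + 3.0 * np.eye(d)[0]  # shifted along dim 0
acts_complied = rng.normal(size=(100, d))

# 1. Estimate the "refusal direction" as the difference of mean activations.
refusal_dir = acts_refused.mean(axis=0) - acts_complied.mean(axis=0)
refusal_dir /= np.linalg.norm(refusal_dir)

# 2. Ablate it from a weight matrix: W' = (I - r r^T) W, so no input
#    can produce output along the refusal direction anymore.
W = rng.normal(size=(d, d))  # stand-in for an attention/MLP output matrix
W_abliterated = W - np.outer(refusal_dir, refusal_dir @ W)

# The patched matrix's outputs have (near) zero component along refusal_dir:
print(np.allclose(refusal_dir @ W_abliterated, 0.0))  # True
```

Note this is exactly the commenter's point in miniature: the edit is a blunt geometric projection, and you have no guarantee about what else lived along that direction.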

  • The problem is you can't reverse-engineer what was baked into the weights, because they are just weights. You'll never know whether you've fixed everything, because problems won't always be as obvious as request refusal. It's also not binary: you can't fully confirm that something is fixed, or that you haven't accidentally affected something else.

    They're impressive, for sure, but I don't see how anyone can push them as "open" when they are literally binary blobs. Worse, it's not practical for anyone to actually train LLMs that come even close to competing with the ones corporations are pumping out.

  • Yup, there are a ton of people on HN sleeping on this new tech because they refuse to look at anything AI. We now have jailbroken models, but the average person on here doesn't even know how to download and try one.

    • It doesn't help that the guides I've seen are pretty handwavy, or aren't specific enough to the individual situation ("I have Z hardware, here's how it's done"). It also doesn't help that every post I see on HN is like "oh waow, I did X on a Mac mini with 128GB RAM." That spec is beyond many people. Running on generally available resources (such as hardware one might have lying around the house) does not seem fit for purpose, so it's back to building a new machine (good luck when RAM is worth twice its weight in gold), buying a $1000+ Mac mini, or some other device. Any low-end system can't turn out tokens fast enough, or doesn't have the resources for context or processing.

      Local AI is not ready, and if you think it is, prove me wrong with a detailed guide to running a decently sized model on commodity hardware, with complete setup steps.

      I spent two weeks trying to get anything running on an 8GB RX 550 XT, 12GB of RAM, and an 8-core CPU. I even tried turboquant to lower memory utilization and still couldn't get a 3B or 4B model loaded, and anything smaller won't suit my needs (3B/4B are already pushing it).

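As a sanity check on why small GPUs struggle, the back-of-the-envelope memory arithmetic looks like this (the 1.2× overhead factor is a rough assumption for runtime buffers; KV cache for long contexts comes on top):

```python
def model_memory_gb(n_params_billion, bits_per_weight, overhead=1.2):
    """Rough estimate of RAM/VRAM needed just to hold an LLM's weights.

    overhead is a hypothetical fudge factor for runtime buffers;
    the KV cache for long contexts is extra. Rule of thumb only.
    """
    bytes_total = n_params_billion * 1e9 * bits_per_weight / 8
    return bytes_total * overhead / 1e9

for bits in (16, 8, 4):
    print(f"4B model @ {bits}-bit: ~{model_memory_gb(4, bits):.1f} GB")
```

By this estimate a 4B model needs roughly 2.4 GB at 4-bit but nearly 10 GB at fp16, which is why quantization choice, not just parameter count, decides whether a model fits on an 8GB card.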

There are a ton of methods for probing training data from a trained model, for both open- and closed-source models.
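One classic family of such probes is membership inference: text the model was trained on tends to get a noticeably lower loss than unseen text. A toy illustration with a unigram "model" standing in for an LLM (real attacks use per-token log-likelihoods from the actual network, but the thresholding idea is the same):

```python
import math
from collections import Counter

def train_unigram(corpus):
    """Fit a trivial unigram language model: word -> probability."""
    counts = Counter(w for doc in corpus for w in doc.split())
    total = sum(counts.values())
    return {w: c / total for w, c in counts.items()}

def nll(model, text, floor=1e-6):
    """Average negative log-likelihood per word; unseen words get a floor prob."""
    words = text.split()
    return -sum(math.log(model.get(w, floor)) for w in words) / len(words)

train_docs = ["the cat sat on the mat", "dogs chase cats"]
model = train_unigram(train_docs)

# A training-set member scores a much lower loss than unrelated text,
# which is the signal membership-inference attacks threshold on.
print(nll(model, "the cat sat on the mat") < nll(model, "quantum flux capacitors hum"))  # True
```

This only tells you whether a specific candidate text was likely in the training set, though; it doesn't let you enumerate what else was baked in, which is the parent comment's point.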