Comment by henry2023
18 hours ago
Wasn’t Llama a leak that got so popular Meta decided to change their whole approach?
I was working at Google at the time. Before Llama, releasing weights was not even worth a discussion.
If I'm remembering right, it was weirder than that, as Llama's original release strategy was sort of bizarre.
You did have to apply for access, but if you met their criteria (basically if you were the right profile of researcher or in government), you got direct access to the model weights, not just an API for a hosted model. So access was restricted, but the full weights were shared.
I believe the model was leaked by multiple people, some of whom didn't work at Meta but had been granted access to the weights.
I don't think it was much of a leak; there was no significant verification--I think .edu emails were approved automatically. The gate was probably just there because copyright law around model weights was more uncertain then, and they had a stronger fair use case with an academic/research restriction on it.
Not sure, but open weights have had their effects. For example, look at Wan 2.2, the last open-weights Wan release: it's still the most powerful video inference model out there for the quality it provides. Unfortunately, it went closed source, but before that happened, the community had built all sorts of tooling and LoRAs on top of it. Nothing has come close for video a year later. Back to Llama, though: look at all the open models people run offline on their Macs. It was definitely a net positive.
I'm curious how this view fits with the BERT and T5 releases, which, prior to the current LLM craze, were the de facto language models for pretty much any task. Would this position have grown on its own without the Llama release?