Comment by giancarlostoro
16 hours ago
Funny when you consider the world owes a lot of AI advancements to both Meta and Google; their open releases really did shift things (feel free to correct me if I'm wrong), especially for China, which as far as I know wasn't releasing as much in AI before then. I remember when Meta first released Llama, people speculated about where it would go, but it wound up spawning a lot of projects, I'm sure some in China. I know Perplexity runs its own custom model on top of Llama as its default, and it's pretty darn good.
Wasn’t Llama a leak that got so popular Meta decided to change their whole approach?
I was working at Google at the time. Before Llama, releasing weights was not even worth a discussion.
If I'm remembering right, it was weirder than that; Llama's original release strategy was sort of bizarre.
You did have to apply for access, but if you met their criteria (basically if you were the right profile of researcher or in government), you got direct access to the model weights, not just an API for a hosted model. So access was restricted, but the full weights were shared.
I believe the model was leaked by multiple people, some of whom didn't work at Meta but had been granted access to the weights.
I don't think it was much of a leak: there was no significant verification (I think a .edu email was approved automatically). The gate was probably only there because copyright law around model weights was more uncertain then, and an academic/research gate gave them a stronger fair-use case.
Not sure, but open weights have had their effects. For example, look at Wan 2.2, the last open-weights Wan release: for the level of quality it provides, it's still the most powerful video model out there. Unfortunately it went closed source, but before that the community had built all sorts of tooling and LoRAs on top of it, and nothing comes close for video a year later. Back to Llama, though: look at all the open models people run offline on their Macs. It definitely had a net positive effect.
I'm curious how this view squares with the BERT and T5 releases, which prior to the current LLM craze were the de facto language models for pretty much any task. Would this position have grown anyway without the Llama release?
> Funny when you consider the world owes a lot of AI advancements to both Meta and Google
Funny how ByteDance kicked both their asses so hard at RecSys algorithms that they had to go back to the drawing board to meet the newly redefined expectations for the quality of short-form video recommendations.
Did they, though? That's the lore, but you can't really compare recommender-system performance across different populations and products.
Unlike common benchmarks for LLMs.
Also, Chinese companies are now almost single-handedly keeping the future of LLMs open source, DeepSeek being the pinnacle of this. Not only do they publish weights and code, they also publish detailed papers describing their approach.