Comment by pabs3

8 hours ago

It's unlikely that all the training data for Llama is publicly available, let alone under an open source license. Even if Llama actually had an open source license (IIRC it doesn't), that would still make it a Toxic Candy model under the Debian Deep Learning Team's Machine Learning policy. That means no one could replicate it exactly, even if they had the boatloads of cash it would take to buy enough hardware and electricity to do the training. Eventually the community could maybe find or create enough data, but that would be a new, different model.

https://salsa.debian.org/deeplearning-team/ml-policy