Comment by EGreg
8 months ago
What did Zuck mean when he said that Llama 4 Behemoth is already the highest-performing base model and hasn't even finished training yet? What are the benchmarks, then?
Does he mean they did pre-training but not fine-tuning?
You can fine-tune a checkpoint of the model while pre-training is still in progress.