Comment by EGreg
13 days ago
What did Zuck mean when he said that Llama 4 Behemoth is already the highest-performing base model and hasn't even finished training yet? What are the benchmarks, then?
Does he mean they did pre-training but not fine-tuning?
You can fine-tune (and benchmark) an intermediate checkpoint of the model while pre-training is still running.