Comment by clueless

14 days ago

> Knowledge cutoff: August 2024.

Could this mean training time is generally around 6 months, with 2 months of Q/A?

Couldn’t you gradually include more recent documents as you train?

  • You can do that, but the amount of incremental data will be negligible compared to the rest of the data. Think of the knowledge cutoff as more of a soft value.
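To put a rough number on "negligible", here is a minimal sketch of the arithmetic. Every figure in it (corpus size, monthly volume of new documents, number of months) is a made-up placeholder for illustration, not a value from any real training run, and `incremental_fraction` is a hypothetical helper name.

```python
def incremental_fraction(base_tokens: float,
                         new_tokens_per_month: float,
                         months: float) -> float:
    """Share of the final corpus that comes from documents added during training."""
    added = new_tokens_per_month * months
    return added / (base_tokens + added)


# Hypothetical inputs: a 15T-token base corpus, 50B tokens of fresh
# documents per month, folded in over a 6-month run.
frac = incremental_fraction(base_tokens=15e12,
                            new_tokens_per_month=50e9,
                            months=6)
print(f"Recent documents make up {frac:.2%} of the corpus")
```

Under these assumed inputs the recent documents are a couple of percent of what the model sees, which is one way to read the cutoff as a soft value rather than a hard line.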

Training time scales with the dataset you want the model exposed to and the compute you have available, so any specific time box is kind of meaningless if you don’t know the rest of the inputs that went into it. The Llama 3 paper goes into a lot of this and how these decisions were made (see section 3 and onward): https://ai.meta.com/research/publications/the-llama-3-herd-o...
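As a back-of-the-envelope illustration of why the time box depends on the other inputs, the sketch below uses the common rule of thumb that training a dense transformer costs roughly 6 × parameters × tokens FLOPs. That approximation, the helper name `training_days`, and every input value are assumptions chosen for illustration; none of them are taken from the paper.

```python
SECONDS_PER_DAY = 86_400


def training_days(params: float, tokens: float, num_gpus: int,
                  peak_flops_per_gpu: float, utilization: float) -> float:
    """Rough wall-clock estimate: total FLOPs / sustained cluster throughput."""
    total_flops = 6 * params * tokens  # rule-of-thumb cost of a training run
    sustained_flops_per_sec = num_gpus * peak_flops_per_gpu * utilization
    return total_flops / sustained_flops_per_sec / SECONDS_PER_DAY


# Hypothetical setup: 70B parameters, 2T tokens, 2048 accelerators at
# 1 PFLOP/s peak each, running at 40% utilization.
base = training_days(70e9, 2e12, 2048, 1e15, 0.4)

# Same model and data on twice the hardware: the estimate halves.
more_compute = training_days(70e9, 2e12, 4096, 1e15, 0.4)

# Same hardware, twice the data: the estimate doubles.
more_data = training_days(70e9, 4e12, 2048, 1e15, 0.4)

print(f"base: {base:.1f} days, 2x GPUs: {more_compute:.1f} days, "
      f"2x tokens: {more_data:.1f} days")
```

The same model can come out at very different durations depending on cluster size and token budget, and the estimate ignores restarts, data pipeline stalls, and post-training.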

tl;dr: Llama 3 was 54 days, but it’s more complicated than that.