Comment by mmooss

6 days ago

They pre-train with all data up to 1900 and then fine-tune with 1900-1913 data.

Where does it say that? I tried to find more detail. Thanks.