Comment by rybosome

8 months ago

A wonderful approach generally, and something we also do to some extent, but it's not a substitute for fine-tuning in our case.

We are working in a domain where there is very limited training data, so what we really want is continued pre-training over a larger dataset. Absent that, fine-tuning is highly effective for non-NLP tasks.