Comment by DebtDeflation

1 year ago

Yes. It's called "catastrophic forgetting". These models were trained on trillions of tokens and then underwent a significant RLHF process. Fine-tuning them on your tiny dataset (tiny relative to the original training data) almost always makes the model worse at everything else.

There's also the issue of updating changed information. This is easy with RAG: replace the document in the repository with a new version and it just works. Not so easy with fine-tuning, since you can't identify and update just the weights that encode the outdated information (there's research in this area, usually called model editing, but it's early days).
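
To make the RAG half of that concrete, here's a toy sketch. Everything in it is hypothetical: the `embed` stub stands in for a real embedding model, and a production system would use an actual vector database. The point is that the index is keyed by document ID, so fixing stale information is a single overwrite and re-embed; no model weights are touched:

```python
import numpy as np

def embed(text: str) -> np.ndarray:
    # Hypothetical stand-in for a real embedding model (a deterministic
    # pseudo-embedding, just to keep the example self-contained).
    rng = np.random.default_rng(abs(hash(text)) % (2**32))
    vec = rng.standard_normal(384)
    return vec / np.linalg.norm(vec)

class DocStore:
    """Toy RAG index: doc_id -> (text, embedding)."""

    def __init__(self) -> None:
        self.docs: dict[str, tuple[str, np.ndarray]] = {}

    def upsert(self, doc_id: str, text: str) -> None:
        # Updating changed information = overwrite the entry and
        # re-embed the new version; nothing else in the system moves.
        self.docs[doc_id] = (text, embed(text))

    def retrieve(self, query: str, k: int = 3) -> list[str]:
        # Rank stored documents by cosine similarity to the query
        # (embeddings are unit-normalized, so a dot product suffices).
        q = embed(query)
        ranked = sorted(self.docs.values(),
                        key=lambda doc: float(q @ doc[1]),
                        reverse=True)
        return [text for text, _ in ranked[:k]]

store = DocStore()
store.upsert("pricing", "Plan A costs $10/month.")
# The price changed: replace the document, and retrieval
# immediately serves the new version.
store.upsert("pricing", "Plan A costs $12/month.")
print(store.retrieve("How much does Plan A cost?", k=1))
# -> ['Plan A costs $12/month.']
```

The fine-tuning equivalent has no `doc_id` to key on: the old fact is smeared across millions of weights, which is exactly why targeted updates are hard.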