Comment by CGamesPlay
7 months ago
What you said about RAG makes sense, but my understanding is that fine-tuning is actually not very good at getting deeper understanding out of LLMs. It's more useful for teaching general instructions like output format rather than teaching deep concepts like a new domain of science.
This is true if you don't know what you're doing, so it is good advice for the vast majority.
Fine-tuning is just training. You can completely change the model if you want; you can make it learn anything you want.
But there are MANY challenges in doing so.
This isn't true either, because if you don't have access to the original data set, the model will overfit to your fine-tuning data set and (in extreme cases) lose its ability to even do basic reasoning.
Yes. It's called "catastrophic forgetting". These models were trained on trillions of tokens and then underwent a significant RLHF process; fine-tuning them on your tiny data set (relative to the original training data) almost always makes the model worse at everything else.

There's also the issue of updating changed information. This is easy with RAG: replace the document in the repository with a new version and it just works. It's not so easy with fine-tuning, since you can't identify and update just the weights that changed (there's research in this area, but it's early days).
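To make the RAG side concrete, here's a minimal sketch of that update path, assuming a Chroma-style vector store (the collection name, IDs, and document text are placeholders I made up):

```python
# Minimal sketch of the RAG update path described above, using chromadb.
# The collection name, IDs, and documents are illustrative placeholders.
import chromadb

client = chromadb.Client()
collection = client.get_or_create_collection("docs")

# Initial ingestion: the document is embedded and stored under a stable ID.
collection.upsert(
    ids=["policy-v1"],
    documents=["Refunds are processed within 30 days."],
)

# Updating changed information is just re-upserting under the same ID;
# the old embedding is overwritten and retrieval immediately reflects it.
collection.upsert(
    ids=["policy-v1"],
    documents=["Refunds are processed within 14 days."],
)

results = collection.query(query_texts=["How long do refunds take?"], n_results=1)
print(results["documents"])  # now retrieves the 14-day version
```

There's no per-weight surgery involved: the store is the source of truth, which is exactly why updates are trivial here and hard with fine-tuning.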
Again, that's why I said it is challenging.
I regularly fine-tune models with good results and little damage to the base functionality.
It is possible, but it's too complex for the majority of users. It requires a lot of work per dataset you want the model trained on.
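For anyone curious what "possible but complex" can look like, here's a hedged sketch of one common recipe (not necessarily what the parent commenter does): train small LoRA adapters on a frozen base model and mix a little general "replay" text into the domain data to limit forgetting. The model name, example texts, and hyperparameters below are all placeholders.

```python
# Hedged sketch of a low-damage fine-tune: freeze the base weights, train
# tiny LoRA adapters, and blend some general-domain "replay" examples into
# the training mix so the update doesn't pull the model too far from its
# original distribution. Everything concrete here is a placeholder.
from datasets import Dataset
from peft import LoraConfig, get_peft_model
from transformers import (AutoModelForCausalLM, AutoTokenizer,
                          DataCollatorForLanguageModeling, Trainer,
                          TrainingArguments)

model_name = "gpt2"  # placeholder; the same pattern applies to larger LMs
tokenizer = AutoTokenizer.from_pretrained(model_name)
tokenizer.pad_token = tokenizer.eos_token
model = get_peft_model(
    AutoModelForCausalLM.from_pretrained(model_name),
    LoraConfig(r=8, lora_alpha=16, lora_dropout=0.05, task_type="CAUSAL_LM"),
)
model.print_trainable_parameters()  # adapters are a tiny fraction of the weights

domain_texts = ["<your new-domain examples go here>"] * 8          # placeholder data
replay_texts = ["<general text sampled from a public corpus>"] * 2  # ~20% replay

def tokenize(batch):
    return tokenizer(batch["text"], truncation=True, max_length=128)

train_ds = Dataset.from_dict({"text": domain_texts + replay_texts}).map(
    tokenize, batched=True, remove_columns=["text"]
)

Trainer(
    model=model,
    args=TrainingArguments(output_dir="lora-out", num_train_epochs=1,
                           per_device_train_batch_size=2, learning_rate=1e-4),
    train_dataset=train_ds,
    data_collator=DataCollatorForLanguageModeling(tokenizer=tokenizer, mlm=False),
).train()
```

Because the base weights stay frozen, you can drop or swap the adapter later, which also caps how much of the original capability a bad run can destroy. It still doesn't solve the "update one fact" problem, though, which is why the RAG point above stands.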