
Comment by DoctorOetker

20 days ago

Will catastrophic forgetting still occur if a fraction of the update sentences are drawn from the original training corpus?

Is the real issue actually catastrophic forgetting, or overfitting?

Nothing prevents users from continuing the training as they use a model.
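The mixing idea in the first question is usually called rehearsal (or replay). A toy 1-D sketch of it, with quadratic losses standing in for the real training objectives (the loss shapes and mixing ratio are illustrative assumptions, not anything measured on an LLM):

```python
def grad_a(w):          # d/dw of Task A loss (w - 1)**2, valley at w = +1
    return 2 * (w - 1)

def grad_b(w):          # d/dw of Task B loss (w + 1)**2, valley at w = -1
    return 2 * (w + 1)

lr = 0.1

# Fine-tune on Task B alone: w drifts all the way to Task B's valley.
w = 1.0
for _ in range(200):
    w -= lr * grad_b(w)
loss_a_plain = (w - 1) ** 2        # ~4.0: Task A badly forgotten

# Rehearsal: mix 30% Task A gradients back into the updates.
w = 1.0
for _ in range(200):
    w -= lr * (0.7 * grad_b(w) + 0.3 * grad_a(w))
loss_a_mixed = (w - 1) ** 2        # ~1.96: forgetting reduced, not eliminated

print(loss_a_plain, loss_a_mixed)
```

The mixed update settles at a compromise between the two valleys, so replaying original data does dampen forgetting, but in this toy it does not remove it.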

Catastrophic forgetting is overfitting.

  • No, it’s actually the math of overwriting. Imagine you hiked down into a valley (Task A) and settled there. Then you decide to climb over a mountain to find a different valley (Task B). You successfully reach the new valley, but in doing so you destroy the path back to the first one. You are now stuck in the new valley and have completely 'forgotten' how to get back.
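The valley picture can be made concrete with one parameter and two quadratic "valleys" (an illustrative toy, not LLM training): gradient descent on Task B's loss contains no term that remembers Task A, so Task A's loss climbs step after step.

```python
loss_a = lambda w: (w - 1) ** 2   # Task A's valley at w = +1
loss_b = lambda w: (w + 1) ** 2   # Task B's valley at w = -1

w = 1.0                           # start settled at Task A's minimum
lr = 0.1
history = []
for _ in range(30):
    w -= lr * 2 * (w + 1)         # descend d(loss_B)/dw only
    history.append(loss_a(w))

# Task A's loss rises monotonically while Task B's falls:
# nothing in the update protects the old valley.
print(history[0], history[-1])
```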

  • Not exactly; not at all, even in terms of how LLMs are trained.

    In RL it can happen that you stop getting meaningful data because the model is 'too good': you no longer receive the "this is a bad answer" signal, so you can't estimate the gradient.
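That saturation point can be sketched with a mean-baseline policy-gradient setup (a simplified REINFORCE-style illustration, not any specific training recipe): once every sampled answer earns the same reward, all the advantages are zero and the update signal vanishes.

```python
def advantages(rewards):
    # Advantage of each sample = reward minus the group's mean baseline.
    baseline = sum(rewards) / len(rewards)
    return [r - baseline for r in rewards]

mixed = advantages([1.0, 0.0, 1.0, 0.0])       # good and bad answers: signal survives
saturated = advantages([1.0, 1.0, 1.0, 1.0])   # all "good": no gradient signal left

print(mixed)      # [0.5, -0.5, 0.5, -0.5]
print(saturated)  # [0.0, 0.0, 0.0, 0.0]
```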