Comment by grej
6 days ago
Related to this, is anyone aware of a benchmark for this kind of thing, maybe broadly the category of "context rot"? Something that tracks both how content not germane to the current question adversely affects responses, and how a large volume of germane but deep context leaves models unable to follow the conversation. I've definitely experienced the latter with coding models.
In computer vision, noise is added to images during training as a form of augmentation. Maybe LLM providers should do the same during RL: inject irrelevant context so the model learns to answer correctly in spite of it.
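A minimal sketch of what that could look like, to make the analogy concrete. The distractor pool and `add_context_noise` function are hypothetical, not anything a provider has published; they just mirror the role pixel noise plays in vision pipelines:

```python
import random

# Hypothetical pool of passages unrelated to any training task,
# playing the role that pixel noise plays in vision augmentation.
DISTRACTOR_POOL = [
    "The 1964 World's Fair was held in Queens, New York.",
    "Sourdough starters need to be fed roughly once a day.",
    "The median lifespan of a fruit fly is about 40 to 50 days.",
]

def add_context_noise(prompt: str, p: float = 0.3, max_distractors: int = 2) -> str:
    """With probability p, prepend irrelevant passages to the prompt.

    Analogous to noise augmentation in CV training: the model is
    rewarded for answering correctly *despite* the junk context.
    """
    if random.random() > p:
        return prompt
    k = random.randint(1, max_distractors)
    noise = "\n\n".join(random.sample(DISTRACTOR_POOL, k))
    return f"{noise}\n\n{prompt}"

# During RL rollout construction, the noisy prompt would replace the clean one:
# rollout_prompt = add_context_noise(task_prompt)
```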
Not sure, but this sounds like a very similar problem to prompt injection.