Comment by Der_Einzige

3 months ago

"This sampling strategy ... [is] terribly misguided, because they don't fix the underlying mode collapse... If you naively suppress repetitive n-grams ... it will just slip out at the first chance, spamming you with minor non-repetitive variations of the same high-level idea."

This is a colossal strawman! You're confusing two completely different problems:

1. Semantic mode collapse: the model is genuinely stuck on a handful of high-level concepts and can't think of anything new to say. This is a deep pre-training or alignment problem.

2. Linguistic pattern over-usage ("slop"): the model has a rich internal distribution of ideas but has learned through RLHF or DPO that a few specific phrasings get the highest reward. This is a surface-level, but extremely annoying, problem for a wide variety of use cases!

Our paper, Antislop, is explicitly designed to solve problem #2.

Your example of "You're absolutely right" becoming "You're spot-on" is what happens when you use a bad suppression technique. Antislop's method is far more sophisticated. Read the paper! The FTPO trainer is built on preference pairs where the "chosen" tokens are coherent alternatives sampled from the model's own distribution.
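To be concrete about what FTPO consumes, here's a toy sketch of the shape of a token-level preference pair. This is my illustration, not the paper's code; the type and field names are invented.

```python
# Toy illustration (not the paper's FTPO code) of a token-level preference
# pair: the "rejected" token starts an over-used phrase; the "chosen" tokens
# are coherent alternatives resampled from the model's own distribution.
from dataclasses import dataclass

@dataclass
class TokenPreferencePair:
    context_ids: list[int]  # prompt + generated prefix up to the slop phrase
    rejected_id: int        # first token of the banned phrase
    chosen_ids: list[int]   # alternatives the model itself assigns real probability to
```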

"You'll never have the actual semantic variety unless you fix mode collapse. Referencing n-grams or manually constructed regexes as a source of semantical diversity automatically makes the method invalid..."

You write like someone who thinks "n-gram" is a dirty word and stopped reading there.

First, the patterns aren't "manually constructed." From Section 3.1, they are identified statistically by finding phrases that are massively overrepresented in LLM text compared to pre-2022 human text. We did data-driven forensics...
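For anyone who won't open the paper, the flavor of that forensics is easy to sketch. Function names, the threshold, and the smoothing below are mine, not Section 3.1's exact pipeline:

```python
# Sketch of data-driven slop detection: flag n-grams that are massively
# over-represented in LLM output relative to pre-2022 human text.
# (My illustration; thresholds and smoothing are arbitrary, not the paper's.)
from collections import Counter

def ngram_counts(texts, n=3):
    counts = Counter()
    for text in texts:
        tokens = text.lower().split()
        counts.update(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))
    return counts

def overrepresented_ngrams(llm_texts, human_texts, n=3, min_ratio=20.0):
    llm, human = ngram_counts(llm_texts, n), ngram_counts(human_texts, n)
    llm_total, human_total = sum(llm.values()), sum(human.values())
    flagged = []
    for gram, count in llm.items():
        llm_rate = count / llm_total
        # +1 smoothing so phrases absent from the human corpus don't divide by zero
        human_rate = (human.get(gram, 0) + 1) / (human_total + 1)
        ratio = llm_rate / human_rate
        if ratio >= min_ratio:
            flagged.append((gram, ratio))
    return sorted(flagged, key=lambda item: -item[1])
```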

Second, our paper's method explicitly relies on good sampling techniques to find diverse alternatives. From Section 4.1:

"...we then resample from the adjusted distribution, using min-p filtering to constrain the distribution to coherent candidates..."

It's frankly insane that you and half the field are still ignoring this. The reason models produce repetitive "slop" in the first place is that everyone runs them at temperature=0.7 and top_p=0.9. Those settings produce bland, mean-chasing output, and you mistake that artifact for a general property of the models because the whole field refuses to try much higher temperatures with better sampling settings.

You want real diversity? Crank the temperature to 5.0 or higher to flatten the distribution, then use min-p sampling (introduced by Nguyen et al. and cited in this very paper!) or an even better filter like top-nσ to cut off the incoherent tail. This gives the model access to its full creative range.
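Under my reading of top-nσ (keep tokens whose raw logit lies within n standard deviations of the maximum; the mask is computed before temperature, so it doesn't blow up when you flatten), the recipe looks like this. Parameter values are illustrative:

```python
# Sketch of the high-temperature recipe: flatten with T=5.0, then cut the
# incoherent tail with a top-n-sigma mask computed on the RAW logits, so the
# cutoff is unaffected by the temperature. (My approximation of the method.)
import numpy as np

def sample_hot_top_nsigma(logits, temperature=5.0, n_sigma=1.0, rng=None):
    rng = rng or np.random.default_rng()
    threshold = logits.max() - n_sigma * logits.std()
    masked = np.where(logits >= threshold, logits / temperature, -np.inf)
    probs = np.exp(masked - masked.max())
    probs /= probs.sum()
    return int(rng.choice(len(probs), p=probs))
```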

"I can't believe that after all this time you persist in this and don't see the obvious issue that's been pointed at multiple times."

The only "obvious issue" here is a failure to read the paper past the abstract. This paper's entire methodology is a direct refutation of the simplistic n-gram banning you imagine. FTPO works on the logit level with careful regularization (Figure 4b) to avoid the exact kind of model degradation you're worried about. FTPO maintains MMLU/GSM8K scores and improves lexical diversity, while DPO tanks it.