Comment by kubb
12 hours ago
This is condescending and wrong at the same time (best combo).
LLMs do stumble into long prediction chains that don’t lead the inference in any useful direction, wasting tokens and compute.
Are you sure about that? Chain of thought does not need to be semantically useful to improve LLM performance. https://arxiv.org/abs/2404.15758
If you're misusing LLMs to solve TC^0 problems, which is what the paper is about, then... you also don't need the slop lavine. You can just inject a bunch of filler tokens yourself.
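Something like this, to be concrete. A minimal sketch of what I mean; with_filler and generate are hypothetical names, not anything from the repo or the paper's code:

    def with_filler(prompt: str, n_filler: int = 128, filler: str = " .") -> str:
        # Append n_filler copies of a filler token to the prompt,
        # mimicking the paper's "......" filler-token setup.
        return prompt + filler * n_filler

    def solve(generate, question: str) -> str:
        # generate is whatever completion call you already have;
        # the point is that the filler comes from you, not the model.
        padded = with_filler("Q: " + question + "\nA:")
        return generate(padded)

    # Usage with a stand-in for a real model call:
    answer = solve(lambda p: "(completion for a %d-char prompt)" % len(p),
                   "some 3SUM instance")

No repo required: the filler is fixed text, so prepending it yourself is equivalent.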
That still doesn't mean all the tokens are useful. That's the point of benchmarks.
Care to share the benchmarks backing the claims in this repo?