Comment by hatefulmoron

9 days ago

Maybe someone can explain it to me, but isn't that slide sort of just describing what makes solving problems hard in general? That there are many more decisions which put you on an inevitable path of failure?

"Probability e that any produced [choice] takes us outside the set of correct answers .. probability that answer of length n is correct: P(correct) = (1-e)^{n}"

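To get a feel for how fast that quoted formula decays, here is a rough sketch. It assumes, as the slide's formula does, that every choice fails independently with the same probability e. The function name `p_correct` and the values of e and n are purely illustrative, not taken from the slide.

```python
def p_correct(e: float, n: int) -> float:
    """P(correct) = (1 - e)^n, assuming n independent choices with per-choice error e."""
    return (1.0 - e) ** n


# Illustrative values only: a 1% per-choice error rate over answers of growing length.
for n in (10, 100, 1000):
    print(f"n={n:5d}  P(correct)={p_correct(0.01, n):.6f}")
```

Under those assumptions, even a small per-choice error rate compounds quickly as the answer gets longer, which is the point the slide seems to be making.
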
I think he's focusing on the distinction between facts and output for humans and drawing a parallel to LLMs.

If I ask you something that you know the answer to, the words you use and the fact itself are distinct entities. You're just giving me a presentation layer for fact #74719.

But LLMs lack any comparable pool to draw from, and so their words and their answer are essentially the same thing.

The routing decision that an MoE model makes increases its chances of success by constraining its future paths.