
Comment by buyucu

3 months ago

Yes.

Before LLMs we had n-gram language models. Many tasks like speech recognition worked as beam search in the graph defined by the n-gram language model. You could easily get huge accuracy gains simply by pruning your beam less.
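
As a minimal illustration (my sketch, not part of the original comment), here is beam search over a toy bigram language model. The bigram table and vocabulary are hypothetical stand-ins; the point is that `beam_width` is the knob that trades latency for accuracy, exactly as pruning less did for n-gram decoders:

```python
import math
from typing import Dict, List, Tuple

# Hypothetical bigram log-probabilities P(next | prev), just for illustration.
bigram_logprob: Dict[Tuple[str, str], float] = {
    ("<s>", "the"): math.log(0.55), ("<s>", "a"): math.log(0.45),
    ("the", "cat"): math.log(0.4),  ("the", "dog"): math.log(0.6),
    ("a", "cat"): math.log(0.9),    ("a", "dog"): math.log(0.1),
    ("cat", "</s>"): math.log(1.0), ("dog", "</s>"): math.log(1.0),
}
vocab = ["the", "a", "cat", "dog", "</s>"]

def beam_search(beam_width: int, max_len: int = 5) -> List[Tuple[List[str], float]]:
    # Each hypothesis is (token sequence, cumulative log-probability).
    beams: List[Tuple[List[str], float]] = [(["<s>"], 0.0)]
    for _ in range(max_len):
        candidates = []
        for seq, score in beams:
            if seq[-1] == "</s>":           # finished hypotheses carry over unchanged
                candidates.append((seq, score))
                continue
            for tok in vocab:               # expand by every possible next token
                lp = bigram_logprob.get((seq[-1], tok))
                if lp is not None:
                    candidates.append((seq + [tok], score + lp))
        # Pruning step: keep only the best `beam_width` hypotheses.
        # A larger beam means less pruning: more work per step, fewer search errors.
        beams = sorted(candidates, key=lambda c: c[1], reverse=True)[:beam_width]
    return beams

print(beam_search(beam_width=1))   # greedy: fast, here it misses the best sequence
print(beam_search(beam_width=4))   # wider beam: slower, recovers "a cat </s>"
```

With `beam_width=1` the decoder commits to "the" early and ends up with a worse full sequence; widening the beam costs more expansions per step but finds the higher-probability path.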

s1 reminds me of this. You can always trade latency for accuracy. Given that these LLMs are much more complex than good old n-grams, we're just discovering how to make this trade-off.
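
To make the same trade-off concrete on the LLM side, here is a hedged sketch of best-of-n sampling (my illustration, not s1's actual method, which works by forcing a thinking-token budget): sample more candidates, pay more latency, keep the one a scorer likes best. `generate` and `score` are hypothetical stand-ins for a model call and a verifier.

```python
import random
from typing import Callable, List

def best_of_n(prompt: str,
              generate: Callable[[str], str],
              score: Callable[[str, str], float],
              n: int) -> str:
    # n is the latency/accuracy knob, like beam width above: each extra
    # sample costs another model call but raises the odds of a good answer.
    candidates: List[str] = [generate(prompt) for _ in range(n)]
    return max(candidates, key=lambda c: score(prompt, c))

# Toy stand-ins so the sketch runs end to end.
def toy_generate(prompt: str) -> str:
    return f"answer-{random.randint(0, 9)}"

def toy_score(prompt: str, answer: str) -> float:
    # Pretend "answer-7" is the correct output for this prompt.
    return -abs(int(answer.split("-")[1]) - 7)

print(best_of_n("2+5=?", toy_generate, toy_score, n=1))   # fast, often wrong
print(best_of_n("2+5=?", toy_generate, toy_score, n=16))  # slower, usually "answer-7"
```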

Let me carry that concept of “learning to do this trade” further: it really is a new trade.

I don’t believe computer science has the algorithms to handle this new paradigm. Everything was built around sequential, deterministic outputs and clever ways to compute them fast. Most of that machinery is of little use here right now. We need new thinkers on how not to think sequentially, and how not to think about the universe in such a small way.

Verifying input/output pairs is the old way. We need a different way of understanding these systems going forward.