Comment by datsci_est_2015

5 days ago

So why then do we stop training LLMs and keep them stored at a specific state? Is it perhaps because the results become terrible and LLMs have a delicate optimal state for general use? This sounds like an even worse case for a model of intelligence.

4 comments

stavros 5 days ago

Nope, it's not that, but it's nice of you to offer a straw man. Makes the argument flow better.

datsci_est_2015 5 days ago

Not entirely a straw man. What is the purpose of storing and retrieving LLMs at a fixed state if not to guarantee a specific performance? Wouldn’t a strong model of intelligence be capable of, to extend your analogy, running without having its hippocampus lobotomized?

Given the precariousness of managing LLM context windows, I don’t think it’s particularly unfair to assume that LLMs that learn without limit become very unstable.

To steelman, if it’s possible, it may be prohibitively expensive. But somehow I doubt it’s possible.

stavros 5 days ago

It is, indeed, prohibitively expensive. But it's not impossible. The proof is in the fact that you can fine-tune LLMs: fine-tuning is just more training that starts from the stored weights.