Comment by tomohelix
12 hours ago
Technically, the models can already learn on the fly; it's just that the knowledge they can learn is limited to the context length. They cannot, to use the trendy word, "grok" it and internally adjust the weights in their neural networks yet.
To change this you would either need to let the model retrain itself every time it receives new information, or to have such a long context length that there is no effective difference. I suspect even meat models like our brains still struggle to do this effectively and need a long rest cycle (i.e. sleep) to handle it. So the problem is inherently more difficult to solve than just "thinking". We may even need an entirely new architecture, different from the neural network, to achieve this.
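To make the contrast concrete, here is a minimal toy sketch (PyTorch, illustrative names and a fake one-layer "model" only, not how any real system is trained): option 1 keeps the new information alive only in the prompt, while option 2 takes an actual gradient step so the fact persists in the weights even after the context is gone.

```python
import torch
import torch.nn as nn

# Toy stand-in for a language model: bag-of-token-ids -> next-token logits.
VOCAB = 100
model = nn.Linear(VOCAB, VOCAB)

def encode(tokens):
    """Crude bag-of-words encoding, standing in for a real tokenizer/context."""
    x = torch.zeros(VOCAB)
    x[torch.tensor(tokens)] = 1.0
    return x

# Option 1: in-context "learning" -- the new fact lives only in the prompt.
# The weights never change, and the fact is forgotten once it falls out of
# the context window.
prompt_tokens = [1, 2, 3]        # original prompt
new_fact_tokens = [42, 43]       # freshly received information
logits_in_context = model(encode(prompt_tokens + new_fact_tokens))

# Option 2: weight-update learning -- take a gradient step on the new fact,
# so it persists in the parameters even with an empty context. This is the
# "retrain itself every time" path described above.
optimizer = torch.optim.SGD(model.parameters(), lr=1e-2)
target = torch.tensor([44])      # token the model should now predict
loss = nn.functional.cross_entropy(
    model(encode(new_fact_tokens)).unsqueeze(0), target
)
loss.backward()
optimizer.step()
```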
> Technically, the models can already learn on the fly; it's just that the knowledge they can learn is limited to the context length.
Isn't that just improving the prompt fed to a non-learning model?
The only small problem is that these models are neither thinking nor understanding; I am not sure how this kind of wording is allowed when describing them.