Comment by gbalduzzi
7 hours ago
Aside from LLM architecture, which is already a complex issue, another problem is that the training data is unstructured text.
An LLM able to structurally separate context from instructions would logically need training data in which the two are already separated, and we don't have it.
Moreover, while an equally powerful LLM architecture that solves this may exist, there is no guarantee at all that we will be able to come up with it in a reasonable timeframe.
Without some signals moving in that direction, the most pragmatic and realistic way of looking at the problem is to assume it will not be solved in the near future.
Thanks, I appreciate the thoughtful reply.
I agree this doesn't mean we shouldn't try to address limitations with the current architecture. I just mean that I expect the root cause to be solved eventually if we ever really want to take steps towards AGI.
Regarding signals moving in that direction, here's a paper you might enjoy: https://arxiv.org/abs/2503.21937