Comment by xpe
2 years ago
ChatGPT is a hybrid system; it isn't "just" an LLM any longer. What people associate with "LLM" is fluid and changes over time.
So it is essential to clarify the architecture when making claims about capabilities.
I'll start simple: plain sequence-to-sequence feed-forward NN models are not Turing complete. A fixed-depth network performs a bounded number of computation steps per output, so it cannot do full reasoning, which requires chaining an arbitrary, input-dependent number of steps.
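To make that concrete, here's a toy sketch (all names and functions are hypothetical, just an illustration of the bounded-vs-unbounded distinction, not any real model):

```python
import numpy as np

# A plain feed-forward model applies a *fixed* number of layers per output,
# so the amount of computation is bounded before it ever sees the input.
def fixed_depth_model(x, weights):
    h = x
    for W in weights:          # len(weights) is fixed at build time
        h = np.tanh(h @ W)     # constant number of steps per input
    return h

# "Full reasoning" in the Turing-complete sense needs chains of steps whose
# length depends on the problem instance, i.e. an unbounded, data-dependent loop:
def arbitrary_chaining(x, step, done):
    h = x
    while not done(h):         # iteration count depends on the input
        h = step(h)
    return h
```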
cGPT is exactly "just" an LLM, though. A sparse MoE architecture is not an ensemble of experts.
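Rough sketch of the distinction (toy code, assumed shapes, not any real model's implementation): a sparse MoE layer is one model whose router activates a few expert MLPs per token inside a single forward pass, whereas an ensemble runs several independent models and combines their outputs.

```python
import numpy as np

def moe_layer(x, router_w, expert_ws, top_k=2):
    """x: (d,) token vector; router_w: (d, n_experts); expert_ws: list of (d, d)."""
    scores = x @ router_w                      # router scores one vector of logits
    top = np.argsort(scores)[-top_k:]          # only the top-k experts are run
    gates = np.exp(scores[top]) / np.exp(scores[top]).sum()
    # weighted sum of the selected experts' outputs, all within one model
    return sum(g * np.tanh(x @ expert_ws[i]) for g, i in zip(gates, top))

def ensemble(x, models):
    # an ensemble, by contrast, runs every independent model and averages
    return np.mean([m(x) for m in models], axis=0)
```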