Comment by LoganDark
10 hours ago
Most models have no persistent state or output separate from their input. They consume a stream of tokens and output a probability distribution over the next token, and that distribution is always exactly the same for a given stream of tokens. There is no internal state, thoughts, mood, etc., only prediction based on the input. "Memory" is usually just text that the harness injects into the context, typically updated through a tool call from the model.
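To make that concrete, here is a rough Python sketch of the idea, not any real model or framework. Everything in it is made up for illustration: `next_token_distribution` is a toy stand-in for a model (it just hashes the tokens), and `Harness`, `remember`, and `build_context` are hypothetical names. The only point is that the "model" is a pure function of its input, and the "memory" lives entirely in the harness.

```python
import hashlib

# Toy vocabulary; a real model would have tens of thousands of tokens.
VOCAB = ["yes", "no", "maybe", "<eos>"]


def next_token_distribution(tokens):
    """Toy stand-in for a model: a pure function of the token stream.

    It keeps no state between calls, so the same tokens always
    produce the same distribution.
    """
    digest = hashlib.sha256("\x1f".join(tokens).encode("utf-8")).digest()
    weights = [digest[i] + 1 for i in range(len(VOCAB))]  # all positive
    total = sum(weights)
    return {tok: w / total for tok, w in zip(VOCAB, weights)}


class Harness:
    """Hypothetical harness: the 'memory' is stored here, not in the model."""

    def __init__(self):
        self.memory = []

    def remember(self, note):
        # In a real system this would typically be triggered by a tool call.
        self.memory.append(note)

    def build_context(self, user_tokens):
        # Memory is just extra tokens placed in front of the user's input.
        return [f"[memory] {m}" for m in self.memory] + list(user_tokens)

    def predict(self, user_tokens):
        return next_token_distribution(self.build_context(user_tokens))


prompt = ["is", "it", "raining", "?"]
harness = Harness()

before = harness.predict(prompt)
same_again = harness.predict(prompt)   # identical input, identical output

harness.remember("user lives in Seattle")
after = harness.predict(prompt)        # the model now sees a different input
```

Here `before` and `same_again` are equal because the input is identical. After `remember`, the model function itself has not changed at all; it is simply being handed a longer token stream.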
I'm sure there are research prototypes that work differently from this but I haven't seen any enter the mainstream yet.
Also, diffusion language models have a different evaluation order, but I don't think they really have internal thoughts or feelings either, since they don't seem to have any hidden state that would encode anything like that.