Comment by wat10000
19 hours ago
This is where the much-maligned "they're just predicting the next token" perspective is handy. To figure out how the LLM will respond to X, think about what usually comes after X in the training data. This is why fake offers of payment can enhance performance (requests that include payment are typically followed by better results), why you'd expect it to try to escape (descriptions of entities locked in boxes tend to be followed by stories about them escaping), and why "what went wrong?" would be followed by apologies.
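To make that concrete, here's a minimal sketch (assuming PyTorch, the Hugging Face transformers library, and GPT-2 as a convenient stand-in model) that prints a model's top candidates for the next token after a prompt. The distribution it exposes is literally the "what usually comes after X" that training instills:

```python
# Minimal sketch: inspect a small LM's next-token distribution for a prompt.
# Assumes `pip install torch transformers`; GPT-2 is just a stand-in.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("gpt2")
model = AutoModelForCausalLM.from_pretrained("gpt2")
model.eval()

prompt = "I apologize for the error. What went wrong was"
inputs = tokenizer(prompt, return_tensors="pt")
with torch.no_grad():
    logits = model(**inputs).logits  # shape: [batch, seq_len, vocab]

# Softmax over the last position gives the probability of each candidate next token.
next_probs = torch.softmax(logits[0, -1], dim=-1)
top = torch.topk(next_probs, k=5)
for p, idx in zip(top.values, top.indices):
    print(f"{tokenizer.decode(idx)!r:>12}  p={p.item():.3f}")
```

Swap in different prompts, with or without the fake offer of payment, and watch how the distribution shifts.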
Yeah. "It's just fancy autocomplete" is excessively reductionist to be a full model, but there's enough truth in it that it should be part of your model.
There is code layered on top of the LLM, so "stochastic parrot" isn't entirely accurate. I'm not sure what problems people have with Gary Marcus, but a recent article of his was interesting. My amateur takeaway: old-style AI is being used to enhance LLMs.
"How o3 and Grok 4 Accidentally Vindicated Neurosymbolic AI Neurosymbolic AI is quietly winning. Here’s what that means – and why it took so long."
https://garymarcus.substack.com/p/how-o3-and-grok-4-accident...
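The shape of the idea is simple enough to sketch: the LLM proposes, and a symbolic layer checks before the answer is accepted. In this toy version the symbolic half is Python's ast-based arithmetic evaluator, and llm_propose() is a hypothetical stand-in for a real model call, not anything from the article:

```python
import ast
import operator

# Old-style symbolic machinery: a safe evaluator for + - * / arithmetic.
OPS = {ast.Add: operator.add, ast.Sub: operator.sub,
       ast.Mult: operator.mul, ast.Div: operator.truediv}

def eval_expr(node):
    if isinstance(node, ast.Expression):
        return eval_expr(node.body)
    if isinstance(node, ast.Constant) and isinstance(node.value, (int, float)):
        return node.value
    if isinstance(node, ast.BinOp) and type(node.op) in OPS:
        return OPS[type(node.op)](eval_expr(node.left), eval_expr(node.right))
    raise ValueError("unsupported expression")

def llm_propose(question):
    # Hypothetical stand-in: a real system would sample this from the model.
    return "17 * 23 = 401"  # plausible-looking but wrong

expr, _, claimed = llm_propose("What is 17 * 23?").partition("=")
truth = eval_expr(ast.parse(expr.strip(), mode="eval"))
if abs(truth - float(claimed)) > 1e-9:
    print(f"symbolic check failed: {expr.strip()} = {truth}, not {claimed.strip()}")
```

The checker is trivial here, but the division of labor is the point: the LLM guesses, the old-style machinery verifies.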