Comment by peyton
8 hours ago
This is great and has lots of practical stuff.
Some of the takeaways feel over-reliant on implementation details that don’t capture intent. E.g. something like “the LLM is just trying to predict the next word” sort of has the explanatory power of “your computer works because it’s just using binary”—like yeah, sure, practically speaking yes—but that’s just the most efficient way to lay out their respective architectures and could conceivably be changed out from under you.
I wonder if something like neural style transfer would work as a prelude. It helps me. Don’t know how you’d introduce it, but with NST you have two objectives—content loss and style loss—and you can quickly see, visually, how the model balances the two and where it fails.
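To make the two-objective idea concrete, here is a minimal sketch of the kind of weighted loss NST optimizes. The function names, placeholder losses, and the alpha/beta weights are illustrative stand-ins, not any particular library's API; real NST compares CNN feature maps and their Gram matrices rather than raw pixels.

```python
# Minimal sketch of NST's two competing objectives (illustrative, not a real implementation).
import numpy as np

def content_loss(generated, content):
    # How far the generated image's features drift from the content image's.
    return np.mean((generated - content) ** 2)

def style_loss(generated, style):
    # How far the generated image's feature statistics drift from the style image's.
    # (Real NST compares Gram matrices of CNN feature maps; this is a stand-in.)
    return np.mean((generated.mean(axis=0) - style.mean(axis=0)) ** 2)

def total_loss(generated, content, style, alpha=1.0, beta=100.0):
    # The optimizer minimizes a weighted sum; alpha and beta set the trade-off.
    # Crank beta up and the content objective visibly starts to lose.
    return alpha * content_loss(generated, content) + beta * style_loss(generated, style)

# Toy usage with random arrays standing in for feature maps.
rng = np.random.default_rng(0)
gen, cont, sty = (rng.standard_normal((64, 64)) for _ in range(3))
print(total_loss(gen, cont, sty))
```

The point of the exercise is that there is no single "right answer" the model converges to, just a balance between objectives you can watch shift as the weights change.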
The bigger picture here is that people came up with a bunch of desirable properties they wanted to see and a way to automatically fulfill some of them some of the time by looking at lots of examples, and it’s why you get a text box that can write like Shakespeare but can’t tell you whether your grandma will be okay after she went to the hospital an hour ago.
Complicated-enough LLMs are also absolutely doing a lot more than "just trying to predict the next word", as Anthropic's papers investigating the internals of trained models show - there's a lot more decision-making going on than that.
> Complicated-enough LLMs are also absolutely doing a lot more than "just trying to predict the next word", as Anthropic's papers investigating the internals of trained models show - there's a lot more decision-making going on than that.
Are there newer changes that actually predict tokens out of order or the like, or is this a case of immense internal model state tracking that is still used to drive the prediction of a single next token at a time?
(Wrapped in a variety of tooling/prompts/meta-prompts to further shape what sorts of paragraphs are produced compared to ye olden days of the gpt3 chat completion api.)
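For what it's worth, here is a rough sketch of the "one token at a time" loop the question is about. The `next_token_logits` function is a hypothetical stand-in for a real model's forward pass, and the greedy decode is the simplest possible choice; the point is only that however much internal state tracking happens, the output at each step is a distribution over the single next token.

```python
# Sketch of a plain autoregressive decoding loop (illustrative stand-in, not a real model).
import numpy as np

def next_token_logits(token_ids, vocab_size=50):
    # Placeholder for a real LLM forward pass: whatever internal computation
    # occurs, what comes out is scores over possible *next* tokens.
    rng = np.random.default_rng(sum(token_ids))
    return rng.standard_normal(vocab_size)

def generate(prompt_ids, max_new_tokens=10):
    ids = list(prompt_ids)
    for _ in range(max_new_tokens):
        logits = next_token_logits(ids)      # all the state tracking feeds into this
        ids.append(int(np.argmax(logits)))   # greedy pick; sampling is the usual alternative
    return ids

print(generate([1, 2, 3]))
```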