The "learning through making" approach is really good. When I've explained LLMs to non-technical people, the breakthrough moment is usually when they see temperature in action. High temperature = creative but chaotic, low temperature = predictable but boring. You can't just describe that.
What I'd add: the lesson about hallucinations should come early, not just in the RAG module. Kids (and adults) need to internalize "confident-sounding doesn't mean correct" before they get too comfortable. The gap between fluency and accuracy is the thing that trips everyone up.
I couldn’t find it easily, what age range is this intended for? The images make it seem elementary-school-ish, but I’m not sure if elementary school kids have the foundations for interpreting scatterplots, let alone scatterplots with logarithmic axes. I’ve been out of education for a while though, so maybe I’m misremembering.
That lesson plan is a good practical start. I think it misses the very big picture of what we've created and the awesomeness of it.
The simplest explanation I can give is we have a machine that you feed it some text from the internet, and you turn the crank. Most machines we've had previously would stop getting better at predicting the next word after a few thousand cranks. You can crank the crank on an LLM 10^20 times and it will still get smarter. It will get so smart so quickly that no human can fit in their mind all the complexity of what it's built inside of it except through indirect methods but we know that it's getting smarter through benchmarks and some reasonably simple proofs that it can simulate any electronic circuit. We only understand how it works by the smallest increment of its intelligence improvement and through induction understanding that that should lead to further improvements in its intelligence.
This is excellent intuition to have for how LLMs work, as well as understanding the implications of living in an LLM-powered world. Just as useful to adults as children.
Some of the takeaways feel over-reliant on implementation details that don’t capture intent. E.g. something like “the LLM is just trying to predict the next word” sort of has the explanatory power of “your computer works because it’s just using binary”—like yeah, sure, practically speaking yes—but that’s just the most efficient way to lay out their respective architectures and could conceivably be changed from under you.
I wonder if something like neural style transfer would work as a prelude. It helps me. Don’t know how you’d introduce it, but with NST you have two objectives—content loss and style loss—and can see pretty quickly visually how the model balances between the two and where it fails.
The bigger picture here is that people came up with a bunch of desirable properties they wanted to see and a way to automatically fulfill some of them some of the time by looking at lots of examples, and it’s why you get a text box that can write like Shakespeare but can’t tell you whether your grandma will be okay after she went to the hospital an hour ago.
Complicated-enough LLMs also are aboslutely doing a lot more than "just trying to predict the next word", as Anthropic's papers investigating the internals of trained models show - there's a lot more decision-making going on than that.
> Complicated-enough LLMs also are aboslutely doing a lot more than "just trying to predict the next word", as Anthropic's papers investigating the internals of trained models show - there's a lot more decision-making going on than that.
Are there newer changes that are actually doing prediction of tokens out of order or such, or are this a case of immense internal model state tracking but still using it to drive the prediction of a next token, one at a time?
(Wrapped in a variety of tooling/prompts/meta-prompts to further shape what sorts of paragraphs are produced compared to ye olden days of the gpt3 chat completion api.)
The "learning through making" approach is really good. When I've explained LLMs to non-technical people, the breakthrough moment is usually when they see temperature in action. High temperature = creative but chaotic, low temperature = predictable but boring. You can't just describe that.
What I'd add: the lesson about hallucinations should come early, not just in the RAG module. Kids (and adults) need to internalize "confident-sounding doesn't mean correct" before they get too comfortable. The gap between fluency and accuracy is the thing that trips everyone up.
I think this is a backwards approach, especially for children.
Gen AI is magical: it makes stuff appear out of thin air!
And it's limited: everything it makes kinda looks the same.
And it's forgetful: it doesn't remember what it just did.
And it's dangerous! It can make things that never happened.
Starting with theory might be the simplest way to explain, but it leaves out the hook. Why should they care?
I couldn’t find it easily; what age range is this intended for? The images make it seem elementary-school-ish, but I’m not sure if elementary school kids have the foundations for interpreting scatterplots, let alone scatterplots with logarithmic axes. I’ve been out of education for a while though, so maybe I’m misremembering.
Logarithms are high school.
That lesson plan is a good practical start. I think it misses the very big picture of what we've created and the awesomeness of it.
The simplest explanation I can give is that we have a machine: you feed it some text from the internet and you turn the crank. Most machines we've had previously would stop getting better at predicting the next word after a few thousand cranks. You can turn the crank on an LLM 10^20 times and it will still get smarter. It gets so smart so quickly that no human can hold in their mind the full complexity of what it has built inside itself, except through indirect methods; but we know it's getting smarter through benchmarks, and there are some reasonably simple proofs that it can simulate any electronic circuit. We only understand how it works one small increment of intelligence improvement at a time, and by induction we understand that this should lead to further improvements in its intelligence.
> awesomeness
They should learn to think for themselves about the whole picture, not learn about 'awesomeness'.
Which module introduces students to some of the ethical and environmental aspects of using GenAI?
This is excellent intuition to have for how LLMs work, as well as understanding the implications of living in an LLM-powered world. Just as useful to adults as children.
I wonder how utterly broken the next generation will be, having to grow up with all this stuff.
We millennials didn’t do so great with native access to the internet; zoomers got wrecked growing up with social media... It’s rough.
This is great and has lots of practical stuff.
Some of the takeaways feel over-reliant on implementation details that don’t capture intent. E.g. something like “the LLM is just trying to predict the next word” sort of has the explanatory power of “your computer works because it’s just using binary”—like yeah, sure, practically speaking yes—but that’s just the most efficient way to lay out their respective architectures and could conceivably be changed out from under you.
I wonder if something like neural style transfer would work as a prelude. It helps me. Don’t know how you’d introduce it, but with NST you have two objectives—content loss and style loss—and can see pretty quickly visually how the model balances between the two and where it fails.
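For anyone who hasn't seen it, the objective is basically a weighted sum of the two losses. A minimal sketch, assuming feature maps have already been extracted per layer (the layer names, weights, and shapes here are made up; real implementations pull features from a pretrained CNN such as VGG):

    import torch

    def gram_matrix(features):
        # features: (channels, height * width); the Gram matrix captures "style"
        # as correlations between feature channels.
        return features @ features.T / features.numel()

    def nst_loss(gen, content, style, alpha=1.0, beta=1e3):
        # gen / content / style: dicts of per-layer feature maps (hypothetical keys).
        content_loss = torch.mean((gen["deep"] - content["deep"]) ** 2)
        style_loss = sum(
            torch.mean((gram_matrix(gen[layer]) - gram_matrix(style[layer])) ** 2)
            for layer in ("shallow", "mid", "deep")
        )
        # Raising alpha keeps the content recognizable; raising beta pushes the
        # result toward the style image's textures.
        return alpha * content_loss + beta * style_loss

Rendering the result at a few different alpha/beta ratios gives you exactly that visual of the model trading off two objectives, and of where it fails.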
The bigger picture here is that people came up with a bunch of desirable properties they wanted to see and a way to automatically fulfill some of them some of the time by looking at lots of examples, and it’s why you get a text box that can write like Shakespeare but can’t tell you whether your grandma will be okay after she went to the hospital an hour ago.
Complicated-enough LLMs are also absolutely doing a lot more than "just trying to predict the next word", as Anthropic's papers investigating the internals of trained models show - there's a lot more decision-making going on than that.
> Complicated-enough LLMs also are aboslutely doing a lot more than "just trying to predict the next word", as Anthropic's papers investigating the internals of trained models show - there's a lot more decision-making going on than that.
Are there newer changes that are actually doing prediction of tokens out of order or such, or is this a case of immense internal model state tracking that is still used to drive the prediction of a next token, one at a time?
(Wrapped in a variety of tooling/prompts/meta-prompts to further shape what sorts of paragraphs are produced, compared to ye olden days of the GPT-3 chat completion API.)
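For what it's worth, my mental model of the standard decoder-only case is a loop like this; a sketch only, with a toy stand-in model, purely to pin down what "one at a time" means:

    import random

    def toy_model(tokens):
        # Stand-in for a real model: returns fake logits over a 5-token
        # vocabulary, influenced only by the last token.
        random.seed(tokens[-1])
        return [random.random() for _ in range(5)]

    def generate(model, prompt_tokens, max_new_tokens=10):
        tokens = list(prompt_tokens)
        for _ in range(max_new_tokens):
            logits = model(tokens)          # all the internal state tracking lives in here
            next_token = max(range(len(logits)), key=logits.__getitem__)  # greedy pick
            tokens.append(next_token)       # fed back in to drive the next step
        return tokens

    print(generate(toy_model, [1, 2]))

However elaborate the internal bookkeeping gets, it still surfaces as one next-token pick per step, and what I'm asking is whether newer approaches actually change that outer loop.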
This is really well put together, but it should probably include more about ethics, hidden bias, etc.