
Comment by jedberg

6 days ago

If I were a professor, I would make my homework start the same way it does now -- here is a problem to solve.

But instead of asking for just working code, I would create a small wrapper around a popular AI and insist that the student use my wrapper to create the code. They must instruct the AI to fix any non-working code until it works. Then they have to tell my wrapper to submit the code to my annotator. Then they have to annotate every line of code, explaining why it is there and what it is doing.

Why my wrapper? So that I can prevent them from asking the AI to generate the comments, and so that I know they had to formulate the prompts themselves.
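
A rough sketch of what that wrapper could look like -- call_model is just a stand-in for whatever AI backend you'd actually wire it to, and all the names here are made up. The two things that matter are that every prompt gets logged (so I know the student wrote it) and that prompts asking for comments get refused:

    import json
    import re
    from datetime import datetime, timezone


    def call_model(prompt: str) -> str:
        """Placeholder for the real AI backend; wire this to your provider."""
        raise NotImplementedError


    class CourseWrapper:
        # Refuse prompts that try to offload the annotation work onto the AI.
        COMMENT_REQUESTS = re.compile(r"\b(comment|annotate|explain each line)\b", re.I)

        def __init__(self, student_id: str, log_path: str = "prompt_log.jsonl"):
            self.student_id = student_id
            self.log_path = log_path

        def ask(self, prompt: str) -> str:
            if self.COMMENT_REQUESTS.search(prompt):
                raise ValueError("Ask the AI for code, not for comments or annotations.")
            self._log("prompt", prompt)
            reply = call_model(prompt)
            self._log("reply", reply)
            return reply

        def submit(self, code: str) -> None:
            # Hand the final code to the instructor's annotator for the
            # student's own line-by-line notes.
            self._log("submission", code)
            print(f"Submitted {len(code.splitlines())} lines for annotation.")

        def _log(self, kind: str, text: str) -> None:
            entry = {
                "student": self.student_id,
                "kind": kind,
                "text": text,
                "at": datetime.now(timezone.utc).isoformat(),
            }
            with open(self.log_path, "a") as f:
                f.write(json.dumps(entry) + "\n")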

They will still be forced to understand the code.

Then double the number of problems, because with the AI they should be 2x as productive. :)

For introductory problems, the kind we use to get students to understand a concept for the first time, the AI would likely nail them (or nearly) on the first try. The students wouldn't have to fix any non-working code. And annotating the code likely doesn't serve the same pedagogical purpose as writing it yourself.

Students emerge from lectures with a bunch of vague, partly contradictory, partly incorrect ideas in their head. They generally aren't aware of this and think the lecture "made sense." Then they start the homework and find they must translate those vague ideas into extremely precise code so the computer can do it -- forcing them to realize they do not understand, and forcing them to make the vague understanding concrete.

If they ask an AI to write the code for them, they don't do that. Annotating has some value, but it does not give them the experience of seeing their vague understanding run headlong into reality.

I'd expect the result to be more like what happens when you show demonstrations to students in physics classes. The demonstration is supposed to illustrate some physics concept, but studies measuring whether that improves student understanding have found no effect: https://doi.org/10.1119/1.1707018

What works is asking students to predict the demonstration's result first, then showing them the demonstration. Then they realize whether their understanding is right or wrong, and can ask questions to correct it.

Post-hoc rationalizing an LLM's code is like post-hoc rationalizing a physics demo. It does not test the students' internal understanding in the same way as writing the code, or predicting the results of a demo.

> They will still be forced to understand the code.

But understanding is just one part of the learning process, isn't it? I assume everybody has had this feeling: the professor explains maths on the blackboard, and the student follows. The student "understands" all the steps: they make sense, and there's no urge to ask a question right now. Then the professor gives them a slightly different exercise and asks them to do the same, and the student is completely lost.

Learning is a loop: you need to accept it, get it into your memory (learn things by heart, even if it's just the vocabulary to express the concepts), understand it, then try to do it yourself. You realise that you missed many things in the process, and you start again from the beginning: learn new things by heart, understand more, try again.

  • That loop is still there. They have to get the AI to write the right code.

    And beyond that, do they really need to understand how it works? I never learned how to calculate logarithms by hand, but I know what they are for and I know when to punch the button on the calculator.

    I'll never be a top tier mathematician, but that's not my goal. My goal is to calculate things that require logs.

    If they can get the AI to make working code and explain why it works, do they need to know more than that, unless they want to be top in their field?

    • > If they can get the AI to make working code and explain why it works, do they need to know more than that, unless they want to be top in their field?

      Making working code is the easy part. Making maintainable code is a completely different story.

      And again, being able to explain why something works requires only superficial knowledge. This is precisely why bugs pass through code reviews: it's hard to spot a bug by reading code that looks like it should work.
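
      A made-up toy example of what I mean -- in a review this reads like a perfectly plausible helper, yet it's wrong:

        def average_gap(timestamps):
            """Average gap between consecutive timestamps, in seconds."""
            gaps = []
            for i in range(1, len(timestamps)):
                gaps.append(timestamps[i] - timestamps[i - 1])
            # Looks right at a glance, but the denominator should be len(gaps),
            # so the result is biased low for any list with two or more points.
            return sum(gaps) / len(timestamps)

      Nothing about it looks broken until you check it against a case computed by hand.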