Comment by keeda

2 hours ago

I hadn't looked at that study you selected, but yeah, the methodology conflicts with the abstract. (Also, it low-key seems to be an ad for "Flexi 2.0.") It does seem to be a shady paper, with a small N and in a journal of questionable repute.

That said, there are 80+ other studies listed in the meta-study, which is pretty frank about its limitations. (Note the snippets about positive biases in the conclusion.) It is going more for quantity over quality and is transparent about the statistical findings of each one (or lack thereof; see the count of "Not reported"s). All these references have a myriad of results, but across the spectrum, from well-designed studies at reputable venues to the other end, they follow the same themes, so I don't think this can be dismissed that easily.

But if you want, here's more research (some of which I linked in a sibling comment): https://maxmynter.substack.com/p/learn-to-code-with-llms-i-r...

runarberg 14 minutes ago

This was the only study from the meta-analysis that I read, and I picked it because it made the strongest claim out of all of them.

This is in the opening of the results section in the meta-analysis:

> In the final screening phase, a rigorous full-text analysis evaluated the methodological robustness and empirical validity of the remaining studies. [...] The final corpus comprised 88 studies that demonstrated robust empirical evidence for LLM applications in educational contexts.

The inclusion of the study I read does not give me confidence that this statement is true. And the fact that they reach their conclusion by simply tallying up the positive vs. negative studies makes me conclude that this meta-analysis is practically useless. They do admit this in the conclusion (which is probably why it passed peer review, assuming the peer reviewer didn't read the same citation I did; if they had, I am 100% certain they would have asked for it to be excluded). But that pretty much just leaves us with nothing. We are exactly where we started: no evidence that LLMs help students beyond traditional methods.

Now I am not gonna read that Anthropic study. It reminds me of cigarette companies finding the health benefits of cigarettes. That leaves that excellent 3-study review. In their first study they found that LLMs have negative effects on students (in line with the first link you showed me). In the second study they found no effect. And in the third study they found a mixed (nuanced) effect, where using LLMs as a tutor helped students in one aspect but had negative effects on others. This is by far the best study you have presented me, but it still does not change my opinion. There is little evidence that LLMs (even when used as a tutor) help people learn better than traditional methods.

What makes me even more against this sentiment is this quote from the conclusion of the 3-study review paper:

> Our results suggest that students prefer to use LLMs to substitute rather than complement learning activities.

So on their own, students are more likely to use LLMs in a way that is harmful to their learning. I would expect similar behavior from junior developers.