Comment by Quarrel
17 hours ago
I really like this as a suggestion, but getting open-source code that isn't in an LLM's training data is a challenge.
Then, with each model having a different training cutoff, you end up with no useful basis for comparison when deciding whether new models are improving the situation. I don't doubt they are; I'm just not sure this is a way to show it.
Yes, but perhaps being trained on a piece of code has less impact on the ability to find bugs in that code than you'd expect. You could run a set of experiments to find out, and that would be interesting in itself.