Comment by ashdksnndck
2 months ago
I think this is in the training data since they use commit data from repos, but I imagine code deletions are rarer than they should be in the real data as well.
2 months ago
Deleting code and cleaning it up is perhaps more an expression of seniority and personal preference. Maybe there should be the same kind of style transfer for code that you see with graphical generative AI: "rewrite this code path in the style of Donald Knuth."
I imagine there would be value in not just throwing all GitHub commits in as training data, but also rating their quality.