Comment by chickensong · 3 hours ago

Yes. The models may have started from indiscriminate scraping, but people are undoubtedly working on refining the training data. Combined with improvements in overall model capabilities, I suspect code quality will continue to go up.

What you're suggesting is a negative flywheel where quality spirals downward, but I'm hoping it becomes a positive loop that raises the quality floor. We had plenty of slop before LLMs, and not all LLM output is slop. Time will tell, but I expect LLMs to keep improving at coding and to push overall quality higher.