Comment by stuaxo

5 hours ago

The quality issue doesn't seem unique to Python.

The versioning issue I've seen across libraries that version change in many languages.

I don't tend to hit Python 2 issues using LLMs with it, but I do hit library things (e.g. Pydantic likes to make changes between libraries - or loads of the libraries used a lot by AI companies).

I’ve found recent Claude to be much better in this regard. I think a lot rests on the quality of the harness and the work behind the scenes done to RAG up to date docs or search for docs proactively rather than guessing.

I also don’t have issues with quality of Python generated. It takes a bit of nudging to use list comps and generators rather than imperative forms but it tends to mimic code already in context. So if the codebase is ok, it does do better.