Comment by monkeydust
4 days ago
Why isn't this talked about more? I'm not a developer, but I work very closely with many. They're all on a spectrum from zero interest in this technology to actively using it to write code (inversely correlated with seniority, from my sample set). There's very little talk about using it for reviews/checks; perhaps that needs to happen passively on commit.
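A minimal sketch of what "passively on commit" could look like, assuming a local git pre-commit hook and the OpenAI Python client; the model name, prompt, and diff-size cap are placeholders, not a recommendation:

```python
#!/usr/bin/env python3
# Hypothetical .git/hooks/pre-commit: ask an LLM to review the staged diff
# and print its comments as advisory output. It always exits 0, so the
# review stays "passive" and never blocks the commit.
# Assumptions: the openai package is installed and OPENAI_API_KEY is set;
# the model name below is a placeholder.
import subprocess
import sys

def staged_diff() -> str:
    # Only review what is actually about to be committed.
    return subprocess.run(
        ["git", "diff", "--cached", "--unified=3"],
        capture_output=True, text=True, check=True,
    ).stdout

def main() -> None:
    diff = staged_diff()
    if not diff.strip():
        return  # nothing staged, nothing to review
    try:
        from openai import OpenAI
        client = OpenAI()
        response = client.chat.completions.create(
            model="gpt-4o-mini",  # placeholder model name
            messages=[
                {"role": "system",
                 "content": "Review this diff. Flag likely bugs or risky changes only; skip style nits."},
                {"role": "user", "content": diff[:20000]},  # crude cap for very large diffs
            ],
        )
        print("--- advisory LLM review (non-blocking) ---", file=sys.stderr)
        print(response.choices[0].message.content, file=sys.stderr)
    except Exception as exc:
        # Never let the review itself break the commit.
        print(f"LLM review skipped: {exc}", file=sys.stderr)

if __name__ == "__main__":
    main()  # falls through to exit code 0 either way
```

The same idea could run as a CI step on push instead of a local hook, which keeps it out of the developer's inner loop entirely.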
The main issue with LLMs is that they can't "judge" contributions correctly. Their reviews are nitpicky about things that don't matter and often miss big issues that a human familiar with the codebase would recognise. In the end it's almost just noise.
That's why everyone is moving to the agent thing. Even if the LLM makes a bunch of mistakes, you still have a human doing the decision-making, so you get some determinism.
My workplace has been adding more and more AI review bots. The feedback the AI has given me has been something like 0 for 10; it's just wasted my time. I see where it's coming from, and it's not utter nonsense, but it just doesn't understand the nuance or why something is logically correct.
That said, there have been some reports where the AI flagged something that was ignored and later turned into an outage.
So... I don't know. Is it worth wading through 10 bad reviews if 1 good one prevents a bad bug? Maybe. I do hope the ratio gets better, though.
So far, it seems pretty bad at code review. You'd get more mileage by configuring a linter.