Comment by robjellinghaus

7 days ago

In my experience with Claude Code and Sonnet, it is absolutely possible to have architectural and design-oriented conversations about the work, at an entirely different and higher level than using a (formerly) high-level programming language. I have been able to learn new systems and frameworks far faster with Claude than with any previous system I have used. It definitely does require close attention to detect mistakes it does not realize it is making, but that is where the skill comes in. I find it being right 80% of the time and wrong 20% of the time to be a hugely acceptable tradeoff, when it allows me to go radically faster because it can do that 80% much quicker than I could. Especially when it comes to learning new code bases and exploring new repos I have cloned -- it can read code superhumanly quickly and explain it to me in depth.

It is certainly a hugely different style of interaction, but it helps to think of it as a conversation, or more precisely, a series of individual small targeted specific conversations, each aimed at researching a specific issue or solving a specific problem.

Indeed, I successfully use LLMs for research, and they're an improvement because old-school search isn't very reliable either.

But as to the 80-20 tradeoff on other tasks, the problem isn't that the tool is wrong 20% of the time, but that it's not trustworthy 100% of the time. I have to check the work. Maybe that's still valuable, but just how valuable that is depends on many factors, some of which are very domain-dependent and others are completely subjective. We're talking about replacing one style with another that is much better in some respects and much worse in others. If, on the whole, it was better in almost all cases, that would be one thing (and make the investment safer), but reports suggest it isn't.

I've yet to try an LLM to learn a new codebase, and I have no doubt it will help a lot, but while that is undoubtedly a very expensive task, it's also not a very frequent one. It could maybe save me a week per year, amortised. That's not nothing (and I will certainly give it a try next time I need to learn a new codebase), but it's also not a game-changer.