Comment by dilDDoS

4 days ago

I'm happy to finally see this take. I've been feeling pretty left out with everyone singing the praises of AI-assisted editors while I struggle to understand the hype. I've tried a few and it's never felt like an improvement to my workflow. At least for my team, the actual writing of code has never been the problem or bottleneck. Getting code reviewed by someone else in a timely manner has been a problem though, so we're considering AI code reviews to at least take some burden out of the process.

AI code reviews are the worst place to introduce AI, in my experience. They can find a few things quickly, but they can also send people down unnecessary paths or be easily persuaded by comments or even the slightest pushback from someone. They're fast to cave in and agree with any input.

It can also encourage laziness: If the AI reviewer didn't spot anything, it's easier to justify skimming the commit. Everyone says they won't do it, but it happens.

For anything AI related, having manual human review as the final step is key.

  • Agreed.

    LLMs are fundamentally text generators, not verifiers.

    They might spot some typos and stylistic discrepancies based on their corpus, but they do not reason. It’s just not what the basic building blocks of the architecture do.

    In my experience you need to do a lot of coaxing and setting up guardrails to keep them even roughly on track. (And maybe the LLM companies will build this into the products they sell, but it’s demonstrably not there today.)

    • > LLMs are fundamentally text generators, not verifiers.

      In reality they work quite well for text analysis, and for numeric analysis too (via tools). I've found them to be powerful tools for "linting" a codebase against adequately documented standards and architectural guidance, especially when given the use of type checkers, static analysis tools, etc.

  • That's a fantastic counterpoint. I've found AI reviewers to be useful on a first pass, at a small-pieces level. But I hear your opinion!

  • I find the summary that Copilot generates is more useful than the review comments most of the time. That said, I have seen it make some good catches. It’s a matter of expectations: the AI is not going to have hurt feelings if you reject all its suggestions, so I feel even more free to reject its feedback with the briefest of dismissals.

  • What about something like this?

    Link to the ticket. Hopefully your team cares enough to write good tickets.

    So if the problem is defined well in the ticket, do the code changes actually address it?

    Take a bug fix, for example: it can check the tests to see whether the PR covers the conditions that caused the bug, and check the code changes to see whether they fit the requirements.

    I think the goal with AI for creative work should be to make things more efficient, not necessarily to replace anyone. Whoever reviews the code can get up to speed fast. I’ve been on teams where people would review sections of the code they weren’t very familiar with.

    In this case if it saves them 30 minutes then great!

  • I agree and disagree. I think it's important to make it very visually clear that it is not really a PR review, but rather an advanced style checker. I think they can be very useful for assessing more rote/repetitive standards that are a bit beyond what standard linters/analysis can provide. Things like institutional standards, lessons learned, etc. But if it uses the normal PR pipeline rather than the checker pipeline, it gives the false impression that it is a real PR review, which it is not.

IMO, the AI bits are the least interesting parts of Zed. I hardly use them. For me, Zed is a blazing fast, lightweight editor with a large community supporting plugins and themes and all that. It's not exactly Sublime Text, but to me it's the nearest spiritual successor while being fully GPL'ed Free Software.

I don't mind the AI stuff. It's been nice when I used it, but I have a different workflow for those things right now. But all the stuff besides AI? It's freaking great.

I found the OP comment amusing because Emacs, plus a JetBrains IDE when I need it, is exactly my setup. The only thing I've found AI to be consistently good for is spitting out boring boilerplate so I can do the fun parts myself.

I always hear this "writing code isn't the bottleneck" line used when talking about AI, as if there are a chosen few engineers who only work on completely new and abstract domains that require a PhD and 20 years of experience, which an LLM cannot fathom.

Yes, you're right, AI cannot be a senior engineer with you. It can take a lot of the grunt work away though, which is still part of the job for many devs at all skill levels. Or it's useful for technologies you're not as well versed in. Or it's simply an inertia breaker when you're not feeling very motivated to get to work.

Find what it's good for in your workflows and try it for that.

  • I feel like everyone praising AI is a webdev with extremely predictable problems that are almost entirely boilerplate.

    I've tried throwing LLMs at every part of the work I do and it's been entirely useless at everything beyond explaining new libraries or being a search engine. Any time it tries to write any code at all it's been entirely useless.

    But then I see so many praising all it can do and how much work they get done with their agents and I'm just left confused.

    • Yeah, the more boilerplate your code needs, the better AI works, and the more it saves you time by wasting less on boilerplate.

      My experience with AI tooling:

      - React/similar webdev where I "need" 1000 lines of boilerplate to do what jquery did in half a line 10 years ago: Perfect

      - AbstractEnterpriseJavaFactorySingletonFactoryClassBuilder: Very helpful

      - Powershell monstrosities where I "need" 1000 lines of Verb-Nouning to do what bash does in three lines: If you feed it a template that makes it stop hallucinating nonexisting Verb-Nouners, perfect

      - Abstract algorithmic problems in any language: Eh, okay

      - All the `foo,err=…;if err…` boilerplate in Golang: Decent (see the sketch at the end of this comment)

      - Actually writing well-optimized business logic in any of those contexts: Forget about it

      Since I spend 95% of my time writing tight business logic, it's mostly useless.
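
      To make the Golang point concrete for readers who don't write Go, here is a minimal, hypothetical sketch of the `foo, err := …; if err != nil { … }` pattern that repeats after nearly every call. The function names and the "config.txt" path are made up for illustration, not taken from any real project; this is just the kind of mechanical repetition an LLM autocompletes reliably.

      ```go
      // Hypothetical example of everyday Go error-handling boilerplate.
      // loadConfig, parseConfig, and "config.txt" are made-up names.
      package main

      import (
          "fmt"
          "os"
      )

      type config struct {
          raw string
      }

      func parseConfig(data []byte) (*config, error) {
          // Stand-in for real parsing logic.
          return &config{raw: string(data)}, nil
      }

      func loadConfig(path string) (*config, error) {
          // The same three-line check appears after almost every call.
          data, err := os.ReadFile(path)
          if err != nil {
              return nil, fmt.Errorf("read config: %w", err)
          }

          cfg, err := parseConfig(data)
          if err != nil {
              return nil, fmt.Errorf("parse config: %w", err)
          }

          return cfg, nil
      }

      func main() {
          cfg, err := loadConfig("config.txt")
          if err != nil {
              fmt.Fprintln(os.Stderr, err)
              os.Exit(1)
          }
          fmt.Println("loaded", len(cfg.raw), "bytes of config")
      }
      ```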

Highlighting code and having Cursor show the recommended changes and make them for me with one click is just a time saver over copying and pasting back and forth to an external chat window. I don’t find the autocomplete particularly useful, but the built-in chat is honestly a useful feature.

I'm the opposite. I held out this view for a long, long time. About two months ago, I gave Zed's agentic sidebar a try.

I'm blown away.

I'm a very senior engineer. I have extremely high standards. I know a lot of technologies top to bottom. And I have immediately found it insanely helpful.

There are a few hugely valuable use-cases for me. The first is writing tests. Agentic AI right now is shockingly good at figuring out what your code should be doing and writing tests that test the behavior, all the verbose and annoying edge cases, and even finding bugs in your implementation. It's goddamn near magic. That's not to say they're perfect: sometimes they get confused and assume your implementation is correct when the test doesn't pass. Sometimes they do misunderstand. But the overall improvement for me has been enormous. They also generally write good tests. Refactoring never breaks the tests they've written unless an actually-visible behavior change has happened.

Second is trying to figure out the answer to really thorny problems. I'm extremely good at doing this, but agentic AI has made me faster. It can prototype approaches that I want to try faster than I can, and we can see extremely quickly whether an approach works. I might not use the code it wrote, but the ability to rapidly give four or five alternatives a go versus the one or two I would personally have time for is massively helpful. I've even had them find approaches I never would have considered that ended up being my clear favorite. They're not always better than me at choosing which one to go with (I often ask for their summarized recommendations), but the sheer speed with which they get them done is a godsend.

Finding the source of tricky bugs is one more case that they excel in. I can do this work too, but again, they're faster. They'll write multiple tests with debugging output that leads to the answer in barely more time than it takes to just run the tests. A bug that might take me an hour to track down can take them five minutes. Even for a really hard one, I can set them on the task while I go make coffee or take the dog for a walk. They'll figure it out while I'm gone.

Lastly, when I have some spare time, I love asking them what areas of a code base could use some love and what are the biggest reward-to-effort ratio wins. They are great at finding those places and helping me constantly make things just a little bit better, one place at a time.

Overall, it's like having an extremely eager and prolific junior assistant with an encyclopedic brain. You have to give them guidance, you have to take some of their work with a grain of salt, but used correctly they're insanely productive. And as a bonus, unlike a real human, you don't ever have to feel guilty about throwing away their work if it doesn't make the grade.

  • > Agentic AI right now is shockingly good at figuring out what your code should be doing and writing tests that test the behavior, all the verbose and annoying edge cases,

    That's a red flag for me. Having a lot of tests usually means that your domain is fully known, so you can specify it fully with tests. But in a lot of settings, the domain is a bunch of business rules that product decides on the fly. So you need to be pragmatic and only write tests against valuable workflows, or you'll find yourself changing a line and having 100+ tests break.

    • If you can write tests fast enough, you can specify those business rules on the fly. The ideal case is that tests always reflect current business rules. That may often be infeasible because of the speed at which those rules change, but I’ve had a similar experience of AI just getting tests right, and even better, getting tests verifiably right, because the tests are so easy to read through myself. That makes it way easier to change tests rapidly.

      This also ignores that, ideally, business logic is implemented as a combination of smaller, stabler components that can be independently unit tested.
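
      As a rough illustration of that last point (the package, function, and numbers below are hypothetical, not from any real codebase): a small, stable component gets its own unit test, and the fast-changing business rules are composed on top of it rather than baked into it.

      ```go
      // pricing.go: a small, stable building block. The made-up rule here,
      // "a discount is a whole-number percentage off a price in cents",
      // is unlikely to churn even when the workflows on top of it do.
      package pricing

      // ApplyDiscountCents returns priceCents reduced by percent%, rounding down.
      func ApplyDiscountCents(priceCents, percent int64) int64 {
          return priceCents * (100 - percent) / 100
      }
      ```

      ```go
      // pricing_test.go: the component is tested on its own, so a change to a
      // product-level workflow doesn't touch this test.
      package pricing

      import "testing"

      func TestApplyDiscountCents(t *testing.T) {
          got := ApplyDiscountCents(20000, 10)
          if got != 18000 {
              t.Fatalf("ApplyDiscountCents(20000, 10) = %d, want 18000", got)
          }
      }
      ```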

    • This is a red flag for me. Any given user-facing software project with changing requirements is still built on top of relatively stable, consistent lower layers. You might change the business rules on top of those layers, but you need generally reasonable and stable internal APIs.

      Not having this is very indicative of a spaghetti soup architecture. Hard pass.

  • What languages and contexts are you getting these good results for?

AI is solid for kicking off learning a language or framework you've never touched before.

But in my day to day I'm just writing pure Go, highly concurrent and performance-sensitive distributed systems, and AI is just so wrong on everything that actually matters that I have stopped using it.

  • But so is a good book. And it costs way less. Even though searching may be quicker, having a good digest of a feature is worth the half hour I can spend browsing a chapter. It’s directly picking an expert’s brain. Then you take notes, compare with what you found online and the updated documentation, and soon you develop a real understanding of the language/tool abstractions.

    • In an ideal world, yeah. But most software instructional docs and books are hot garbage, out of date, incorrect, incomplete, and far too shallow.

  • I’m using Go to build a high-performance data migration pipeline for a big migration we’re about to do. I haven’t touched Go in about 10 years, so AI was helpful getting started.

    But now that I’ve been using it for a while, it’s absolutely terrible with anything that deals with concurrency. It’s so bad that I’ve stopped using it for any code generation, and I’m going to completely disable autocomplete.

  • AI has stale knowledge, so I won't use it for learning, especially because it's biased towards the low-quality JS repos it has been trained on.

    • A good example would be Prometheus, particularly PromQL, for which the docs are ridiculously bare, but there is a ton of material and Stack Overflow answers scattered all over the internet.

Zed was just a fast and simple replacement for Atom (R.I.P.) or VS Code. Then they put AI on top when that showed up. I don't care for it, and I appreciate a project like this that returns the program to its core.