Comment by 59nadir

6 days ago

Honestly, the one through-line I've seen with regard to the success of AI in programming is that it works very well for trivial, mass-produced bullshit, and anyone who was already doing that for work will feel like it can do their job almost entirely (and it probably can).

I don't really doubt that AI can put together your Nth Rails backend that does nothing of note pretty solidly, but I know it can't even write a basic, functioning tokenizer + parser for a Clojure-like language in a very simple, imperative language (Odin). It couldn't write the parser even when it was given the source for a working tokenizer.
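
To give a sense of the scale of the task: a tokenizer for a Clojure-like surface syntax is essentially one loop over the source, something like the sketch below. (Mine was in Odin; this one is in Ruby purely for illustration, and the names and token categories are my own, not what I actually used.)

    # Minimal tokenizer sketch for a Clojure-like surface syntax (illustrative only).
    Token = Struct.new(:kind, :text)

    def tokenize(source)
      tokens = []
      i = 0
      while i < source.length
        c = source[i]
        case c
        when /\s/, ","                  # whitespace and commas are insignificant
          i += 1
        when "(", ")", "[", "]", "{", "}"
          tokens << Token.new(:delimiter, c)
          i += 1
        when '"'                        # string literal: naive scan to the closing quote
          j = i + 1
          j += 1 until j >= source.length || source[j] == '"'
          tokens << Token.new(:string, source[i..j])
          i = j + 1
        when ";"                        # comment runs to end of line
          i += 1 until i >= source.length || source[i] == "\n"
        else                            # symbols, keywords and numbers
          j = i
          j += 1 until j >= source.length || source[j] =~ /[\s,()\[\]{}"]/
          text = source[i...j]
          tokens << Token.new(text =~ /\A-?\d/ ? :number : :symbol, text)
          i = j
        end
      end
      tokens
    end

    tokenize("(defn add [a b] (+ a b))").map(&:text).join(" ")
    # => "( defn add [ a b ] ( + a b ) )"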

These are very basic things that I would expect juniors with some basic guidance to accomplish, but even with Cursor + Claude Sonnet 3.5 (this was 2-3 months ago; I had seen recommendations for exactly the combination of tools I was using, so I don't buy the argument that the choice of tools was the problem) it fell apart and even started re-adding functions it had already written. At some point I seeded it with properly written parser functions to give it examples of what it needed to accomplish, but it still basically failed completely despite having access to literally all the code it needed.

I can't even imagine how badly it'd fail to handle the actual complicated parts of my work where you have to think across 3 different context boundaries (simulation -> platform/graphics API -> shader) in order to do things.

> I don't really doubt that AI can put together your Nth Rails backend that does nothing of note pretty solidly

Ha. Funny you should say that... recently I've been using AI to green-field a new Rails project, and my experience with it has been incredibly mixed, to say the least.

The best agents can, more or less, crank out working code after a few iterations, but it's brittle, and riddled with bad decisions. This week I had to go through multiple prompt iterations trying to keep Claude 3.7 from putting tons of redundant logic in a completely unnecessary handler block for ActiveRecord::RecordNotFound exceptions -- literally 80% of the action logic was in the exception handler, for an exception that isn't really exceptional. It was like working with someone who had just learned about exceptions and was hell-bent on using them for everything. If I hadn't been paying attention the code might have worked, I suppose, but it would have quickly fallen apart into an incomprehensible mess.
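
To make that concrete -- this is a loose reconstruction from memory with made-up model and field names, not the actual generated code -- the actions kept coming out shaped like the first version below, with the bulk of the logic inside the rescue block, where the second version is closer to what you'd normally write:

    # Roughly the shape the agent kept producing (illustrative, not the real code):
    def show
      @order = Order.find(params[:id])
      render json: @order
    rescue ActiveRecord::RecordNotFound
      # Most of the action's real logic ends up here, for a case that isn't exceptional.
      @order = Order.find_by(public_id: params[:id])
      if @order
        Rails.logger.warn("order #{params[:id]} looked up by public_id")
        render json: @order
      else
        render json: { error: "order not found" }, status: :not_found
      end
    end

    # Closer to idiomatic Rails: treat "not found" as an expected branch,
    # or rescue_from ActiveRecord::RecordNotFound once in ApplicationController.
    def show
      @order = Order.find_by(id: params[:id]) || Order.find_by(public_id: params[:id])
      if @order
        render json: @order
      else
        render json: { error: "order not found" }, status: :not_found
      end
    end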

The places where the AI really shines are in boilerplate situations -- it's great for writing an initial test suite, or for just cranking out a half-working feature. It's also useful for rubber ducking, and more than occasionally breaks me out of debugging dead ends, or system misconfiguration issues. That's valuable.

In my more cynical moments, I start to wonder if the people who are most eager to push these things are 0-3 years out of coding bootcamps, and completely overwhelmed by the boilerplate of 10+ years of bad front-end coding practices. For these folks, I can easily see how a coding robot might be a lifeline, and it's probably closer to the sweet spot for the current AI SOTA, where literally everything you could ever want to do has been done and documented somewhere on the web.

  • > The best agents can, more or less, crank out working code after a few iterations, but it's brittle, and riddled with bad decisions.

    You're right -- I'm very likely overestimating the output, even though I'm already on the skeptical end.

    > The places where the AI really shines are in boilerplate situations -- it's great for writing an initial test suite

    I definitely do agree with this, and I would add that at that point you can really make do with tab-completion and not a full agent workflow. I used this successfully even back in 2021-2022 with Copilot.

    > In my more cynical moments, I start to wonder if the people who are most eager to push these things are 0-3 years out of coding bootcamps

    I think it's, all in all, a mix of a lot of factors: spending your time mostly on well-trodden ground will definitely make GenAI feel more useful than it would if you weren't, and I think most newer programmers spend most of their time on exactly that ground. Even at work, they may be given tasks that are more boilerplate-heavy in nature; as a rule they're not making deep design decisions, or having anything to do with holistic architectural decisions.

    • > I definitely do agree with this, and I would add that at that point you can really make do with tab-completion and not a full agent workflow. I used this successfully even back in 2021-2022 with Copilot.

      Sure. The thing that "agents" add to this -- and I think it's actually really valuable -- is the ability to run their own output and fix the bugs.

      Yesterday I pointed an agent at a controller, asked it to write a test suite (with some core requirements about what exactly to test), and then reviewed the content of the tests for sanity. Then I pointed another agent at that output and told it to run the tests and fix the bugs.

      It got stuck once or twice, requiring me to say "don't do that" and/or change a few lines of code, but overall I had a much more comprehensive test suite than I ever would have written, in about 15 minutes of drinking coffee.
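
      To give a sense of what kind of tests I mean, the output was roughly along the lines of the request spec below (an illustrative sketch with invented controller and model names, not the actual generated suite):

          # Illustrative RSpec request spec, not the actual generated code.
          require "rails_helper"

          RSpec.describe "Orders", type: :request do
            describe "GET /orders/:id" do
              it "returns the order when it exists" do
                order = Order.create!(status: "pending")

                get "/orders/#{order.id}"

                expect(response).to have_http_status(:ok)
                expect(response.parsed_body["id"]).to eq(order.id)
              end

              it "returns 404 for a missing order" do
                get "/orders/0"

                expect(response).to have_http_status(:not_found)
              end
            end
          end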