Comment by epistasis

1 year ago

I keep on trying them, but if they are useful they are useful for only a small fraction of engineers at the moment. I'm not sure if this is due to the nature of the work, or the nature of the user.

I have heard "top" engineers at various places say it makes them 2x faster, or whatever, but I would like to see this assessed by timed testing, as is sometimes done for evaluating software engineering.

Copilot may let me type less, but I have not seen the wall clock effects, which is a very hard thing to measure (time perception is very unreliable).

https://news.ycombinator.com/item?id=43071381

You can see this example, where I timed myself to deployment using AI tools to rewrite a Show HN project in half an hour. The code is open source.

My comment was posted 2 hours after the Show HN, when I saw it on the front page, so you know I didn't lose track of the time I spent.

  • The vast majority of software work is not greenfielding a PoC or reimplementing an existing, small, well-specced project. We’ve had OpenAPI client generators for years after all.

    The majority of software work is maintaining large, existing products: adding features, fixing bugs, improving performance, etc., or building new software in problem domains that aren’t so well-defined.

    • This is my experience too.

      I think it also really accelerates learning of a new language or framework, when that language or framework is really well documented on the web. For novel programming frameworks, obviously it's a bit more challenging to get help from an LLM.

      One of my more recent attempts at using LLM code assist was to try to fix a bug in a Swift SSH Agent's connection handling that was causing hangs. I know zero Swift, much less the networking frameworks. So I pumped the output of `tree` on the git repo into the LLM, asked which file likely handled connections, and it found it right away. That's probably 15 minutes saved. Before putting in the file I asked for likely reasons for dead hangs, got that list, then put in the Swift file that handled connections, and it pointed to what the likely problem was. That's probably 1 hour+ of reading documentation to try to figure out what the code was doing wrong with the networking framework, assuming the LLM was not hallucinating. And the probability of hallucination is high enough in my experience that I spend >50% of my time trying to verify I'm not getting bullshitted.

      The LLM proposed a fix (~10-20 minute savings), but even to somebody who doesn't use Swift it seemed like there was a >99% chance that it had just introduced a bunch of race conditions in the data structures it used to track connection status. So I asked about it, and it said "Oh yeah of course how could I forget" and then significantly complicated the solution with something that I thought looked like it probably worked. But was the LLM just being obsequious, or was it correct the first time? So hard to tell...

      So in about 20 minutes I probably accomplished in a language I didn't know, in a code base I didn't know, about 2 hours+ of learning.

      But if I knew the language, it would have saved me very little time, and may have cost me some time.

    • In my opinion, AI only really helps you (a lot) if you are bottlenecked by the actual code-writing. I have not been in such a position since... I don't even know. Maybe in my 20s, 15 or so years ago? Even if AI wrote my code 100x faster it would not appreciably change my working days.

      If it could test and verify things though... ideally physically, since I'm in embedded and pulling SD cards etc. is a thing.

  • I agree it's impressive and stuff, but I wouldn't consider a JS POC a serious project. I have never done that in my whole life and would rather see results from a 10-year-old application with a million lines of C++. That would be realistic. What you did is refactor a pet project, and I don't know why we're wasting $billions on that.