Comment by bigyabai

9 months ago

> Out of interest, have you built an agent?

I wrote one with Google's BERT in 2020 because I was hair-on-fire ecstatic over the idea. Cargo-culted an inference library and hooked it into a Slack bot to post the changelogs. You can guess how that turned out, but yes, at one point I shared the dream. Nothing I've seen since has motivated me to try again, and the pace of Claude and ChatGPT releases hasn't changed that.

My worry is that you're getting too hyped up when there's not really any serious evidence the issues can be solved. It would be cool if they could be, but again, refer to the flying car - great dream, but avgas isn't getting any cheaper. Nor is pilot's insurance.

> Like, why don't we have Renovatebot for that class of KTLO?

Liability? If you don't keep the lights on, the business is critically impaired. An AI agent doesn't use the right address when sending the power bill - cute error in testing, catastrophic error in real life. How do we, as engineers, realistically stop an AI from doing that? How can we introduce heuristic variability without opening avenues for catastrophic, unfixable failure? You might be throwing developers under the bus by advocating for them too strongly here. "Pushbutton idempotency" and "robot that cleans my room" are a square peg trying to fit in a round hole.


ghuntley 9 months ago

With respect, a lot has changed since 2020. I appreciate your replies. Your points are valid. There is a lot to be solved. There’s some stuff that should not be automated, but there’s definitely some stuff that should.

marcus_holmes 8 months ago

Can you answer the question, though, please?
How do we stop them making catastrophic mistakes?
To my mind this is core to the whole thing. Yes, we could make 1000 robots that clean up codebases overnight, but until we can answer the above question, we should definitely, absolutely, not do that.