Comment by potatolicious

6 months ago

> "I think they're trying to say AI generated code-pushers are often getting fuzzy on speccing out the behavior guarantees of their own software."

I agree, and I think that's the root of the years-long argument about whether programmers are "real" engineers, where "real engineering" implies a level of rigor about the existence of, and adherence to, specifications.

My take, though, is that this unseriousness has little to do with AI and everything to do with the longstanding culture of software generally. In fact, I'd go as far as to say that pre-LLM ML was better about this than the rest of the industry at large.

I've had the good fortune to be working in this realm since before LLMs became the buzzword - most ML teams had well-quantified model behaviors! They knew their precision and recall! You kind of had to, because it was very hard to get models to do what you wanted, plus companies involved in this space generally cared about outcomes.
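For anyone who hasn't had to compute these by hand, here's a minimal sketch of what "knowing your precision and recall" means for a binary classifier. The function name and the toy labels are made up for illustration; a real team would run this against a held-out evaluation set.

```python
def precision_recall(y_true, y_pred):
    """Compute precision and recall for binary labels (1 = positive, 0 = negative)."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)  # true positives
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)  # false positives
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)  # false negatives
    # Guard against division by zero when there are no predicted or actual positives.
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    return precision, recall


# Hypothetical ground truth and model predictions, purely illustrative.
y_true = [1, 0, 1, 1, 0, 0, 1, 0]
y_pred = [1, 0, 0, 1, 1, 0, 1, 0]

precision, recall = precision_recall(y_true, y_pred)
print(f"precision={precision:.2f} recall={recall:.2f}")
```

Precision is the share of predicted positives that were right; recall is the share of actual positives the model caught. Two numbers, and you can argue about them concretely.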

Then we got LLMs, where you can easily produce superficially impressive results, and with them the dominance of vibes over results. I can't stand it either, and am mostly just waiting for most of these things to go bust so we can go back to probabilistic systems where we give a shit about quantification.

whatevertrevor 6 months ago

I agree with that.

I think part of the issue with the lack of "real" quantification in the results of LLMs is that the output and problem domains are so ill-defined. With standard neural nets (and other kinds of ML), classifiers, regression models, and reinforcement learning models all had very narrow, domain-specific problems to solve. It was a no-brainer to measure directly how your vision classifier performed against a radiologist in determining whether an image showed lung cancer.

Now we've opened up the output to a wide variety of open-ended domains: natural languages, programming languages, images, and videos. Since the output domain is inherently subjective, it's hard to get a good handle on these models' usefulness, let alone get people to agree on it. Hence the never-ending discourse around them.