Comment by datsci_est_2015

9 days ago

Anecdotally, I’m finding that, at least in the Spark ecosystem, AI-generated ideas and code are far from optimal. Some of this comes from misinterpreting the (sometimes poor) documentation, and some of it comes from, probably, there not being as many open source examples as CRUD apps, which AI “influentists” (to borrow from TFA) appear to often be hyping up.

This matters a lot to us because the difference in performance of our workflows can be the difference in $10/day in costs and $1000/day in costs.

Just like TFA stresses, it’s the expertise in the team that pushes back against poor AI-generated ideas and code that is keeping our business within reach of cash flow positive. ~”Surely this isn’t the right way to do this?”

Most text worth paying for (code, contracts, research) requires:

- accountability

- reliability

- validation

- security

- liability

Humans can reliably produce text with all of these features. LLMs can reliably produce text with none of them.

If it doesn't have all of these, it could still be worth paying for if it's novel and entertaining. IMO, LLMs can't really do that either.

  • Let's not put humans on too much of a pedestal, there are plenty of us who are not that reliable either. That's why we have tests, linting, types and various other validation systems. Incidentally, LLMs can utilize these as well.

    • Humans are unreliable in predictable ways. This makes review relatively painless since you know what to look for, and you can skim through the boilerplate and be pretty confident that it's right and isn't redundant/insecure, etc.

      LLMs can use linters and type checkers, but getting past them often times leads it down a path of mayhem and destruction, doing pretty dumb things to get them to pass.