Comment by antasvara

15 days ago

This is what a lot of people miss. We have thousands of years of understanding the kinds of mistakes that humans make; we only have months to years of experience with the mistakes that LLMs and other AIs make.

This means that most of our verification and testing processes won't inherently catch AI errors, because they're designed to catch human errors. A check like "verify that the two sides of this transaction sum to zero" is fine for catching human typos, but it won't catch a fake (yet accurately entered) transaction.
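To make that concrete, here's a toy sketch (the transaction format is made up): a standard double-entry balance check flags a transposition typo, but happily passes an invented transaction whose legs were entered to net to zero.

```python
from decimal import Decimal

def balances(transaction):
    """Classic double-entry check: all legs of a transaction must net to zero."""
    return sum(Decimal(leg["amount"]) for leg in transaction["legs"]) == 0

# A human typo (dropped digit) fails the check and gets caught:
typo_txn = {"legs": [{"amount": "100.00"}, {"amount": "-10.00"}]}
assert not balances(typo_txn)

# A fabricated transaction that was entered carefully passes the check:
fake_txn = {"legs": [{"amount": "100.00"}, {"amount": "-100.00"}]}
assert balances(fake_txn)  # internally consistent, still fake
```

The check validates internal consistency, not correspondence with reality, and an AI's failure mode is often to produce output that is perfectly consistent and entirely wrong.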

It's similar to a language barrier. You don't realize how much you rely on context clues until you've spent three days of emails trying to communicate a complex topic to someone in their second language.

> we only have months to years of experience with the mistakes that LLMs and other AIs make.

The mistakes are also very much model-dependent. The fact that you have built a system that improves the accuracy of one model's output gives you no confidence that it will work on even the next generation of the same model.