Comment by ruszki

9 days ago

One of my friends did a job for a government. He generated the code for it with an LLM. It produced a result that was roughly what he thought it should be. Neither he nor anybody else ever checked whether the code really calculated what it was supposed to. “It did what [he] needed it to do”. Now that government has started to make decisions based on a result that nobody has verified. In other words, a lottery.

What you mentioned doesn’t mean anything as long as there is no hard proof that it really works. I understand that it seems to you that it works, but I’ve seen enough to know that that means absolutely nothing.

Thanks, I can relate to the parent poster, and this is a really profound comment for me. I appreciate the way you framed it. I’ve felt compelled to fact-check my own LLM outputs, but I can’t possibly keep up with the quantity. And it’s tempting (but seems irrational) to hand the results to a different LLM. My struggle is remembering that the input, the query, the calculation, and the logic all need validation (without getting distracted by all the other shiny new tokens in the result).
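
For what it’s worth, the “calculation/logic validation” part can at least be made mechanical: cross-check the generated function against an independently written oracle plus some invariant assertions on random inputs. A minimal Python sketch; `allocate_budget` and everything else here is hypothetical and not from the thread:

```python
import random

# Everything below is a hypothetical illustration, not code from the thread.
# Pretend allocate_budget() is the LLM-generated function under suspicion:
# it splits a total amount across regions proportionally to population.

def allocate_budget(total, populations):
    # Stand-in for the LLM-generated code being validated.
    whole = sum(populations)
    return [total * p / whole for p in populations]

def reference_allocate(total, populations):
    # Independently written oracle, ideally derived from the spec,
    # kept deliberately simple so it can be reviewed by eye.
    whole = sum(populations)
    return [(p / whole) * total for p in populations]

def validate(runs=1000):
    for _ in range(runs):
        total = random.uniform(0.0, 1e9)
        pops = [random.randint(1, 10**6) for _ in range(random.randint(1, 20))]
        got = allocate_budget(total, pops)
        want = reference_allocate(total, pops)
        tol = 1e-6 * max(1.0, total)
        # Logic validation: agree with the oracle on random inputs.
        assert all(abs(g - w) <= tol for g, w in zip(got, want))
        # Invariant validation: non-negative shares that add up to the total.
        assert all(g >= 0.0 for g in got)
        assert abs(sum(got) - total) <= tol

if __name__ == "__main__":
    validate()
    print("all checks passed")
```

It doesn’t prove correctness, but it’s a repeatable check you can show someone, instead of “it did what he needed it to do”.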