Comment by hn-acct

4 months ago

How do you quantify or decide on an acceptable failure rate for LLM output?

2 comments

hn-acct

The same way as for any other production ML model, or in any field that requires quality control. Conceptually, this is no different from applying any other technology or body of knowledge, which is close to a verbatim definition of engineering.

avemuri 4 months ago

It depends on the failure mode and the application. As a first approximation, though, do it the same way you would for human output. E.g., process engineering for a support chatbot shares many principles with process engineering for a human-staffed call center.
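
For what it's worth, here is a rough sketch of what the quality-control framing could look like in practice: sample some outputs, have them reviewed, and check whether a conservative estimate of the failure rate stays under a target you picked for the application. Everything here is an assumption for illustration, not something from the thread: the function names, the Wilson score interval as the estimator, and the 95% confidence level (z = 1.96).

```python
import math


def wilson_upper_bound(failures: int, n: int, z: float = 1.96) -> float:
    """Upper end of the Wilson score interval for a failure proportion.

    failures: number of reviewed outputs judged to be failures
    n:        total number of reviewed outputs
    z:        normal quantile (1.96 is roughly a 95% two-sided interval)
    """
    if n <= 0:
        raise ValueError("n must be positive")
    if not 0 <= failures <= n:
        raise ValueError("failures must be between 0 and n")
    p = failures / n
    denom = 1 + z**2 / n
    center = p + z**2 / (2 * n)
    margin = z * math.sqrt(p * (1 - p) / n + z**2 / (4 * n**2))
    return (center + margin) / denom


def meets_target(failures: int, n: int, max_failure_rate: float,
                 z: float = 1.96) -> bool:
    """True if even the pessimistic estimate is within the accepted rate."""
    return wilson_upper_bound(failures, n, z) <= max_failure_rate


if __name__ == "__main__":
    # Hypothetical review: 3 failures found in 400 sampled chatbot replies,
    # with an (assumed) acceptable failure rate of 2%.
    print(wilson_upper_bound(3, 400))
    print(meets_target(3, 400, max_failure_rate=0.02))
```

The acceptable rate itself still has to come from the application (cost of a failure, what a human baseline achieves); the code only checks a sample against whatever number you settle on.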