
Comment by arvindveluvali

7 hours ago

This is a really good point, but we don't think hallucinations pose a significant risk to us. You can think of Fresco like a really good scribe; we're not generating new information, just consolidating the information that the superintendent has already verbally flagged as important.

This is the wrong response. It doesn't matter whether you've asked it to summarize or to produce new information; hallucinations are always a question of when, not if. LLMs don't have a "summarize mode"; their mode of operation is always the same.

A better response would have been "we run all responses through a second agent who validates that no content was added that wasn't in the original source". To say that you simply don't believe hallucinations apply to you tells me that you haven't spent enough time with this technology to be selling something to safety-critical industries.
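To make the "validate that nothing was added" idea concrete, here is a minimal sketch of such a check. It is not Fresco's method and not an LLM-based second agent. It is a crude lexical stand-in that flags summary sentences containing terms absent from the source transcript. The names `content_tokens` and `unsupported_sentences` are hypothetical, and the "4+ letters or contains a digit" rule is an assumption chosen only for illustration.

```python
import re

def content_tokens(text):
    """Crude proxy for 'content' terms: lowercase words of 4+ characters,
    plus any token containing a digit (levels, counts, dates)."""
    tokens = re.findall(r"[a-z0-9]+", text.lower())
    return {t for t in tokens if len(t) >= 4 or any(c.isdigit() for c in t)}

def unsupported_sentences(source, summary):
    """Return the summary sentences that use terms never seen in the source."""
    source_terms = content_tokens(source)
    flagged = []
    # Split on whitespace that follows sentence-ending punctuation.
    for sentence in re.split(r"(?<=[.!?])\s+", summary.strip()):
        if sentence and content_tokens(sentence) - source_terms:
            flagged.append(sentence)
    return flagged

# Hypothetical transcript and generated note, for illustration only.
source = "The rebar on level 3 is missing ties near the east stairwell."
summary = "Rebar on level 3 is missing ties. Crane inspection passed on level 5."
flagged = unsupported_sentences(source, summary)
```

A check like this only catches added vocabulary. It will not catch a changed meaning that reuses the source's own words (for example a dropped "not"), and it will flag harmless paraphrases, so it would at best route notes to a human for review rather than prove a note is faithful.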

This seems odd. If your scribe can lie in complex and sometimes hard-to-detect ways, how do you not see some form of risk? What happens when (not if) your scribe misses something and real-world damage ensues as a result? Are you expecting your users to cross-check every report? And if so, what’s the benefit of your product?

  • We rely on multimodal input: the voiceover from the superintendent, as well as the video input. The two essentially cross-check one another, so we think the likelihood of lies or hallucinations is incredibly low.

    Superintendents usually still check and, if needed, edit or enrich Fresco’s notes. Editing is way faster and easier than writing notes from scratch, so even in the extreme scenario where a supe needs to edit every single note, they’re still saving ~90% of the time it’d otherwise have taken to write those notes and compile them into the right format.