Comment by chrismarlow9

18 hours ago

100% agreed. Use the non-deterministic thing that is right 90% of the time to generate a deterministic thing that is right 100% of the time. One of the key things I add to my prompts is:

- Please consult me when you encounter any ambiguous edge cases

Attaching the AI directly to production so it makes API calls on its own is bad. For me the only use case where the app should do any AI stuff is reading/categorizing/etc. Basically replacing the "R" in old CRUD apps. If you want to use that same new AI-based "R" endpoint to auto-fill forms for the "C", "U", and "D" based on a prompt, that's cool, but it should never mutate anything for a customer before a human reviews it. Basically CRUD apps are still CRUD apps (and this will always be true), they just have the benefit of a very intelligent "R" endpoint that can auto-complete forms for customers (or your internal tooling/Jenkins pipelines/etc), or suggest (but never invoke) an action.

> Please consult me when you encounter any ambiguous edge cases

Why not check the logprobs of the output and take action when the probabilities of the first and second most likely tokens are too similar (or the top probability is below a certain threshold)?
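
That check is easy to sketch. A minimal version, assuming you already have the top-k log-probabilities for a token (most APIs that expose logprobs return them in log space); the `margin` and `floor` values here are made-up thresholds you would tune:

```python
import math

def is_ambiguous(top_logprobs, margin=0.1, floor=0.5):
    """Flag a token as ambiguous when the two most likely candidates
    are too close in probability, or the best one is below a floor.

    top_logprobs: log-probabilities sorted most likely first.
    margin, floor: arbitrary example thresholds, not recommendations.
    """
    probs = [math.exp(lp) for lp in top_logprobs[:2]]
    best = probs[0]
    second = probs[1] if len(probs) > 1 else 0.0
    # Escalate to a human when the model isn't clearly committed.
    return (best - second) < margin or best < floor

# Confident: ~90% vs ~5% -> no escalation needed
print(is_ambiguous([math.log(0.90), math.log(0.05)]))  # False
# Ambiguous: ~45% vs ~40% -> consult the human
print(is_ambiguous([math.log(0.45), math.log(0.40)]))  # True
```

One caveat: this only measures per-token uncertainty, so a whole-answer decision would still need some aggregation rule (e.g. flag if any token in the span is ambiguous).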