
Comment by jorvi

16 days ago

My mother did, for Christmas. It was a goose that ended up being raw in a lot of places.

I then pointed out this same inconsistency to her and told her she shouldn't put stock in what Gemini says. Testing it myself, it would give results anywhere between 47°C and 57°C. And sometimes it would just trip out and give the health-approved temperature, which is 74°C (!).

Edit: just tested it again and it still happens. But inconsistency isn't a surprise for anyone who actually knows how LLMs work.

https://imgur.com/a/qYmznHa

I just asked Gemini 3 five times: `what temperature I should take a waterfowl out of the oven`

and received generic advice every single time; it gave nearly identical charts, and 165°F was in every response. LLMs are unpredictable, yes, but I'm more skeptical that it gave an incorrect answer (raw goose) than that your mother prepared the fowl wrong.
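
If anyone wants to reproduce the test themselves, here's a minimal sketch of that repeated-prompt check, assuming the google-generativeai Python client; the model name and API key handling are placeholders, not necessarily what was actually used:

```python
# Rough sketch of the repeated-prompt consistency check described above.
# Assumes the google-generativeai client; model name is a placeholder.
import google.generativeai as genai

genai.configure(api_key="YOUR_API_KEY")  # assumption: key supplied here
model = genai.GenerativeModel("gemini-1.5-flash")  # placeholder model name

prompt = "what temperature I should take a waterfowl out of the oven"

# Ask the same question five times and collect the text of each reply.
answers = [model.generate_content(prompt).text for _ in range(5)]

# Count how many of the five replies mention the food-safety figure, 165°F.
hits = sum("165" in a for a in answers)
print(f"{hits}/5 responses mention 165F")
```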

Cooking correctly is a skill, just as prompting is. Ask 10 people how to cook fowl and their answers will vary just as much as the LLM's.

> But inconsistency isn't a surprise for anyone who actually knows how LLMs work

Exactly. These people saying they've gotten good results for the same question aren't countering your argument. All they're doing is proving that sometimes it can output good results. But a tool that's randomly right or wrong is not a very useful one. You can't trust any of its output unless you can validate it. And for a lot of the questions people ask of it, if you have to validate it, there was no reason to use the LLM in the first place.