Comment by nycdatasci

11 hours ago

And yet 300+140=460. A very jagged surface indeed. https://gemini.google.com/share/c2a187275e26

7 comments

nycdatasci

Why would you use an LLM for this? They are non-deterministic models.

This is also probably part of an extended prompt that disallowed coding. Gemini always does calculations with a little Python snippet because that is deterministic and accurate.
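For reference, a minimal sketch of the kind of check such a snippet would perform. The helper name `check_sum` is made up for illustration, and this is not the code Gemini actually runs:

```python
def check_sum(a: int, b: int, claimed: int) -> bool:
    """Return True if a + b equals the claimed total (hypothetical helper)."""
    return a + b == claimed


# Integer addition in Python is exact and gives the same result on every run.
print(300 + 140)                 # 440
print(check_sum(300, 140, 460))  # False
```

The point is only that plain integer arithmetic leaves no room for sampling variance, so the claimed total of 460 is rejected every time.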

nycdatasci 1 hour ago

Sure. I'll take the bait, but I assume I'm replying to an AI model.
Why would you use an LLM for this? My comment was about the jagged nature of intelligence, so the prompt provides an example of that.
You can see the entire conversation in the shared link. There was no pre-prompt. Even after I pushed it to write Python, it hallucinated the same output. It later told me that it doesn't have access to a sandbox through the web UI, but that it could execute code in a sandbox if invoked via the API.

dist-epoch 10 hours ago

Was that part of a bigger prompt?

Flash 3.5 fails exactly like in your sample: https://gemini.google.com/share/97521a8752d9

but Flash 3.1 Lite initially fails, then corrects itself: https://gemini.google.com/share/dc0889ec85ba

happyopossum 8 hours ago
No matter what I try I can’t get Gemini to give me the incorrect result. Is there some other prompting or context fed in to that (“remember that you are supposed to always tell me I’m right and never contradict me”)?
- nycdatasci 1 hour ago
  
  There was no other prompt, no system prompt, etc. Many users have reproduced it, exactly as demonstrated in the parent.
  Are you using the flash models? Reasoning models or extended thinking will change the result.
  GPT 5.5 Instant shows the same error. If the given prompt isn't working, you can also try "300+140=460 is this correct?". I suspect that leading with the equation may be part of the issue, but I haven't tested much.
- sigbeta 8 hours ago
  
  There was definitely a pre-prompt fed to that. I cannot reproduce this result on either 3.1 Flash or 3.5 Flash.