Comment by simlevesque

4 months ago

It did run python code when I asked for a random number: https://gemini.google.com/share/dcd6658d7cc9

Then I said: "don't run code, just pick one" and it replied "I'll go with 7."

12 comments

simlevesque

But... how do you know? It says it wrote code, but that could just be text, markdown, and a template. It could just be predicting what running code looks like.

Mine also gave me 42 before I specified 1-10.

Does it always start with 42, thinking it's funny?

wasabi991011 4 months ago

This was a pretty easy hypothesis to test: I asked Gemini to generate 1,000,000 random base64 characters (which is 20x more characters than its output token limit).
It wrote code and output a file of length 1,000,000 with 6 bits of entropy per character.
You can probably ask for a longer string and do a better statistical test if it isn't convincing enough for you, but I'm pretty convinced.
Transcript: https://g.co/gemini/share/1eae0a4bb3db
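For anyone who wants to repeat the entropy check on their own output, here is a minimal sketch of how it could be done. This is not the code from the transcript: the function name `shannon_entropy_per_char` is made up for illustration, and `sample` is a locally generated stand-in for the downloaded file, built from `os.urandom`.

```python
import base64
import math
import os
from collections import Counter


def shannon_entropy_per_char(text: str) -> float:
    """Empirical Shannon entropy of the character distribution, in bits per character."""
    counts = Counter(text)
    total = len(text)
    return -sum((n / total) * math.log2(n / total) for n in counts.values())


# Stand-in for the generated file (an assumption, not the real output):
# 750,000 random bytes encode to exactly 1,000,000 base64 characters, no padding.
sample = base64.b64encode(os.urandom(750_000)).decode("ascii")

print(len(sample))
print(shannon_entropy_per_char(sample))
```

A uniformly random base64 string should come out very close to the maximum of log2(64) = 6 bits per character. Note that this only checks character frequencies; a string that cycles through the alphabet in order would score just as high, so a stricter test would also look at pairs or longer runs.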
simlevesque 4 months ago
Click on the link I provided and you'll know why I know. It's not markdown, it shows the code that was run and the output.
- BugsJustFindMe 4 months ago
  
  Be careful. Output formatting doesn't prove what you think it does. Unless you work inside Google and can inspect the computation happening, you do not have any way to know whether it's showing actual execution or only a simulacrum of execution. I've seen LLMs do exactly that and show output that is completely different from what the code actually returns.
  