Comment by BugsJustFindMe

4 months ago

Be careful. Output formatting doesn't prove what you think it does. Unless you work inside Google and can inspect the computation as it happens, you have no way to know whether it's showing actual execution or only a simulacrum of execution. I've seen LLMs do exactly that: show output that is completely different from what the code actually returns.

8 comments


sunaookami 4 months ago

There is being critical of something and then there is being a conspiracy theorist. Code Execution is a well-known feature of Gemini, ChatGPT, etc. and it's always shown in special blocks and it runs inside a sandbox.

colonCapitalDee 4 months ago

You can literally click "Show Code"

BugsJustFindMe 4 months ago
Yes. "Show Code", not "Show CPU cycles". There's a difference: writing code is not the same as running it. It looks to you like it ran the code, but you have no proof that it did. Many times I've seen LLM systems, from companies that claimed their LLMs would run code and return the output, report that they had run some code while the output they showed was not what that code actually produces when run.
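
One way to test a claim like this is to rerun the displayed code yourself and compare the result with the output the assistant reported. The following is only a minimal sketch: the function name `matches_claimed_output` and the example snippet are hypothetical, and it assumes the shown code is plain Python that you are willing to run locally (treat it as untrusted code).

```python
import subprocess
import sys


def matches_claimed_output(code: str, claimed_output: str, timeout: float = 10.0) -> bool:
    """Run `code` in a fresh Python interpreter and compare stdout to the claim.

    Hypothetical helper for illustration; surrounding whitespace is ignored.
    """
    result = subprocess.run(
        [sys.executable, "-c", code],
        capture_output=True,
        text=True,
        timeout=timeout,
    )
    return result.stdout.strip() == claimed_output.strip()


# Hypothetical example: the code the assistant displayed, and two possible claims.
shown_code = "print(sum(range(1, 101)))"
print(matches_claimed_output(shown_code, "5050"))  # claim agrees with a real run
print(matches_claimed_output(shown_code, "5051"))  # claim does not agree
```

A match only shows that the reported output is consistent with the code. It does not prove that the remote system executed anything, but a mismatch is direct evidence that the shown output did not come from the shown code.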
- Sophira 4 months ago
  
  In my experience, models do not tend to write their own HTML output. They tend to output something like Markdown, or a modified version of it, and they wouldn't be able to write their own HTML that the browser would parse as such.
  
- xVedun 4 months ago
  
  Maybe the only way to be sure is to have it generate an image with the value in it, produced by code rather than by a diffusion model like Stable Diffusion.
  