Comment by criley2
6 days ago
I asked ChatGPT o3 if lidar could damage phone sensors and it said yes https://chatgpt.com/share/683ee007-7338-800e-a6a4-cebc293c46...
I also asked Gemini 2.5 pro preview and it said yes. https://g.co/gemini/share/0aeded9b8220
I always find it interesting to test for myself when someone suggests to me that an "LLM" failed at a task.
I should have been more specific, but I believe you missed my point.
I tested this at the time on Claude 3.7 Sonnet, which has an earlier cutoff date, and I just tested again with this prompt: “Can the lidar of a self driving car damage a phone camera sensor?” The answer is still wrong in my test.
I believe the issue is the training cutoff date; that's my point. LLMs seem smart, but they have limits, and when asked about something discovered after their training cutoff date, they will sometimes be confidently wrong.
I didn't miss your point; rather, I wanted you to realize some deeper points I was trying to make:
- Not all LLMs are the same, and not identifying your tool is problematic because "LLMs can't do a thing" is very different from "the particular model I used failed at this thing". I demonstrated that by showing that many LLMs get the answer right. The former puts the onus of correctness entirely on the category of technology, and not on the tool used or the skill of the tool user.
- Training data cutoffs are only one part of the equation: tool use by LLMs allows them to search the internet and run arbitrary code (amongst many other things).
In both of my cases, the training data did not include the results either. Both used a tool call to search the internet for data.
Not realizing that modern AI tools are more than an LLM with training data, and that they have tool calling, full internet access, and the ability to reason over a wide variety of up-to-date data sources, demonstrates a fundamental misunderstanding of how these tools work.
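To make the tool-calling point concrete, here is a minimal sketch of that loop in Python. call_model and search_web are hypothetical canned stubs, not any real vendor SDK; the point is the shape of the loop: when a question falls outside the training data, the model requests a search and answers from the retrieved text instead of guessing.

def search_web(query: str) -> str:
    # Stand-in for a real search tool; returns a canned snippet here.
    return f"[canned search results for {query!r}: reports of a lidar unit damaging a camera sensor at a trade-show demo]"

def call_model(messages: list[dict]) -> dict:
    # Stand-in for a real LLM API call.
    # First turn: the model asks for a web search because the question is past its cutoff.
    # Second turn (tool results appended): it answers, grounded in those results.
    if any(m["role"] == "tool" for m in messages):
        return {"type": "answer",
                "content": "Yes, there are reports of lidar beams damaging camera sensors."}
    return {"type": "tool_call", "tool": "search_web",
            "arguments": {"query": messages[-1]["content"]}}

def answer(question: str) -> str:
    messages = [{"role": "user", "content": question}]
    while True:
        reply = call_model(messages)
        if reply["type"] == "tool_call":
            # The model decided its own knowledge isn't enough; fetch fresh data.
            messages.append({"role": "tool",
                             "content": search_web(reply["arguments"]["query"])})
        else:
            return reply["content"]

print(answer("Can the lidar of a self driving car damage a phone camera sensor?"))

The stubs are canned, but a real deployment wires call_model to a hosted model and search_web to a live search API; the loop itself is what lets the answer go past the training cutoff.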
Having said that:
Claude Sonnet 4.0 says "yes": https://claude.ai/share/001e16f8-20ea-4941-a181-48311252bca0
Personally, I don't use Claude for this kind of thing because, while it has proven to be a very good coding assistant that interacts with my IDE in an "agentic" manner, it's clearly not designed to be a deep research assistant that broadly searches the internet and other data sources to provide accurate and up-to-date information. (This means that model selection is a skill issue and that getting good results from AI tools is a skill, which is borne out by the fact that I get the right answer every time I try and you can't get the right answer once.)
Still not getting it, I think.
My point is: LLMs sound very plausible and very confident when they are wrong.
That’s it. And I was just offering a trick to help remember this and to keep checking their output, nothing else.