Comment by jagged-chisel

3 days ago

> Reproducible would be great

Wouldn’t it be great? I’m still waiting for reproducibility from LLMs.

4 comments

jagged-chisel

Can you reproduce irreproducibility?

Give me a question that the LLM answers vastly differently across runs.

I keep hearing how it's dumb and wrong, but no one ever shares the chat or prompt.

jagged-chisel 2 days ago

Yes. https://news.ycombinator.com/item?id=48420769

uxhacker 3 days ago

Try this with ChatGPT, Grok, or Claude:
How many days of the week contain the letter d?
The answer I get is 3 with ChatGPT and Grok, and 6 with Claude.
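
For reference, the question has a checkable answer. Here is a minimal Python sketch, assuming English day names and reading "contain" as "has the letter at least once". The function name `count_days_containing` and its `case_sensitive` flag are made up for illustration.

```python
# English day names; assumed to be what the question refers to.
DAYS = ["Monday", "Tuesday", "Wednesday", "Thursday",
        "Friday", "Saturday", "Sunday"]


def count_days_containing(letter: str, case_sensitive: bool = False) -> int:
    """Count the day names that contain `letter` at least once."""
    if case_sensitive:
        return sum(letter in day for day in DAYS)
    return sum(letter.lower() in day.lower() for day in DAYS)


print(count_days_containing("d"))                       # 7: every name ends in "day"
print(count_days_containing("D", case_sensitive=True))  # 0: no name has an uppercase D
```

Under those assumptions the count is 7, or 0 if the match is taken to be case-sensitive on an uppercase "D".
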
- jagged-chisel 2 days ago
  
  I used only ChatGPT, twice: the web interface in a Firefox private window and in a Chrome incognito window. I asked both the identical question: "How many names of the days of the week contain the letter D?"
  In Firefox I got 6. In Chrome I got 7. LLMs are not even self-consistent.
  I have the screenshots if anyone cares.
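
The manual check above (same prompt, fresh sessions, compare the answers) could be scripted roughly like this. It is only a sketch: `tally_answers` is a hypothetical helper, `ask` stands for whatever function sends a prompt to a model and returns its reply text, and `fake_ask` is a stand-in that replays the two answers reported in this thread instead of calling any real service.

```python
import re
from collections import Counter
from typing import Callable


def tally_answers(ask: Callable[[str], str], prompt: str, runs: int = 5) -> Counter:
    """Send the same prompt `runs` times and count the first integer in each reply."""
    counts: Counter = Counter()
    for _ in range(runs):
        reply = ask(prompt)
        match = re.search(r"\d+", reply)
        counts[match.group() if match else "<no number>"] += 1
    return counts


# Stand-in for a real model call (assumption): replays the answers 6 and 7
# reported above rather than querying anything.
_canned = iter(["6", "7"])


def fake_ask(prompt: str) -> str:
    return f"The answer is {next(_canned)}."


result = tally_answers(
    fake_ask,
    "How many names of the days of the week contain the letter D?",
    runs=2,
)
print(result)
print("self-consistent" if len(result) == 1 else "inconsistent")
```

Swapping `fake_ask` for a real client and raising `runs` would give a shareable tally rather than a pair of screenshots.
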