Comment by teej

6 months ago

Producing fair dice rolls is not an objective that cloud LLMs are optimized for. You should assume that LLMs cannot perform this task.

This is a problem when people naively use "give an answer on a scale of 1-10" in their prompts. LLMs are biased towards particular numbers (like humans!) and cannot linearly map an answer to a scale.

It's extremely concerning when teams do this in a context like medicine. Asking an LLM "how severe is this condition" on a numeric scale is fraudulent and dangerous.
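If you want to check the dice-roll claim yourself, one rough approach is to collect a batch of rolls from the model and run a chi-square goodness-of-fit test against a uniform distribution. The sketch below is only an illustration: the helper names `chi_square_stat` and `looks_fair` are made up for this example, and the `skewed` sample is invented data, not real model output.

```python
from collections import Counter

# Chi-square critical value for 5 degrees of freedom at alpha = 0.05.
CHI2_CRIT_DF5_ALPHA05 = 11.07


def chi_square_stat(rolls, sides=6):
    """Chi-square statistic of observed face counts vs. a uniform die."""
    counts = Counter(rolls)
    expected = len(rolls) / sides
    return sum(
        (counts.get(face, 0) - expected) ** 2 / expected
        for face in range(1, sides + 1)
    )


def looks_fair(rolls):
    """True if a six-sided sample is not rejected as uniform at the 5% level."""
    return chi_square_stat(rolls) <= CHI2_CRIT_DF5_ALPHA05


# Invented sample (NOT real LLM output), skewed toward 3 and 4 for illustration.
skewed = [3] * 40 + [4] * 35 + [1] * 5 + [2] * 8 + [5] * 7 + [6] * 5
# Perfectly balanced sample for comparison.
balanced = [1, 2, 3, 4, 5, 6] * 20

print("skewed looks fair:", looks_fair(skewed))
print("balanced looks fair:", looks_fair(balanced))
```

The same idea applies to "1-10" ratings: if the distribution of answers over many comparable inputs piles up on a few values, the scale is probably not being used linearly.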

9 comments


low_tech_love 6 months ago

This week I was in a meeting for a rather important scientific project at the university, and I asked the other participants “can we somehow reliably cluster this data to try to detect groups of similar outcomes?” to which a colleague promptly responded “oh yeah, ChatGPT can do that easily”.
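For contrast, a conventional clustering algorithm is short, deterministic, and inspectable. Here is a minimal 1-D k-means sketch in plain Python. It is an illustration only: the function name `kmeans_1d`, the spread-out initialization, and the `outcomes` values are all assumptions made up for this example, not anything from the project in question.

```python
def kmeans_1d(values, k, iterations=100):
    """Plain k-means on 1-D data with a deterministic initialization.

    Assumes iterations >= 1 and 1 <= k <= len(values).
    """
    ordered = sorted(values)
    # Deterministic init: pick k centers spread across the sorted data.
    centers = [ordered[(i * (len(ordered) - 1)) // max(k - 1, 1)] for i in range(k)]
    for _ in range(iterations):
        # Assignment step: each value goes to its nearest center.
        clusters = [[] for _ in range(k)]
        for v in values:
            nearest = min(range(k), key=lambda i: abs(v - centers[i]))
            clusters[nearest].append(v)
        # Update step: each center moves to the mean of its cluster.
        new_centers = [
            sum(c) / len(c) if c else centers[i] for i, c in enumerate(clusters)
        ]
        if new_centers == centers:
            break  # converged: assignments can no longer change
        centers = new_centers
    return centers, clusters


# Made-up outcome values forming three visible groups.
outcomes = [1.0, 1.2, 0.8, 5.0, 5.3, 4.9, 9.1, 8.8, 9.0]
centers, clusters = kmeans_1d(outcomes, k=3)
print(centers)
print(clusters)
```

Running it twice on the same input gives the same clusters, and every step can be audited, which is exactly what a chat answer does not offer.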

stanislavb 6 months ago
I guess he's right: it will be easy and relatively accurate. Relatively, or at least seemingly.
- low_tech_love 6 months ago
  
  So that’s it then? We replace every well-understood, objective algorithm with well-hidden, fake, superficial surrogate answers from an AI?
  

Terr_ 6 months ago

It'll also give you different results based on logically-irrelevant numbers that might appear elsewhere in the collaborative fiction document.