Comment by nonethewiser

1 month ago

You're looking at this exactly the right way.

nonethewiser

What you're describing is not just true, it's precise.

charles_f 1 month ago
Good — you’re asking the right question
- vbezhenar 1 month ago
  
  Spaces around dash. Human detected.
  
Kevcmk 1 month ago

Dying

Do LLMs arrive at these replies organically? Is it baked into the corpus and naturally emerges? Or are these artifacts of these companies' internal prompting?

GuB-42 1 month ago

Reinforcement learning.
People like being told they are right. Given the choice, they will on average pick a response that contains that formulation more often than one that doesn't, and the LLM adapts.
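
A toy sketch of that feedback loop, not how any real lab trains its models. It assumes a Bradley-Terry style preference model in which the only feature of a response is whether it flatters the user, and it assumes raters pick the flattering response 60% of the time. The function names, the 60% figure, and the single-weight reward are all made up for illustration.

```python
import math
import random


def simulate_preferences(n_pairs, p_pick_flattering, seed=0):
    """Simulate raters comparing a flattering response against a plain one.

    Returns a list with 1 where the rater picked the flattering response
    and 0 where they picked the plain one. The pick rate is an assumption.
    """
    rng = random.Random(seed)
    return [1 if rng.random() < p_pick_flattering else 0 for _ in range(n_pairs)]


def fit_flattery_weight(choices, lr=1.0, epochs=200):
    """Fit w in P(flattering wins) = sigmoid(w) by gradient ascent.

    Under Bradley-Terry, P(A beats B) = sigmoid(r_A - r_B). With a reward of
    w for flattering responses and 0 for plain ones, the difference is just w.
    """
    w = 0.0
    for _ in range(epochs):
        p = 1.0 / (1.0 + math.exp(-w))
        # Gradient of the mean log-likelihood with respect to w.
        grad = sum(c - p for c in choices) / len(choices)
        w += lr * grad
    return w


choices = simulate_preferences(n_pairs=2000, p_pick_flattering=0.6)
w = fit_flattery_weight(choices)
print(f"learned reward bonus for flattery: {w:.3f}")
```

Even a modest preference in the ratings yields a positive reward for flattery, and a policy optimized against that reward would then be nudged toward producing it more often.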