Comment by jldugger 4 hours ago
Would 1.0 have fixed the wide variance in scoring?
nok22kon 1 hour ago
temperature is the wrong tool
the variance is caused by the bad evaluation prompt
if you ask "what is the capital of France" you'll always get Paris, with any (non-extreme) temperature
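a rough sketch of why, assuming the model samples from a temperature-scaled softmax. The logits below are invented for illustration (not taken from any real model), and `softmax_with_temperature` and `top_probability` are hypothetical helper names. When one answer dominates, its probability barely moves across moderate temperatures. When the prompt leaves several scores about equally plausible, the spread is there at every temperature.

```python
import math


def softmax_with_temperature(logits, temperature):
    """Turn raw logits into probabilities after dividing by the temperature."""
    scaled = [x / temperature for x in logits]
    peak = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(s - peak) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def top_probability(logits_by_token, temperature):
    """Probability of the single most likely token at this temperature."""
    probs = softmax_with_temperature(list(logits_by_token.values()), temperature)
    return max(probs)


# Assumption: made-up logits for a question with one dominant answer.
confident = {"Paris": 12.0, "Lyon": 2.0, "Marseille": 1.0}

# Assumption: made-up logits for a vague scoring prompt where
# several scores look about equally reasonable to the model.
ambiguous = {"7": 2.0, "6": 1.8, "8": 1.7}

for t in (0.5, 1.0, 1.5):
    print(
        f"T={t}: "
        f"confident top p={top_probability(confident, t):.4f}, "
        f"ambiguous top p={top_probability(ambiguous, t):.4f}"
    )
```

with these made-up numbers the confident case stays above 0.99 at all three temperatures, while the ambiguous case never reaches 0.5. Changing the temperature doesn't fix that second distribution, a more specific prompt does.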