
Comment by atq2119

1 day ago

Unless you're systematically repeating the exact same task, the most parsimonious explanation is that you're seeing natural variation based on different tasks, random sampling of tokens, etc.

I don't think this explains the phenomenon, as it is more temporal in nature: it varies over time, not prompt to prompt. I'm sure the AI labs gracefully degrade to simpler models when resources are low. Why wouldn't they?