Comment by binary0010

1 day ago

Maybe try writing a simple randomizer script that swaps between the three latest models, and see if you can tell which ones are meaningfully different without knowing which one is active at any given time?
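A rough sketch of what such a blind test could look like, in Python. The model labels are placeholders, and the step that actually sends a prompt to the hidden model is left out since it depends on whichever client you use. The idea is only to hide the assignment, record your guesses, and compare your hit rate against chance.

```python
import random

# Hypothetical labels; substitute the models you actually want to compare.
MODELS = ["model-a", "model-b", "model-c"]


def make_blind_trials(models, n_trials, seed=None):
    """Return (trial_id, hidden_model) pairs with the model picked at random."""
    rng = random.Random(seed)
    return [(i, rng.choice(models)) for i in range(n_trials)]


def score_guesses(trials, guesses):
    """Fraction of trials where the guess matches the hidden model.

    `guesses` maps trial_id -> guessed model label; missing guesses count as misses.
    """
    hits = sum(1 for trial_id, model in trials if guesses.get(trial_id) == model)
    return hits / len(trials)


def chance_level(models):
    """Accuracy expected from guessing uniformly at random."""
    return 1 / len(models)
```

For each trial you would run your task against the hidden model, write down a guess before looking, and then call `score_guesses`. If your accuracy over a few dozen trials sits near `chance_level(MODELS)`, the differences you think you notice are probably not real.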

3 comments


osigurdson 21 hours ago

I find the quality ebbs and flows even on the same model. My guess is that it has something to do with GPU availability, but I'm only guessing.

atq2119 21 hours ago
Unless you're systematically repeating the exact same task, the most parsimonious explanation is that you're seeing natural variation based on different tasks, random sampling of tokens, etc.
- osigurdson 16 hours ago
  
  I don't think this explains the phenomenon, as it is more temporal in nature, not prompt to prompt. I'm sure the AI labs gracefully degrade to simpler models when resources are low. Why wouldn't they?
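One way to separate the two explanations in this thread is to repeat the exact same task at different times of day and compare the scores per hour. A minimal sketch, assuming you have already logged `(hour_of_day, score)` pairs for one fixed task; how you score an answer is up to you and is not shown here.

```python
from collections import defaultdict
from statistics import mean, pstdev


def summarize_by_hour(records):
    """Group scores for one fixed, repeated task by hour of day.

    `records` is an iterable of (hour_of_day, score) pairs.
    Returns {hour: (count, mean_score, population_stdev)}, sorted by hour.
    """
    buckets = defaultdict(list)
    for hour, score in records:
        buckets[hour].append(score)
    return {
        hour: (len(scores), mean(scores), pstdev(scores))
        for hour, scores in sorted(buckets.items())
    }
```

If the per-hour means differ by much more than the spread within each hour, that points toward a time-dependent effect. If they overlap, ordinary sampling variation is the simpler explanation.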