Comment by Der_Einzige

1 month ago

These results would be radically different if you allowed manipulation of the model's settings, e.g. temperature, top_p, etc. I really hate taking pointwise approximations of LLMs' outputs and drawing conclusions about their behavior from them.

Model behavior should be given the asterisk that "results only apply for current quantization, current settings, current hardware (i.e. the A100 where it was tested), etc".

Raise temperature to 2 and use a fancy sampler like min_p and I guarantee you these results will be dramatically different.
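For anyone unfamiliar with these knobs, here is a minimal sketch of how temperature and a min_p filter reshape a next-token distribution. The logits are made up for illustration, the helper names are hypothetical, and the order of operations (temperature first, then min_p) is an assumption; real inference stacks differ on this.

```python
import math

def softmax(logits, temperature=1.0):
    """Convert logits to probabilities after dividing by the temperature."""
    scaled = [x / temperature for x in logits]
    m = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(x - m) for x in scaled]
    total = sum(exps)
    return [e / total for e in exps]

def min_p_filter(probs, min_p):
    """Keep tokens whose probability is at least min_p times the top
    token's probability, then renormalize the survivors."""
    threshold = min_p * max(probs)
    kept = [p if p >= threshold else 0.0 for p in probs]
    total = sum(kept)
    return [p / total for p in kept]

# Made-up logits for five candidate tokens (illustration only).
logits = [5.0, 4.0, 1.0, 0.0, -2.0]

for t in (0.7, 1.0, 2.0):
    probs = softmax(logits, temperature=t)
    filtered = min_p_filter(probs, min_p=0.1)
    survivors = sum(1 for p in filtered if p > 0)
    print(f"T={t}: top prob {max(probs):.3f}, tokens kept by min_p: {survivors}")
```

Higher temperature flattens the distribution, so more tokens clear the min_p threshold and the sampled output can differ a lot from what a single default setting shows.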


da_chicken 1 month ago

That's like asking to judge the chef by what you imagine the meal could taste like rather than what's on the table.

I don't care what might have been. I care about what's for dinner.