Comment by gchamonlive 11 hours ago
What, are you saying the OP fabricated/hallucinated the evidence?

Retr0id 10 hours ago
I'm just saying it's epistemically unrigorous to the point of being equivalent to anecdata.

gchamonlive 10 hours ago
How should one conduct such a rigorously reproducible experiment when LLMs by nature aren't deterministic and when you don't have access to the model you are comparing to from months ago?

Retr0id 10 hours ago
Something like this: https://marginlab.ai/trackers/claude-code/ (see methodology section)