gchamonlive 6 hours ago
What, are you saying the OP fabricated/hallucinated the evidence?

Retr0id 6 hours ago
I'm just saying it's epistemically unrigorous to the point of being equivalent to anecdata.

gchamonlive 6 hours ago
How should one conduct such a rigorously reproducible experiment when LLMs by nature aren't deterministic and when you don't have access to the model you are comparing to from months ago?

Retr0id 6 hours ago
Something like this: https://marginlab.ai/trackers/claude-code/ (see methodology section)