So I agree with that - but on the other hand, surely the number of developers matters here? For example, if instead of 16 developers the study had a single developer complete all 246 tasks, with or without AI, and compared the observed completion times, I think most people would question the reproducibility and relevance of the study.
It matters in the sense that it is unclear whether the findings generalise to other people. That is a problem many studies have, even ones with more participants, because their participants may not be diverse enough.
But in terms of pure statistical validity, I don't think it matters.
Whilst my recent experience possibly agrees with the findings, I came here to moan about the methods. Whether it's 16 or 246, that's still a miserably small sample size.
Okay, so why not 246,000 issues?
If you read through the methodology, including how they paid the participants $150/hr for 20-40 hours of work per participant, you can probably hazard a guess why they didn't scale up the size of the study by 1000x.
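For a rough sense of scale, here is a back-of-the-envelope sketch using only the figures quoted in this thread (16 developers, $150/hr, 20-40 hours each). It assumes participant pay scales linearly with study size and ignores every other cost; the function name is made up for illustration.

```python
def participant_cost(developers, hourly_rate, hours_each):
    """Total participant pay in dollars, ignoring all other study costs."""
    return developers * hourly_rate * hours_each


# Figures quoted in the thread: 16 developers, $150/hr, 20-40 hours each.
low = participant_cost(16, 150, 20)
high = participant_cost(16, 150, 40)

print(f"as run: ${low:,} to ${high:,}")
# Assumption: a 1000x larger study costs 1000x as much in participant pay.
print(f"1000x:  ${low * 1000:,} to ${high * 1000:,}")
```

On those assumptions the participant pay alone comes to roughly $48,000 to $96,000 as run, and $48 million to $96 million at 1000x.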