Comment by tracerbulletx
6 days ago
A lot of sites don't have enough traffic to get statistical significance with this in a reasonable amount of time, and it's almost always testing a feature more complicated than button color, where you aren't going to have more than the control and a variant.
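For a rough sense of scale, here's a back-of-envelope sample-size calculation for a two-proportion test (a minimal sketch; the 2% baseline conversion rate and the 10% relative lift are assumed numbers for illustration):

```python
p_a = 0.020     # baseline conversion rate (assumed)
p_b = 0.022     # variant with a 10% relative lift (assumed)
z_alpha = 1.96  # two-sided test at alpha = 0.05
z_beta = 0.84   # 80% power

# Standard unpooled sample-size formula for comparing two proportions.
var = p_a * (1 - p_a) + p_b * (1 - p_b)
n_per_arm = (z_alpha + z_beta) ** 2 * var / (p_a - p_b) ** 2
print(f"~{n_per_arm:,.0f} visitors per arm")  # ~80,600 per arm
```

That's ~160k visitors for a plain A/B split, and with an A/B/C test each arm only gets a third of the traffic, so the wait grows accordingly.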
I’ve only implemented A/B/C tests at Facebook and Google, with hundreds of millions of DAU on the surfaces in question, and even there three groups is often enough to dilute the metric below stat-sig.
> A lot of sites don't have enough traffic to get statistical significance with this in a reasonable amount of time
What's nice about A/B testing is that the decision can be made on point estimates, provided the two choices don't have different operational "costs". You don't need to know that A is better than B; you just need to pick one, and the point estimate gives the best answer with the available data.
I don't know of a way to determine whether A is better than B with statistical significance without letting the experiment run for what is, in practice, way too long.
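A minimal sketch of that decision rule (the counts here are made up for illustration):

```python
def pick_winner(conv_a, visitors_a, conv_b, visitors_b):
    """Ship the arm with the higher observed rate, significance aside."""
    rate_a = conv_a / visitors_a
    rate_b = conv_b / visitors_b
    # With symmetric costs, the higher point estimate is the best guess
    # available from the data, stat-sig or not.
    return "A" if rate_a >= rate_b else "B"

print(pick_winner(48, 2_000, 61, 2_100))  # -> B (2.4% vs ~2.9%)
```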
If effect size × site traffic is so small that you can't reach significance, why are you doing all this work in the first place? Just choose the option that makes the PHB happy and move on.
(But more likely, you don't know in advance whether there's a meaningful effect size at all.)
The PHB wanted A/B testing! True story. I spent two months convincing them that it made no sense given the volume of conversion events we had.
Another option: "I'm already doing A/B testing, trust me."
Yes, I'm wondering what the confidence intervals are too.
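They're easy enough to compute. A minimal sketch of a Wald 95% CI for the difference in conversion rates, reusing the illustrative counts from the earlier sketch:

```python
import math

def diff_ci(conv_a, n_a, conv_b, n_b, z=1.96):
    """Wald 95% CI for the difference in conversion rates (B minus A)."""
    p_a, p_b = conv_a / n_a, conv_b / n_b
    se = math.sqrt(p_a * (1 - p_a) / n_a + p_b * (1 - p_b) / n_b)
    diff = p_b - p_a
    return diff - z * se, diff + z * se

low, high = diff_ci(48, 2_000, 61, 2_100)
print(f"95% CI for the lift: [{low:+.4f}, {high:+.4f}]")
# Prints roughly [-0.0048, +0.0149]: an interval straddling zero,
# which is exactly the "not stat-sig" case discussed above.
```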