Comment by tracerbulletx
6 days ago
A lot of sites don't have enough traffic to get statistical significance with this in a reasonable amount of time, and it's almost always testing a feature more complicated than button color, where you aren't going to have more than the control and a variant.
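For a rough sense of scale, here's a back-of-envelope sample-size calculation for a two-proportion test (a minimal sketch; the 2% baseline conversion rate and the 10% relative lift are assumed numbers for illustration):

```python
p_a = 0.020     # baseline conversion rate (assumed)
p_b = 0.022     # variant with a 10% relative lift (assumed)
z_alpha = 1.96  # two-sided test at alpha = 0.05
z_beta = 0.84   # 80% power

# Standard unpooled sample-size formula for comparing two proportions.
var = p_a * (1 - p_a) + p_b * (1 - p_b)
n_per_arm = (z_alpha + z_beta) ** 2 * var / (p_a - p_b) ** 2
print(f"~{n_per_arm:,.0f} visitors per arm")  # ~80,600 per arm
```

That's ~160k visitors for a plain A/B split, and with an A/B/C test each arm only gets a third of the traffic, so the wait grows accordingly.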
I’ve only implemented A/B/C tests at Facebook and Google, with hundreds of millions of DAU on the surfaces in question, and even there three groups is often enough to dilute the metric below stat-sig.
> A lot of sites don't have enough traffic to get statistical significance with this in a reasonable amount of time
What's nice about A/B testing is that the decision can be made on point estimates, provided the two choices don't have different operational "costs". You don't need to know that A is better than B; you just need to pick one, and the point estimate gives the best answer with the available data.
I don't know of a way to determine whether A is better than B with statistical significance without letting the experiment run for what is, in practice, way too long.
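A minimal sketch of that decision rule (the counts here are made up for illustration):

```python
def pick_winner(conv_a, visitors_a, conv_b, visitors_b):
    """Ship the arm with the higher observed rate, significance aside."""
    rate_a = conv_a / visitors_a
    rate_b = conv_b / visitors_b
    # With symmetric costs, the higher point estimate is the best guess
    # available from the data, stat-sig or not.
    return "A" if rate_a >= rate_b else "B"

print(pick_winner(48, 2_000, 61, 2_100))  # -> B (2.4% vs ~2.9%)
```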
If effect size × site traffic is so small that you can't reach significance, why are you doing all this work in the first place? Just choose the option that makes the PHB happy and move on.
(But more likely, you don't know in advance whether there's a meaningful effect size at all.)
The PHB wanted A/B testing! True story. I spent two months convincing them that it made no sense given the volume of conversion events we had.
Another option: "I'm already doing A/B testing, trust me."
Yes, I'm wondering what the confidence intervals are too.
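They're easy enough to compute. A minimal sketch of a Wald 95% CI for the difference in conversion rates, reusing the illustrative counts from the earlier sketch:

```python
import math

def diff_ci(conv_a, n_a, conv_b, n_b, z=1.96):
    """Wald 95% CI for the difference in conversion rates (B minus A)."""
    p_a, p_b = conv_a / n_a, conv_b / n_b
    se = math.sqrt(p_a * (1 - p_a) / n_a + p_b * (1 - p_b) / n_b)
    diff = p_b - p_a
    return diff - z * se, diff + z * se

low, high = diff_ci(48, 2_000, 61, 2_100)
print(f"95% CI for the lift: [{low:+.4f}, {high:+.4f}]")
# Prints roughly [-0.0048, +0.0149]: an interval straddling zero,
# which is exactly the "not stat-sig" case discussed above.
```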