Comment by ertdfgcvb

6 months ago

I don't follow. In this case would sampling 50/50 always give better/unbiased results on the experiment?

ertdfgcvb

Sampling 50/50 will always give you the best chance of picking the ultimate 'winner' within a fixed time horizon, at the cost of only sampling the winning variant 50% of the time. That's true whether or not the reward rates are fixed. But yeah, some changes in reward rates will also skew a MAB's aggregate statistics in a way that wouldn't happen with a 50/50 split.
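
To make the skew concrete, here is a rough sketch with made-up numbers (the traffic counts and rates are assumptions for illustration, not data from any real experiment). Both arms have the same true rate in each period, but the rate rises between periods while the bandit shifts traffic toward one arm:

```python
def aggregate_rate(allocations, rates):
    """Pooled rate: expected conversions across periods / total samples."""
    conversions = sum(n * r for n, r in zip(allocations, rates))
    return conversions / sum(allocations)

# Hypothetical: both arms convert at 10% in period 1 and 20% in period 2.
rates = [0.10, 0.20]

# 50/50 split: each arm gets equal traffic in both periods.
even_a = aggregate_rate([500, 500], rates)
even_b = aggregate_rate([500, 500], rates)

# Adaptive split: the bandit happens to move traffic to B in period 2.
mab_a = aggregate_rate([500, 100], rates)
mab_b = aggregate_rate([500, 900], rates)

print(f"50/50:    A={even_a:.3f}  B={even_b:.3f}")
print(f"adaptive: A={mab_a:.3f}  B={mab_b:.3f}")
```

Under the even split the pooled rates match, as they should for identical arms. Under the adaptive split B looks better only because more of its samples came from the high-rate period.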

zeroCalories 6 months ago
What do you think of using the epsilon-first approach then? We could explore for that fixed time horizon, then switch to greedy selection after that. I feel like the only downside is that adding new arms becomes more complicated.
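
For reference, a minimal sketch of what I mean by epsilon-first, assuming Bernoulli rewards and an even round-robin split during exploration. The function name `epsilon_first` and its parameters are hypothetical, and the version here locks in the best-looking arm once exploration ends:

```python
import random

def epsilon_first(true_rates, horizon, epsilon, seed=0):
    """Explore evenly for the first epsilon * horizon steps, then exploit."""
    rng = random.Random(seed)
    n_arms = len(true_rates)
    explore_steps = int(epsilon * horizon)
    pulls = [0] * n_arms
    wins = [0] * n_arms
    best_arm = None

    for t in range(horizon):
        if t < explore_steps:
            # Exploration phase: round-robin, i.e. an even split across arms.
            arm = t % n_arms
        else:
            if best_arm is None:
                # Pick the arm with the highest observed rate, once.
                best_arm = max(
                    range(n_arms),
                    key=lambda a: wins[a] / pulls[a] if pulls[a] else 0.0,
                )
            arm = best_arm
        # Simulated Bernoulli reward (assumed reward model).
        reward = 1 if rng.random() < true_rates[arm] else 0
        pulls[arm] += 1
        wins[arm] += reward

    return pulls, wins

pulls, wins = epsilon_first([0.05, 0.10], horizon=1000, epsilon=0.1)
print(pulls, wins)
```

Adding a new arm later is where this gets awkward: the exploration window has already closed, so you would need to reopen it somehow for the newcomer.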
- hinkley 6 months ago
  
  What percent of companies using A/B testing do you think know what the Texas Sharpshooter is and how to identify it, let alone what epsilon is or what it means?

lern_too_spel 6 months ago

Yes.