Comment by groby_b
5 days ago
That's a statistically valid approach. Technically correct, the best kind of correct.
Meanwhile, if your users are shown a different button every time they come by because the MAB is still pursuing its hill climbing, they'll rightfully accuse you of having extremely crappy UX. (And, sure, you can have MAB with user stickiness, but now you do need to talk about sampling bias.)
And MAB hill climbing doesn't work at all if you want to measure the long-term reward of a variation. You have no idea whether the orange button has a long-term retention impact. There surely are situations where you'd like to know.
Yes, it's a neat technique to have in your repertoire, but like any given technique, it's not the answer "every time".
A/B testing has the same problem unless you figure out how to treat the same user the same way each time. But yeah, I generally don't give much credence to an assertion like this that isn't based on real-world experience. It's not even "this algo might help," it's "will beat A/B testing every time."
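
For what it's worth, one common way to "treat the same user the same way each time" is to hash a stable user ID into a bucket instead of rolling the dice on every visit. Here is a minimal sketch of that idea in Python; the function name `assign_variant`, the experiment key, and the variant labels are all made up for illustration, and it assumes you have a stable `user_id` to begin with (which is its own problem for logged-out traffic).

```python
import hashlib
from typing import Sequence


def assign_variant(user_id: str, experiment: str, variants: Sequence[str]) -> str:
    """Deterministically map a user to one variant of an experiment.

    Hypothetical helper: the same (experiment, user_id) pair always yields
    the same variant, so a returning user keeps seeing the same button.
    """
    # Salt the hash with the experiment name so that a user's bucket in
    # one experiment is not tied to their bucket in another.
    key = f"{experiment}:{user_id}".encode("utf-8")
    digest = hashlib.sha256(key).digest()
    # Interpret the first 8 bytes as an integer and reduce it to an index.
    bucket = int.from_bytes(digest[:8], "big") % len(variants)
    return variants[bucket]


if __name__ == "__main__":
    variants = ["orange_button", "blue_button"]
    for uid in ["alice", "bob", "carol"]:
        # Calling twice with the same inputs gives the same answer.
        first = assign_variant(uid, "button_color", variants)
        second = assign_variant(uid, "button_color", variants)
        print(uid, first, first == second)
```

This only covers fixed, equal-weight splits. A bandit that keeps shifting traffic toward the current winner would need to store each user's first assignment somewhere, and that is where the sampling-bias question from the parent comment comes back in.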