Comment by LPisGood

1 year ago

This objection is mentioned specifically in the post.

You can add a forgetting factor for older results.

3 comments

LPisGood

This seems like a fudge factor though. Some things are changed bc you act on them! (e.g. recommendation systems that are biased towards more popular content). So having dynamic groups makes the data harder to analyze

LPisGood 1 year ago

A standard formulation of MAB problem assumes that acting will impact the rewards, and this forgetting factor approach is one which allows for that and still attempts to find the currently most exploitable lever.

lern_too_spel 1 year ago

That's a different problem. In jbentley1's scenario, A could be better, but this algorithm will choose B.