Comment by conductrics

13 years ago

A GA is a zeroth-order optimization method; a bandit is a type of decision problem. Specifically, a bandit is a single-state RL problem where one is trying to make decisions in an environment in order to minimize regret. A GA is a general optimization approach for when there is no gradient or second-order info about the problem to use. Take a look at XCS classifiers for an approach that can solve bandit-type problems but uses GAs to estimate the mappings between features and rewards.
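To make the distinction concrete, here is a rough sketch of the two side by side. Everything in it is an illustrative assumption rather than anything canonical: epsilon-greedy is just one simple bandit policy, the GA is a bare-bones truncation-selection variant, and the function names, arm means, and hyperparameters are made up for the example. The bandit function faces a decision problem and is scored by regret; the GA function only ever sees fitness values, never a gradient.

```python
import random


def epsilon_greedy_bandit(true_means, steps=5000, epsilon=0.1, seed=0):
    """Single-state decision problem: pick an arm each step, accumulate regret.

    true_means are hypothetical Bernoulli reward probabilities, one per arm.
    """
    rng = random.Random(seed)
    n_arms = len(true_means)
    counts = [0] * n_arms
    estimates = [0.0] * n_arms
    best_mean = max(true_means)
    regret = 0.0
    for _ in range(steps):
        if rng.random() < epsilon:
            arm = rng.randrange(n_arms)  # explore
        else:
            arm = max(range(n_arms), key=lambda a: estimates[a])  # exploit
        reward = 1.0 if rng.random() < true_means[arm] else 0.0
        counts[arm] += 1
        # Incremental update of the running mean reward for this arm.
        estimates[arm] += (reward - estimates[arm]) / counts[arm]
        # Expected regret: gap between the best arm and the one chosen.
        regret += best_mean - true_means[arm]
    return estimates, counts, regret


def genetic_algorithm(fitness, dim=2, pop_size=30, generations=100,
                      sigma=0.3, seed=0):
    """Zeroth-order optimization: uses only fitness values, no gradients."""
    rng = random.Random(seed)
    pop = [[rng.uniform(-5, 5) for _ in range(dim)] for _ in range(pop_size)]
    for _ in range(generations):
        # Truncation selection: keep the fitter half as parents (elitism).
        pop.sort(key=fitness, reverse=True)
        parents = pop[: pop_size // 2]
        children = []
        while len(parents) + len(children) < pop_size:
            a, b = rng.sample(parents, 2)
            # Uniform crossover per coordinate, then Gaussian mutation.
            child = [rng.choice(pair) + rng.gauss(0, sigma)
                     for pair in zip(a, b)]
            children.append(child)
        pop = parents + children
    return max(pop, key=fitness)


if __name__ == "__main__":
    est, counts, regret = epsilon_greedy_bandit([0.2, 0.5, 0.8])
    print("bandit pulls per arm:", counts, "total regret:", round(regret, 1))

    def sphere(v):
        # Toy objective with its maximum at (1, -2).
        return -((v[0] - 1.0) ** 2 + (v[1] + 2.0) ** 2)

    print("GA best point:", genetic_algorithm(sphere))
```

Note the difference in what gets measured: the bandit pays for every poor choice made while learning, whereas the GA is judged only on the best candidate it ends up with.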