Comment by IgorPartola
13 years ago
Or, get complicated and vary the epsilon based on the recent change in conversions.
For example, record the conversions from the exploration path vs each individual exploitation attempt. Then, adjust epsilon accordingly, so that it is within a certain percentage of one of your best choices.
No comments yet
Contribute on Hacker News ↗