![[Write to Learn] How Exploring Can Increase Long-Term Reward? An Illustration Using Epsilon-Greedy Method.](/content/images/size/w720/2024/03/48187dc35a15c033986c222bb7028a66.jpg)
[Write to Learn] How Exploring Can Increase Long-Term Reward? An Illustration Using Epsilon-Greedy Method.
The example discussed here comes from Reinforcement Learning: An Introduction by Sutton and Barto. What this article adds is a more detailed explanation, along with code snippets to replicate the example.
The 10-armed Bandit Example
For readers not familiar with the k-armed bandit problem, you are repeatedly faced