The Connection Between the Multi-armed Bandit Problem and Reinforcement Learning
Excerpted from "Reinforcement Learning: An Introduction", version 2, 2016, Chapter 2: https://webdocs.cs.ualberta.ca/~sutton/book/bookdraft2016sep.pdf
The book's Introduction leads into Chapter 2 as follows: One of the challenges that arise in reinforc...