Alternative Title
Finding All Epsilon-Good Arms in Stochastic Bandits
Abstract
The pure-exploration problem in stochastic multi-armed bandits aims to find one or more arms with the largest (or near-largest) means. Examples include finding an ε-good arm, best-arm identification, top-k arm identification, and finding all arms with means above a specified threshold. However, the problem of finding all ε-good arms has been overlooked in past work, although arguably this may be the most natural objective in many applications. For example, a virologist may conduct preliminary laboratory experiments on a large candidate set of treatments and move all ε-good treatments into more expensive clinical trials. Since the ultimate clinical efficacy is uncertain, it is important to identify all ε-good candidates. Mathematically, the all-ε-good arm identification problem presents significant new challenges and surprises that do not arise in the pure-exploration objectives studied in the past. We introduce two algorithms to overcome these and demonstrate their great empirical performance on a large-scale crowd-sourced dataset of 2.2M ratings collected by the New Yorker Caption Contest as well as a dataset testing hundreds of possible cancer drugs.
Recommended Citation
B. Mason et al., "Finding All ε-Good Arms in Stochastic Bandits," Advances in Neural Information Processing Systems, Neural Information Processing Systems Foundation, Dec 2020.
Meeting Name
34th Conference on Neural Information Processing Systems, NeurIPS 2020 (2020: Dec. 6-12, Vancouver, Canada)
Department(s)
Computer Science
Document Type
Article - Conference proceedings
Document Version
Final Version
File Type
text
Language(s)
English
Rights
© 2020 Neural Information Processing Systems Foundation, All rights reserved.
Publication Date
12 Dec 2020
Comments
The work presented in this paper was supported by ARO grant W911NF-15-1-0479. Additionally, this work was partially supported by the MADLab AF Center of Excellence FA9550-18-1-0166.