Alternative Title

Finding All Epsilon-Good Arms in Stochastic Bandits

Abstract

The pure-exploration problem in stochastic multi-armed bandits aims to find one or more arms with the largest (or near-largest) means. Examples include finding an ε-good arm, best-arm identification, top-k arm identification, and finding all arms with means above a specified threshold. However, the problem of finding all ε-good arms has been overlooked in past work, although this is arguably the most natural objective in many applications. For example, a virologist may conduct preliminary laboratory experiments on a large candidate set of treatments and move all ε-good treatments into more expensive clinical trials. Since the ultimate clinical efficacy is uncertain, it is important to identify all ε-good candidates. Mathematically, the all-ε-good arm identification problem presents significant new challenges and surprises that do not arise in the pure-exploration objectives studied in the past. We introduce two algorithms to overcome these challenges and demonstrate their strong empirical performance on a large-scale crowd-sourced dataset of 2.2M ratings collected by the New Yorker Caption Contest, as well as on a dataset testing hundreds of possible cancer drugs.

Meeting Name

34th Conference on Neural Information Processing Systems, NeurIPS 2020 (2020: Dec. 6-12, Vancouver, Canada)

Department(s)

Computer Science

Comments

The work presented in this paper was supported by ARO grant W911NF-15-1-0479. Additionally, this work was partially supported by the MADLab AF Center of Excellence FA9550-18-1-0166.

Document Type

Article - Conference proceedings

Document Version

Final Version

File Type

text

Language(s)

English

Rights

© 2020 Neural Information Processing Systems Foundation, All rights reserved.

Publication Date

12 Dec 2020
