Abstract
This paper studies the problem of adaptively sampling from K distributions (arms) in order to identify the largest gap between any two adjacent means. We call this the MaxGap-bandit problem. This problem arises naturally in approximate ranking, noisy sorting, outlier detection, and top-arm identification in bandits. The key novelty of the MaxGap-bandit problem is that it aims to adaptively determine the natural partitioning of the distributions into a subset with larger means and a subset with smaller means, where the split is determined by the largest gap rather than a pre-specified rank or threshold. Estimating an arm's gap requires sampling its neighboring arms in addition to itself, and this dependence results in a novel hardness parameter that characterizes the sample complexity of the problem. We propose elimination and UCB-style algorithms and show that they are minimax optimal. Our experiments show that the UCB-style algorithms require 6-8x fewer samples than non-adaptive sampling to achieve the same error.
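A minimal sketch (in Python with numpy; not the authors' implementation) of the quantity the MaxGap-bandit problem targets: given mean values, the largest gap between adjacent means and the two-set partition it induces. The function name and toy data below are hypothetical illustrations.

    import numpy as np

    def max_adjacent_gap(means):
        """Largest gap between adjacent means (sorted in decreasing order)
        and the split of arms it induces. Illustrative only."""
        order = np.argsort(means)[::-1]              # arm indices, decreasing mean
        sorted_means = np.asarray(means)[order]
        gaps = sorted_means[:-1] - sorted_means[1:]  # gaps between adjacent means
        j = int(np.argmax(gaps))                     # position of the largest gap
        top, bottom = order[:j + 1], order[j + 1:]   # induced partition of the arms
        return float(gaps[j]), top.tolist(), bottom.tolist()

    # Toy usage with known means: the largest adjacent gap (about 0.4)
    # separates arms {0, 1} from arms {2, 3, 4}.
    true_means = [0.9, 0.8, 0.4, 0.35, 0.3]
    gap, top, bottom = max_adjacent_gap(true_means)
    print(gap, top, bottom)

In the bandit setting the true means are unknown, so the elimination and UCB-style algorithms described above sample arms adaptively to resolve which adjacent gap is largest.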
Recommended Citation
S. Katariya et al., "MaxGap Bandit: Adaptive Algorithms for Approximate Ranking," Advances in Neural Information Processing Systems, vol. 32, Neural Information Processing Systems Foundation, Dec 2019.
Meeting Name
33rd Conference on Neural Information Processing Systems, NeurIPS 2019 (2019: Dec. 8-14, Vancouver, Canada)
Department(s)
Computer Science
International Standard Serial Number (ISSN)
1049-5258
Document Type
Article - Conference proceedings
Document Version
Final Version
File Type
text
Language(s)
English
Rights
© 2019 Neural Information Processing Systems Foundation, All rights reserved.
Publication Date
14 Dec 2019
Comments
This work was partially supported by AFOSR/AFRL grants FA8750-17-2-0262 and FA9550-18-1-0166.