Restless Bandits, Linear Programming Relaxations, and a Primal-Dual Index Heuristic
Open Access
- 1 February 2000
- journal article
- Published by Institute for Operations Research and the Management Sciences (INFORMS) in Operations Research
- Vol. 48 (1), 80-90
- https://doi.org/10.1287/opre.48.1.80.12444
Abstract
We develop a mathematical programming approach for the classical PSPACE-hard restless bandit problem in stochastic optimization. We introduce a hierarchy of N (where N is the number of bandits) increasingly stronger linear programming relaxations, the last of which is exact and corresponds to the (exponential size) formulation of the problem as a Markov decision chain, while the other relaxations provide bounds and are efficiently computed. We also propose a priority-index heuristic scheduling policy from the solution to the firstorder relaxation, where the indices are defined in terms of optimal dual variables. In this way we propose a policy and a suboptimality guarantee. We report results of computational experiments that suggest that the proposed heuristic policy is nearly optimal. Moreover, the second-order relaxation is found to provide strong bounds on the optimal value.Keywords
All Related Versions
This publication has 12 references indexed in Scilit:
- The Complexity of Optimal Queuing Network ControlMathematics of Operations Research, 1999
- Scheduling a Make-To-Stock Queue: Index Policies and Hedging PointsOperations Research, 1996
- Conservation Laws, Extended Polymatroids and Multiarmed Bandit Problems; A Polyhedral Approach to Indexable SystemsMathematics of Operations Research, 1996
- Optimization of Multiclass Queueing Networks: Polyhedral and Nonlinear Characterizations of Achievable PerformanceThe Annals of Applied Probability, 1994
- Multiclass Queueing Systems: Polymatroidal Structure and Optimal Scheduling ControlOperations Research, 1992
- Addendum to ‘On an index policy for restless bandits'Advances in Applied Probability, 1991
- Cones of Matrices and Set-Functions and 0–1 OptimizationSIAM Journal on Optimization, 1991
- On an index policy for restless banditsJournal of Applied Probability, 1990
- Characterization and Optimization of Achievable Performance in General Queueing SystemsOperations Research, 1988
- A Characterization of Waiting Time Performance Realizable by Single-Server QueuesOperations Research, 1980