Restless Bandits, Linear Programming Relaxations, and a Primal-Dual Index Heuristic

Abstract
We develop a mathematical programming approach for the classical PSPACE-hard restless bandit problem in stochastic optimization. We introduce a hierarchy of N (where N is the number of bandits) successively stronger linear programming relaxations, the last of which is exact and corresponds to the (exponential-size) formulation of the problem as a Markov decision chain, while the other relaxations are efficiently computable and provide bounds on the optimal value. We also propose a priority-index heuristic scheduling policy derived from the solution of the first-order relaxation, where the indices are defined in terms of optimal dual variables. In this way we obtain both a heuristic policy and a suboptimality guarantee. We report results of computational experiments suggesting that the proposed heuristic policy is nearly optimal. Moreover, the second-order relaxation is found to provide strong bounds on the optimal value.
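For illustration, the first-order relaxation can be written as a polynomial-size linear program over discounted state-action occupation measures. The following is a minimal sketch under the discounted criterion; the notation is ours for exposition and is not taken from the paper's statement: $x_i^a(n)$ denotes the total discounted time project $n$ spends in state $i \in S_n$ under action $a \in \{0,1\}$ (passive/active), $R_i^a(n)$ the corresponding reward, $p_{ij}^a(n)$ the transition probabilities, $\alpha_j^n$ the initial distribution, $\beta \in (0,1)$ the discount factor, and $M$ the number of projects that may be active at each epoch.

\[
\begin{aligned}
\max \quad & \sum_{n=1}^{N} \sum_{i \in S_n} \sum_{a \in \{0,1\}} R_i^a(n)\, x_i^a(n) \\
\text{s.t.} \quad & \sum_{a \in \{0,1\}} x_j^a(n) \;=\; \alpha_j^n + \beta \sum_{i \in S_n} \sum_{a \in \{0,1\}} p_{ij}^a(n)\, x_i^a(n), \qquad j \in S_n,\; n = 1,\dots,N, \\
& \sum_{n=1}^{N} \sum_{i \in S_n} x_i^1(n) \;=\; \frac{M}{1-\beta}, \\
& x_i^a(n) \;\ge\; 0, \qquad i \in S_n,\; a \in \{0,1\},\; n = 1,\dots,N.
\end{aligned}
\]

The single coupling constraint replaces the requirement that exactly $M$ projects be active at every epoch with its discounted aggregate; the optimal dual variables of this program give rise to reduced costs, in terms of which the priority indices of the heuristic are defined.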