Should I stay or should I go? How the human brain manages the trade-off between exploitation and exploration
- 29 March 2007
- journal article
- conference paper
- Published by The Royal Society in Philosophical Transactions Of The Royal Society B-Biological Sciences
- Vol. 362 (1481), 933-942
- https://doi.org/10.1098/rstb.2007.2098
Abstract
Many large and small decisions we make in our daily lives - which ice cream to choose, what research projects to pursue, which partner to marry - require an exploration of alternatives before committing to and exploiting the benefits of a particular choice. Furthermore, many decisions require re-evaluation, and further exploration of alternatives, in the face of changing needs or circumstances. That is, often our decisions depend on a higher level choice: whether to exploit well known but possibly suboptimal alternatives or to explore risky but potentially more profitable ones. How adaptive agents choose between exploitation and exploration remains an important and open question that has received relatively limited attention in the behavioural and brain sciences. The choice could depend on a number of factors, including the familiarity of the environment, how quickly the environment is likely to change and the relative value of exploiting known sources of reward versus the cost of reducing uncertainty through exploration. There is no known generally optimal solution to the exploration versus exploitation problem, and a solution to the general case may indeed not be possible. However, there have been formal analyses of the optimal policy under constrained circumstances. There have also been specific suggestions of how humans and animals may respond to this problem under particular experimental conditions as well as proposals about the brain mechanisms involved. Here, we provide a brief review of this work, discuss how exploration and exploitation may be mediated in the brain and highlight some promising future directions for research.