Cortical substrates for exploratory decisions in humans

Abstract
Humans are remarkably curious, and that is useful in helping us to learn about new environments and possibilities. But curiosity killed the cat, they say, and it also carries with it substantial potential risks and costs for us. Statisticians, engineers and economists have long considered ways of balancing the costs and benefits of exploration. Tests involving a gambling task and an fMRI brain scanner now show that humans appear to obey similar principles when considering their options. The players had to balance the desire to select the richest option based on accumulated experience against the desire to choose a less familiar option that might have a larger payoff. The frontopolar cortex, a brain area known to be involved in cognitive control, was preferentially active during exploratory decisions. The results suggest a neurobiological account of human exploration and point to a new area for behavioural and neural investigations. Use of a gambling task and a functional magnetic resonance imaging (fMRI) scanner shows that human subjects' choices can be characterized by a computationally well regarded strategy for addressing the explore/exploit dilemma. Decision making in an uncertain environment poses a conflict between the opposing demands of gathering and exploiting information. In a classic illustration of this ‘exploration–exploitation’ dilemma1, a gambler choosing between multiple slot machines balances the desire to select what seems, on the basis of accumulated experience, the richest option, against the desire to choose a less familiar option that might turn out more advantageous (and thereby provide information for improving future decisions). Far from representing idle curiosity, such exploration is often critical for organisms to discover how best to harvest resources such as food and water. In appetitive choice, substantial experimental evidence, underpinned by computational reinforcement learning2 (RL) theory, indicates that a dopaminergic3,4, striatal5,6,7,8,9 and medial prefrontal network mediates learning to exploit. In contrast, although exploration has been well studied from both theoretical1 and ethological10 perspectives, its neural substrates are much less clear. Here we show, in a gambling task, that human subjects' choices can be characterized by a computationally well-regarded strategy for addressing the explore/exploit dilemma. Furthermore, using this characterization to classify decisions as exploratory or exploitative, we employ functional magnetic resonance imaging to show that the frontopolar cortex and intraparietal sulcus are preferentially active during exploratory decisions. In contrast, regions of striatum and ventromedial prefrontal cortex exhibit activity characteristic of an involvement in value-based exploitative decision making. The results suggest a model of action selection under uncertainty that involves switching between exploratory and exploitative behavioural modes, and provide a computationally precise characterization of the contribution of key decision-related brain systems to each of these functions.