Model-based control, by contrast, dynamically computes optimal actions by forward planning, a process that is computationally demanding but allows for flexible, outcome-specific behavioral repertoires (Daw et al., 2005, Dayan and Niv, 2008 and Otto et al., 2013; but see Gershman et al., 2012). In this study, our goal was to manipulate the relative balance between these two systems in human participants. We focused on the dorsolateral prefrontal
cortex (dlPFC) as a substrate for model-based processes based on previous evidence for its role in the construction and use of associative models (Gläscher et al., 2010, Wunderlich et al., 2012a and Xue et al., 2012) and the coding of hypothetical outcomes (Abe and Lee, 2011). Work on nonhuman primates also implicates the dlPFC as a site for convergence of reward and contextual information (Lee and Seo, 2007), while lesions of the rat prelimbic region (which some argue is equivalent to primate dlPFC [Fuster, 2008; but see Preuss, 1995 and Uylings et al., 2003]) abolish flexible decision making (Killcross and Coutureau, 2003). Thus, although the literature suggests a crucial role for this region in model-based control, causal evidence to support this hypothesis has so far been lacking. Here we used a transient lesion model, as engendered by theta burst transcranial
magnetic stimulation (TBS), to provide evidence for a necessary role of dlPFC in model-based behavior.
We recruited 25 human participants (mean age [SD]: 24.2 [4.0] years; 15 females) to perform a task in which behavior can be explained by a mixture of model-free and model-based control (Daw et al., 2011). All participants were tested in three separate sessions (3 to 16 days apart) after MRI-guided TBS to the right dlPFC, left dlPFC, or vertex. TBS is known to inhibit cortical excitability for at least 20 min (Huang et al., 2005). We thus predicted that participants would show reduced model-based control after dlPFC compared to vertex TBS. Given existing evidence of functional asymmetries between left and right dlPFC, e.g., in reciprocal fairness (Knoch et al., 2006) and working memory (Mull and Seyal, 2001), we also hypothesized that the effects of TBS would differ between these sites. We used a task that enables quantification of model-based and model-free control over choices (Daw et al., 2011). Participants were required to make two choices on every trial to arrive at a rewarded or a nonrewarded outcome (Figure 1A). Choices at the first stage of the task probabilistically determine which pair of options becomes available to the participant at the second stage. Crucially, for each first-stage action, one pair of second-stage options is more likely to occur (a “common transition”).
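The transition structure described above can be illustrated with a minimal simulation sketch. This is not the task code used in the study: the transition probability `P_COMMON`, the mapping `COMMON_PAIR`, and the function names are hypothetical choices made for illustration, since the text here states only that one second-stage pair is more likely than the other for each first-stage action.

```python
import random

# Hypothetical value: the text does not state the common-transition probability.
P_COMMON = 0.7

# Assumed mapping: first-stage action -> index of its more likely second-stage pair.
COMMON_PAIR = {0: 0, 1: 1}


def second_stage_pair(first_action, rng):
    """Sample the second-stage pair reached after a first-stage action.

    Returns (pair_index, is_common), where is_common is True when the
    more likely ("common") transition occurred.
    """
    common = COMMON_PAIR[first_action]
    if rng.random() < P_COMMON:
        return common, True
    return 1 - common, False  # the less likely transition


def common_fraction(first_action, n_trials, seed=0):
    """Empirical fraction of common transitions over n_trials samples."""
    rng = random.Random(seed)
    hits = sum(second_stage_pair(first_action, rng)[1] for _ in range(n_trials))
    return hits / n_trials
```

Under these assumptions, `common_fraction(0, 20000)` should come out close to `P_COMMON`, which is the sense in which one pair of second-stage options is "more likely to occur" for a given first-stage action.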