Chunk #7 — Results — A Novel Two-Step Task with Transition Probability Reversals

Source: The Anterior Cingulate Cortex Predicts Future States to Mediate Model-Based Action Selection.
Embedded: yes

Text

Transition probability reversals have two desirable consequences. First, if the reward probabilities and the action-state transition probabilities change independently over time, state prediction and reward prediction can be dissociated in neural activity. Second, reversals in the transition probabilities prevent subjects from using habit-like strategies that consist of mappings from the second-step state in which rewards have recently been obtained to specific actions at the first step. Such strategies can in principle generate behavior that looks very similar to model-based control, despite not using forward planning (Akam et al., 2015). Transition probability reversals break the long-run predictive relationship between where rewards are obtained and which first-step action is correct, preventing these strategies while still permitting model-based RL. We directly compared versions of the task with fixed and changing action-state transition probabilities (Figure S1) and found that subjects' behavior was radically different in the two versions, suggesting that the two versions recruit different strategies.
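The logic of the second point can be illustrated with a toy simulation. The sketch below is not the authors' task code or model. It is a minimal caricature in which every number is an assumption chosen for illustration: 0.8/0.2 transition and reward probabilities, reward reversals every 200 trials, transition reversals every 300 trials, and a learning rate of 0.1. The hypothetical "habit" agent maps the second-step state with the higher recent reward rate onto a fixed first-step action. The hypothetical "model_based" agent additionally tracks the current action-state mapping and uses it for a one-step lookahead.

```python
import random

def simulate(agent, reversing_transitions, n_trials=24000, seed=0):
    """Return the fraction of rewarded trials in one simulated session.

    All task parameters here are illustrative assumptions, not values
    taken from the paper.
    """
    rng = random.Random(seed)
    alpha = 0.1            # learning rate (assumed)
    value = [0.5, 0.5]     # recent reward rate at each second-step state
    p_same = 0.5           # estimated P(action a leads to state a)
    good_state = 0         # second-step state that currently pays off at 0.8
    same = True            # True: action a commonly leads to state a
    n_rewards = 0
    for t in range(1, n_trials + 1):
        if agent == "habit":
            # Fixed mapping from the recently rewarded state to an action.
            action = 0 if value[0] >= value[1] else 1
        elif agent == "model_based":
            # One-step lookahead using the learned transition estimate.
            q = [p_same * value[a] + (1 - p_same) * value[1 - a]
                 for a in (0, 1)]
            action = 0 if q[0] >= q[1] else 1
        else:
            raise ValueError(f"unknown agent: {agent}")

        # First step: common (0.8) or rare (0.2) transition.
        target = action if same else 1 - action
        state = target if rng.random() < 0.8 else 1 - target

        # Second step: probabilistic reward.
        p_reward = 0.8 if state == good_state else 0.2
        reward = 1 if rng.random() < p_reward else 0
        n_rewards += reward

        # Both agents learn state values; only the model-based agent
        # makes use of the transition estimate when choosing.
        value[state] += alpha * (reward - value[state])
        p_same += alpha * (float(state == action) - p_same)

        # Reward and transition reversals follow separate schedules.
        if t % 200 == 0:
            good_state = 1 - good_state
        if reversing_transitions and t % 300 == 0:
            same = not same
    return n_rewards / n_trials

if __name__ == "__main__":
    for agent in ("habit", "model_based"):
        for reversing in (False, True):
            rate = simulate(agent, reversing)
            print(f"{agent:12s} reversing={reversing!s:5s} reward rate={rate:.3f}")
```

Under these assumptions, the habit agent should collect rewards at a rate well above chance when transitions are fixed, and at a rate near 0.5 when transitions reverse, because its fixed mapping is correct in only half of the blocks. The model-based agent should stay above chance in both versions. This only illustrates the argument. It makes no claim about how subjects or the authors' fitted models actually behave.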