The decoding analysis predicted location in the task's state-action space from neuronal activity. Ten locations were defined by the time relative to trial events and by the trial's choice, second step and outcome (Figure 4D). The analysis used trial-aligned neuronal activity in three time windows of 250 ms duration: pre-choice (starting 300 ms before the subject's choice), post-choice (centered between choice and outcome) and post-outcome (starting 100 ms after trial outcome). Activity was averaged across each time window to give a single value for each neuron on a given visit to a location. To combine activity from multiple sessions, 10 visits to each location were selected at random from each session, and activity vectors from like locations were concatenated across sessions to give 10 population activity vectors for each location. Location was predicted from neuronal activity using multinomial logistic regression with L2 regularisation. Decoding accuracy was assessed using stratified k-fold cross-validation with 10 folds, such that each training dataset contained 9 visits to each location and each test dataset contained the remaining visit to each location. The analysis included the 9 sessions from 3 animals with at