defined as >6 mean average deviations from the age-appropriate linear fit were removed. The total number of probes remaining was 30,176. After background correction on the linear scale, log2 ratios (sample/reference) were normalized across mean log2 fluorescent intensities using loess correction (ref. 21). Missing data in the gene expression data matrix were imputed at this stage to enable both SVA and PCA. After normalization, log2 ratios were further adjusted using SVA (ref. 22) to reduce the impact of known and unknown sources of systematic noise on gene expression measures. Two surrogate variables were generated and used to adjust log2 ratios in all subsequent linear models. Correlations between the naively created surrogate variables and known sources of noise were evident: SV1 and RIN, r = 0.37, P = 4.7 × 10^−10; SV2 and ArrayBatch, r = 0.73, P < 2 × 10^−16. All of these microarray data analyses were conducted using custom code and tools from the Bioconductor project (http://www.bioconductor.org/) in the R statistical language (http://www.r-project.org/). Validation of microarray expression patterns was performed by Taqman qPCR (Supplementary Table 8).
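As an illustration of two quantities used above, the following minimal sketch computes a log2 ratio (sample/reference) and the Pearson correlation between a surrogate variable and a known covariate such as RIN. It is written in plain Python rather than the R/Bioconductor code used for the analyses, and it is not the authors' implementation: the function names `log2_ratio` and `pearson_r` and all numeric values are hypothetical, chosen only to make the example runnable.

```python
import math


def log2_ratio(sample, reference):
    """Log2 ratio (sample/reference) of two background-corrected intensities."""
    return math.log2(sample / reference)


def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    # Sums of cross-products and squared deviations about the means.
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


# Hypothetical toy values (NOT data from the study): one surrogate
# variable and one known noise covariate across five samples.
sv1 = [0.1, -0.3, 0.5, 0.2, -0.4]
rin = [7.9, 6.5, 8.8, 8.1, 6.9]

r = pearson_r(sv1, rin)
```

In practice the surrogate variables themselves would come from an SVA implementation, and the P value for each correlation would be obtained from a standard correlation test. Neither step is reproduced here.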