We excluded from subsequent analysis genes that were likely not expressed in LCLs (i.e. any probe with a detection P-value > 0.01 in all samples). Median probe intensities per gene were then used as the expression estimates for 14 367 genes. We further excluded 3085 genes because they were not associated with an hg18 RefSeq record or were not autosomal, which resulted in a final data set of expression estimates for 11 282 genes (Supplementary data, Table S1). In addition to these processed gene expression estimates, raw and normalized expression data are available from the NCBI GEO database under accession number GSE30697 (http://www.ncbi.nlm.nih.gov/geo/query/acc.cgi?acc=GSE30697).
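The filtering and summarization steps described above can be illustrated with a minimal Python sketch. It is not the code used in the study. The function names, the toy probe and gene identifiers, and the `gene_chrom` lookup (standing in for an hg18 RefSeq annotation) are all hypothetical. The sketch also assumes that filtering is applied per probe and that the median is taken across a gene's retained probes separately within each sample.

```python
from collections import defaultdict
from statistics import median

AUTOSOMES = {f"chr{i}" for i in range(1, 23)}

def expressed_probes(detection_p, threshold=0.01):
    """Keep probes with detection P <= threshold in at least one sample,
    i.e. drop probes whose detection P exceeds the threshold in all samples."""
    return {probe for probe, pvals in detection_p.items()
            if any(p <= threshold for p in pvals)}

def gene_medians(intensity, probe_to_gene, keep):
    """Per-sample median intensity across the retained probes of each gene."""
    by_gene = defaultdict(list)
    for probe in sorted(keep):
        by_gene[probe_to_gene[probe]].append(intensity[probe])
    return {gene: [median(column) for column in zip(*rows)]
            for gene, rows in by_gene.items()}

def annotated_autosomal(estimates, gene_chrom):
    """Drop genes with no annotation record (assumed lookup) or not on an autosome."""
    return {gene: values for gene, values in estimates.items()
            if gene_chrom.get(gene) in AUTOSOMES}

# Toy data: two samples, five probes (all values are made up for illustration).
detection_p = {
    "p1": [0.001, 0.20],
    "p2": [0.50, 0.30],    # P > 0.01 in all samples, so this probe is dropped
    "p3": [0.005, 0.002],
    "p4": [0.0, 0.0],
    "p5": [0.001, 0.001],
}
intensity = {
    "p1": [8.0, 7.0],
    "p2": [6.0, 6.1],
    "p3": [10.0, 9.0],
    "p4": [12.0, 12.5],
    "p5": [9.0, 9.5],
}
probe_to_gene = {"p1": "GENE_A", "p2": "GENE_B", "p3": "GENE_A",
                 "p4": "GENE_X", "p5": "GENE_U"}
# GENE_U has no annotation record; GENE_X is not autosomal.
gene_chrom = {"GENE_A": "chr1", "GENE_B": "chr2", "GENE_X": "chrX"}

keep = expressed_probes(detection_p)
estimates = gene_medians(intensity, probe_to_gene, keep)
final = annotated_autosomal(estimates, gene_chrom)
print(final)
```

In this toy example only GENE_A survives all three steps. The actual gene counts reported in the text depend on the array annotation and cannot be reproduced from this sketch.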