A total of 145 COPD subjects had sputum samples with gene expression data available; two arrays failed quality control (QC). Of the remaining 143 subjects, 131 had corresponding genomewide SNP data and phenotype data. The Affymetrix HG-U133 Plus 2 array contains 54,675 probe sets. After filtering out 17,420 probe sets that were not annotated with a specific gene symbol in the hgu133plus2.db R/Bioconductor database or that mapped to the X or Y chromosomes, 37,255 probe sets remained. Microarray preprocessing used the robust multiarray average method with quantile normalization [28], as implemented in Bioconductor. Microarray QC was performed using the Bioconductor package affyQCReport; QC results are available in Data S1 and Figure S1. QC of the genomewide SNP data in ECLIPSE has been reported previously [11]. In addition, SNPs with a minor allele frequency <0.05 in the 131 ECLIPSE cases were excluded.
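The minor allele frequency exclusion and the probe set count can be illustrated with a minimal sketch. It is not the pipeline used in the study: it assumes genotypes coded as 0, 1, or 2 copies of a designated allele (with `None` for a missing call), and the function names and toy SNPs are hypothetical.

```python
# Illustrative sketch only: toy genotypes, not ECLIPSE data.
# Assumption: genotypes are coded as 0/1/2 copies of a designated allele,
# with None marking a missing call.


def minor_allele_frequency(genotypes):
    """Return the minor allele frequency (MAF) for one SNP."""
    called = [g for g in genotypes if g is not None]
    if not called:
        return 0.0  # no usable calls; treated here as monomorphic
    freq = sum(called) / (2 * len(called))  # each subject carries two alleles
    return min(freq, 1.0 - freq)  # the minor allele is the rarer of the two


def filter_snps_by_maf(snp_genotypes, threshold=0.05):
    """Keep SNPs with MAF >= threshold, i.e. exclude those with MAF < threshold."""
    return {
        snp: calls
        for snp, calls in snp_genotypes.items()
        if minor_allele_frequency(calls) >= threshold
    }


# Probe set counts as reported in the text.
TOTAL_PROBE_SETS = 54_675
FILTERED_PROBE_SETS = 17_420
remaining_probe_sets = TOTAL_PROBE_SETS - FILTERED_PROBE_SETS

# Hypothetical SNPs for demonstration.
toy_snps = {
    "snp_common": [0, 1, 2, 1, 0, 1, 0, 0, 1, 2],  # allele frequency 8/20
    "snp_rare": [0] * 19 + [1],                    # allele frequency 1/40
    "snp_flipped": [2] * 8 + [1, 1],               # designated allele is the major one
}
kept_snps = filter_snps_by_maf(toy_snps)

print(remaining_probe_sets)
print(sorted(kept_snps))
```

Taking the smaller of the allele frequency and its complement makes the filter independent of which allele happens to be counted, which is why `snp_flipped` is retained while `snp_rare` is excluded.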