Because the genome-wide CNV distribution differs markedly between AA and NHW subjects, we estimated all CNV components separately for the two racial groups and performed stratified genome-wide CNV association analyses within each group. Our main focus in this analysis was to test for association between polymorphic CNVs and quantitative phenotypes describing smoking behaviors. Calling of CNV components is described in Younkin et al. and Begum et al. [25, 26]. Because the pack-years data were heavily right-skewed (S2 Fig), we used log10-transformed values; we also grouped the average number of cigarettes smoked per day into 7 categories and treated the resulting variable as a continuous phenotype. Linear regression models were used to assess the relationship between each quantitative smoking phenotype and polymorphic CNVs while adjusting for covariates such as gender and age. The average admixture score [27], which summarizes the average European ancestry among AA subjects, was also considered in the analysis of all AA subjects.
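The regression step described above can be illustrated with a minimal sketch. It is not the software used in this study, which the text does not specify. It assumes a single CNV component coded as a numeric value per subject, gender coded 0/1, and ordinary least squares fitted through the normal equations. All function and variable names (`fit_ols`, `cnv_association`, `pack_years`, and so on) are hypothetical. In a stratified analysis, the function would be called once per racial group; if the admixture score is used as a covariate for AA subjects, it would enter as one more column of the design matrix.

```python
import math


def solve_linear_system(a, b):
    """Solve a @ x = b by Gauss-Jordan elimination with partial pivoting."""
    n = len(b)
    # Augmented matrix [a | b], copied so the inputs are not modified.
    m = [list(row) + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        # Pick the row with the largest entry in this column as the pivot.
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        if abs(m[pivot][col]) < 1e-12:
            raise ValueError("design matrix is rank deficient")
        m[col], m[pivot] = m[pivot], m[col]
        # Eliminate this column from every other row.
        for r in range(n):
            if r != col:
                factor = m[r][col] / m[col][col]
                m[r] = [v - factor * p for v, p in zip(m[r], m[col])]
    return [m[i][n] / m[i][i] for i in range(n)]


def fit_ols(rows, y):
    """Ordinary least squares via the normal equations (X'X) beta = X'y."""
    k = len(rows[0])
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(k)] for i in range(k)]
    xty = [sum(r[i] * yi for r, yi in zip(rows, y)) for i in range(k)]
    return solve_linear_system(xtx, xty)


def cnv_association(pack_years, cnv, age, sex):
    """Regress log10(pack-years) on one CNV component, adjusting for age and sex.

    Assumption: all pack-year values are strictly positive, so the log10
    transform is defined. Returns a dict of estimated coefficients.
    """
    y = [math.log10(p) for p in pack_years]
    rows = [[1.0, c, a, s] for c, a, s in zip(cnv, age, sex)]
    names = ["intercept", "cnv", "age", "sex"]
    return dict(zip(names, fit_ols(rows, y)))
```

The coefficient returned under `"cnv"` is the estimated change in log10 pack-years per unit of the CNV component, holding age and gender fixed. A full analysis would also need standard errors and p-values for each CNV, which this sketch omits.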