Chunk #5 — Results

Source: Extremely low-coverage sequencing and imputation increases power for genome-wide association studies.
Embedded: yes

Text

robustness of our approach (Supplementary Note, Supplementary Figure 7). To assess the power to detect true positives in real data, in addition to simulated phenotypes, we also carried out a case-control study comparing HIV-1 controllers (61) and progressors (23) from the IHCS data set (Online Methods). The higher off-target coverage (0.5x) in the IHCS data yields an average r² of 0.82 relative to the genotype calls at the 398,098 SNPs assayed by arrays in the IHCS data (ref. 14). We observed a similar genomic control factor λGC (ref. 21) for imputed data (1.05) and for typed data (1.04) (Supplementary Note, Supplementary Figure 4). We specifically examined SNPs previously reported to be significantly associated with HIV-1 controller status (ref. 14) and observed association statistics and effect sizes similar to those obtained from SNP arrays, both for the entire set of 47 previously associated SNPs (Supplementary Note, Supplementary Table 5) and for the subset of 10 SNPs with nominal P < 0.05 in the SNP array data (Table 1). The association statistics obtained using extremely low-coverage sequencing did not exhibit the 9% drop that might have been expected given the imputation accuracy of r² = 0.91 at these SNPs (the ratio of the average −log10 P values for imputed versus typed data was 1.04), but this
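
The three summary quantities used in this comparison (the genomic control factor λGC, the drop in association statistic expected at a given imputation r², and the ratio of average −log10 P values between imputed and typed data) can be illustrated with a minimal Python sketch. This is not the authors' code. The function names are hypothetical, P values are assumed to be two-sided and to correspond to 1-degree-of-freedom chi-square tests, and the expected drop assumes that the association statistic scales linearly with r², which is consistent with the 9% figure quoted for r² = 0.91.

```python
from math import log10
from statistics import NormalDist, median

_STD_NORMAL = NormalDist()

# Median of a chi-square distribution with 1 degree of freedom (about 0.4549).
CHI2_1DF_MEDIAN = _STD_NORMAL.inv_cdf(0.75) ** 2


def p_to_chi2(p):
    """Convert a two-sided P value (0 < p <= 1) to a 1-df chi-square statistic."""
    z = _STD_NORMAL.inv_cdf(p / 2.0)  # lower-tail quantile; the sign is lost on squaring
    return z * z


def lambda_gc(p_values):
    """Genomic control factor: median observed chi-square over the null median."""
    return median([p_to_chi2(p) for p in p_values]) / CHI2_1DF_MEDIAN


def expected_relative_drop(r2):
    """Expected fractional loss in association statistic.

    Assumption (not taken from the paper's methods): the statistic scales
    linearly with the imputation r2.
    """
    return 1.0 - r2


def mean_neg_log10_ratio(p_imputed, p_typed):
    """Ratio of the average -log10 P value at imputed versus typed data."""
    mean_imputed = sum(-log10(p) for p in p_imputed) / len(p_imputed)
    mean_typed = sum(-log10(p) for p in p_typed) / len(p_typed)
    return mean_imputed / mean_typed
```

Under these assumptions, `expected_relative_drop(0.91)` gives roughly 0.09, the 9% drop mentioned above, and a value of `mean_neg_log10_ratio` near 1 would indicate no loss of signal in the imputed data relative to the typed data.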