
Chunk #0 — SNP genotyping

Source
Integrating common and rare genetic variation in diverse human populations.

Text

Genotype data were obtained with the Affymetrix Human SNP array 6.0 (interrogating 1,852,600 genomic sites) and the Illumina Human1M-single beadchip (1,199,187 genomic sites), initially applied to 1,486 and 1,284 samples, respectively. Following genotype calling [6,11] and initial filtering of low-quality and incomplete data, 909,622 variant SNPs from 1,326 samples (Affymetrix) and 1,055,111 sites from 1,211 samples (Illumina) remained. Data from the two platforms were merged; genotype concordance was 99.5% (across 335,014 overlapping SNPs) at a call rate of 99.8%. Further filters were applied to this merged data set on the basis of population-specific call rates, deviation from Hardy–Weinberg equilibrium and the expected Mendelian inheritance patterns (Supplementary Methods). The consensus genotype set contains 1,440,616 SNPs that are polymorphic in 1,184 individuals from 11 populations. Analysis shows a small but statistically significant bias against rare (MAF = 0.05–0.5%) allele calls (observed in both platforms), consistent with previous reports (Supplementary Information). The data were then phased (Supplementary Information).
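
The quality-control quantities named above (call rate, cross-platform concordance and deviation from Hardy–Weinberg equilibrium) can be illustrated with a minimal Python sketch. This is not the authors' pipeline, whose details are in the Supplementary Methods. The genotype coding (0, 1, 2 alternate-allele counts, `None` for a missing call), the function names, and the choice of a Pearson chi-square statistic for the Hardy–Weinberg check are all assumptions made for illustration; the study may well have used a different test and thresholds.

```python
from typing import Optional, Sequence

# Assumed coding: count of the alternate allele (0, 1, 2); None = missing call.
Genotype = Optional[int]


def call_rate(calls: Sequence[Genotype]) -> float:
    """Fraction of genotypes that were called (non-missing)."""
    if not calls:
        raise ValueError("no genotypes supplied")
    return sum(g is not None for g in calls) / len(calls)


def concordance(a: Sequence[Genotype], b: Sequence[Genotype]) -> float:
    """Fraction of identical calls among genotypes called on both platforms.

    The two sequences must be aligned: position i refers to the same
    sample/SNP pair on each platform.
    """
    if len(a) != len(b):
        raise ValueError("genotype vectors must be aligned and equal in length")
    both = [(x, y) for x, y in zip(a, b) if x is not None and y is not None]
    if not both:
        raise ValueError("no genotypes called on both platforms")
    return sum(x == y for x, y in both) / len(both)


def hwe_chi_square(n_aa: int, n_ab: int, n_bb: int) -> float:
    """Pearson chi-square statistic (1 d.f.) against Hardy-Weinberg proportions.

    Hypothetical helper: expected counts are n*p^2, 2*n*p*q and n*q^2,
    with p estimated from the observed genotype counts.
    """
    n = n_aa + n_ab + n_bb
    if n == 0:
        raise ValueError("no genotype counts supplied")
    p = (2 * n_aa + n_ab) / (2 * n)
    q = 1.0 - p
    expected = (n * p * p, 2 * n * p * q, n * q * q)
    observed = (n_aa, n_ab, n_bb)
    # Skip classes with zero expectation (monomorphic sites).
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected) if e > 0)


if __name__ == "__main__":
    # Toy data, not taken from the study.
    platform_a = [0, 1, 2, None, 1, 0]
    platform_b = [0, 1, 1, 2, None, 0]
    print(call_rate(platform_a))                # 5 of 6 genotypes called
    print(concordance(platform_a, platform_b))  # 3 of 4 shared calls agree
    print(hwe_chi_square(25, 50, 25))           # counts in exact HWE proportions
```

In practice a per-SNP filter would compare these values against thresholds applied within each population, since pooling populations with different allele frequencies can by itself produce apparent Hardy–Weinberg deviation.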