
Chunk #23 — RESULTS — DATA ANALYSIS OF POPRES

Source
Discovering genetic ancestry using spectral graph theory.

Text

Of the 4,079 samples with labeled ethnicity and genotypes that passed the POPRES quality control (their QC2 procedure), we removed 38 close relatives and 280 samples with greater than 5% missing genotypes. The remaining 3,761 samples included 346 African Americans, 49 E. Asians, 329 Asian-Indians, 82 Mexicans, and 2,955 Europeans. Of the 457,297 SNPs passing POPRES QC2, we removed those with missingness greater than 5%, with minor allele frequency less than 0.01, or with Hardy-Weinberg P-value less than 0.005 (the latter two calculations were performed using samples of European ancestry only). From the remaining 326,129 SNPs, we reduced the list to 48,529 SNPs that were separated by at least 10 kb and had missingness less than 1%. From these we chose 21,743 tag SNPs using H-clust, set to pick tag SNPs with squared correlation less than 0.04 [Rinaldo et al., 2008].
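
The per-SNP filters described above can be illustrated with a minimal Python sketch. It is not the authors' pipeline. Several details are assumptions made only for illustration: genotypes are coded as 0, 1, or 2 copies of one allele with `None` for a missing call; the Hardy-Weinberg test is a 1-degree-of-freedom chi-square test (the text does not say which test was used); and the 10 kb spacing is applied by a greedy left-to-right pass within one chromosome. All function names are hypothetical, and the H-clust tag SNP step is not reproduced.

```python
import math

MISSING = None  # assumed encoding of a missing genotype call


def missing_rate(genotypes):
    """Fraction of missing calls among genotypes coded 0, 1, 2, or None."""
    return sum(g is MISSING for g in genotypes) / len(genotypes)


def minor_allele_freq(genotypes):
    """Minor allele frequency computed from the non-missing genotypes."""
    called = [g for g in genotypes if g is not MISSING]
    p = sum(called) / (2 * len(called))
    return min(p, 1 - p)


def hwe_pvalue(genotypes):
    """Hardy-Weinberg P-value from a 1-df chi-square test (assumed test)."""
    called = [g for g in genotypes if g is not MISSING]
    n = len(called)
    p = sum(called) / (2 * n)
    q = 1 - p
    if p == 0 or q == 0:
        return 1.0  # monomorphic SNP: no departure to test
    expected = [n * q * q, 2 * n * p * q, n * p * p]
    observed = [called.count(k) for k in (0, 1, 2)]
    chi2 = sum((o - e) ** 2 / e for o, e in zip(observed, expected))
    # Survival function of a chi-square variable with 1 degree of freedom.
    return math.erfc(math.sqrt(chi2 / 2))


def passes_qc(all_genotypes, european_genotypes):
    """Apply the three thresholds; MAF and HWE use European samples only."""
    return (missing_rate(all_genotypes) <= 0.05
            and minor_allele_freq(european_genotypes) >= 0.01
            and hwe_pvalue(european_genotypes) >= 0.005)


def thin_by_distance(positions, min_gap=10_000):
    """Greedily keep SNP positions (bp) at least min_gap apart."""
    kept = []
    for pos in sorted(positions):
        if not kept or pos - kept[-1] >= min_gap:
            kept.append(pos)
    return kept


# Consistency check on the sample counts reported in the text.
groups = {"African American": 346, "E. Asian": 49, "Asian-Indian": 329,
          "Mexican": 82, "European": 2955}
assert 4079 - 38 - 280 == sum(groups.values()) == 3761
```

For example, `thin_by_distance([0, 5000, 10000, 12000, 25000])` keeps the positions 0, 10000, and 25000. A greedy pass is only one way to enforce a minimum spacing, and a different selection rule would keep a different subset.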