
Chunk #43: Methods, Statistical analyses.

Source
Analysis and application of European genetic substructure using 300 K SNP information.

Text

PCA can be sensitive to quality control issues that can give rise to spurious clustering [43]. Several factors in our design and execution mitigate this possibility. First, the individuals from different ancestry groups, and the Irish group in particular, were randomly distributed over plates. Furthermore, approximately half of the individuals were genotyped separately. Comparison of the first and second runs showed very similar results with respect to the distribution of self-identified ancestry groups. As indicated in the methods, we used both genotype completeness and a loose Hardy-Weinberg (HW) exclusion threshold (p < 0.00001) to exclude SNPs with genotype artifacts. Finally, as shown in Table S3, PC1 and PC2 from three independent random sets correlated very strongly with those from the 300K set (r² all above 0.93).
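
The last check above, comparing principal components from random sets against those from the full set, can be illustrated with a minimal sketch. It assumes the random sets are random subsets of SNPs and uses simulated genotypes for two hypothetical groups. The function names (`simulate_two_groups`, `pc_scores`, `subset_r2`) and all parameter values are illustrative assumptions, not the software or settings used in the paper.

```python
import numpy as np

def simulate_two_groups(rng, n_per_group=100, n_snps=2000, drift=0.2):
    """Simulate 0/1/2 genotypes for two groups with diverged allele frequencies.

    Purely illustrative data; not derived from the study.
    """
    p1 = rng.uniform(0.1, 0.9, n_snps)
    p2 = np.clip(p1 + rng.normal(0.0, drift, n_snps), 0.05, 0.95)
    g1 = rng.binomial(2, p1, size=(n_per_group, n_snps))
    g2 = rng.binomial(2, p2, size=(n_per_group, n_snps))
    return np.vstack([g1, g2])

def pc_scores(genotypes, n_components=2):
    """Return top PC scores for an individuals x SNPs genotype matrix."""
    g = genotypes.astype(float)
    # Drop monomorphic SNPs, then centre each remaining SNP column.
    g = g[:, g.std(axis=0) > 0]
    g -= g.mean(axis=0)
    # SVD of the centred matrix: the columns of u * s are the PC scores.
    u, s, _ = np.linalg.svd(g, full_matrices=False)
    return u[:, :n_components] * s[:n_components]

def subset_r2(genotypes, subset_size, rng, n_components=2):
    """r^2 between PCs from a random SNP subset and PCs from all SNPs."""
    full = pc_scores(genotypes, n_components)
    cols = rng.choice(genotypes.shape[1], size=subset_size, replace=False)
    sub = pc_scores(genotypes[:, cols], n_components)
    # The sign of a PC is arbitrary, so compare squared correlations.
    return [np.corrcoef(full[:, k], sub[:, k])[0, 1] ** 2
            for k in range(n_components)]

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    genotypes = simulate_two_groups(rng)
    print(subset_r2(genotypes, subset_size=500, rng=rng))
```

With only two simulated groups, PC1 is the only component that carries population structure, so only its r² is expected to be high in this toy setting. In real data with more than one axis of structure, as in the paper, PC2 can be compared in the same way.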