Chunk #41: Methods, Statistical analyses

Source
Analysis and application of European genetic substructure using 300 K SNP information.

Text

PCA, PCA control for association testing, and determination of the genomic control parameter (λgc) [14] were performed using the EIGENSTRAT statistical package [11]. Several tests were used to assess the significance of the PCA results. As suggested previously [7], both analysis of variance (ANOVA) and a split-half reliability test adjusted by the Spearman-Brown formula [41] were performed. The ANOVA examined the statistical significance of differences in PC scores among groups of individuals pre-assigned on the basis of self-identification. The split-half reliability test determines whether independent (non-overlapping) SNP sets give the same or different results. Unlike ANOVA, this test does not rely on correct prior knowledge of group assignment. Under the null hypothesis of no population structure, the PCA results from the two SNP sets are expected to be uncorrelated. The split-half reliability test was performed three times using 1) alternate chromosomes, 2) alternate half chromosomes, and 3) half-genome SNP sets. These sets were chosen to eliminate any dependency between the two half datasets in each test that could arise from linkage disequilibrium. Thus, correlation of the independent SNP sets should be due to
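
As a rough illustration of the split-half reliability test with the Spearman-Brown adjustment, the sketch below correlates the scores on one principal component obtained from two non-overlapping SNP sets and steps the half-set correlation r up to 2r / (1 + r). It is not the authors' code. The function names, the example scores, and the use of the absolute correlation (a guard against the arbitrary sign of a principal component) are assumptions made for illustration only.

```python
import math


def pearson_r(x, y):
    """Pearson correlation between two equal-length sequences."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two equal-length sequences with at least 2 values")
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for constant input")
    return cov / math.sqrt(var_x * var_y)


def spearman_brown(r_half):
    """Spearman-Brown step-up of a half-set correlation: 2r / (1 + r)."""
    return 2.0 * r_half / (1.0 + r_half)


def split_half_reliability(pc_half_a, pc_half_b):
    """Reliability of one PC from scores computed on two disjoint SNP sets.

    Assumption: the absolute correlation is used, because the sign of a
    principal component is arbitrary and may flip between the two halves.
    """
    r_half = abs(pearson_r(pc_half_a, pc_half_b))
    return spearman_brown(r_half)


# Hypothetical PC1 scores for the same five individuals, computed separately
# from two non-overlapping SNP sets (for example, alternate chromosomes).
pc1_set_a = [-0.9, -0.4, 0.1, 0.5, 0.8]
pc1_set_b = [-0.8, -0.5, 0.0, 0.6, 0.7]
reliability = split_half_reliability(pc1_set_a, pc1_set_b)
```

A reliability near 0 is what the null hypothesis of no population structure predicts, while a value near 1 indicates that both SNP sets recover the same axis of variation.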