There are various forms of PCA for genetic data, depending on how the covariance matrix is calculated. Here we follow the EIGENSTRAT method [15], [16]. We first standardize each genotype coding as $(g_{i,m} - 2p_m)/\sqrt{p_m(1-p_m)}$, with $p_m$ being the allele frequency for the $m$th marker. In the following discussion, we always use the standardized genotype, still denote it by $g_{i,m}$, and organize all genotypes into the matrix $G = (g_{i,m})_{N \times M}$. We obtain the sample covariance matrix $\Sigma = G'G/N$ for the $M$ markers. For the PCA, we find the $L$ largest eigenvalues of $\Sigma$ (say $L = 3$) and their corresponding normalized eigenvectors $v_1, v_2, \dots, v_L$, with $v_l = (v_{l,1}, v_{l,2}, \dots, v_{l,M})'$, $1 \le l \le L$. For the $i$th subject with genotypes (standardized as above) $g_i = (g_{i,1}, g_{i,2}, \dots, g_{i,M})'$, the $l$th principal component (PC) is given by $u_{l,i} = v_l' g_i$, $1 \le l \le L$. Thus, $v_l$ defines the PC direction with the $l$th largest "genetic" variation, and $u_{l,i}$ is the $i$th subject's projected position on this axis. Following Patterson et al., the significance level of the genetic variation along a given PC direction is evaluated by the Tracy-Widom test [16].
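The computation described above can be sketched in a few lines of NumPy. This is a minimal illustration under stated assumptions, not the EIGENSTRAT software itself: the raw genotypes are taken to be 0/1/2 allele counts, the allele frequency of each marker is estimated as half the column mean, the covariance matrix is scaled by 1/N, and monomorphic markers are dropped to avoid division by zero. The function names `standardize_genotypes` and `top_pcs` and the simulated two-population data are hypothetical, and the Tracy-Widom test is not implemented here.

```python
import numpy as np


def standardize_genotypes(raw):
    """Center and scale an N x M matrix of 0/1/2 allele counts, marker by marker."""
    raw = np.asarray(raw, dtype=float)
    p = raw.mean(axis=0) / 2.0            # assumed estimate of allele frequency p_m
    keep = (p > 0.0) & (p < 1.0)          # drop monomorphic markers (zero variance)
    p = p[keep]
    G = (raw[:, keep] - 2.0 * p) / np.sqrt(p * (1.0 - p))
    return G, p


def top_pcs(G, L=3):
    """Return the L largest eigenvalues, their eigenvectors, and the subjects' PCs."""
    N = G.shape[0]
    sigma = G.T @ G / N                   # M x M sample covariance (1/N scaling assumed)
    eigvals, eigvecs = np.linalg.eigh(sigma)   # eigh returns eigenvalues in ascending order
    order = np.argsort(eigvals)[::-1][:L]      # indices of the L largest eigenvalues
    lam = eigvals[order]
    V = eigvecs[:, order]                 # column l is the unit-norm direction v_l
    U = G @ V                             # U[i, l] = v_l' g_i, the l-th PC of subject i
    return lam, V, U


# Simulated example: two subpopulations of 50 subjects with slightly different
# allele frequencies at 200 markers (illustrative data only).
rng = np.random.default_rng(0)
freqs_a = rng.uniform(0.1, 0.9, size=200)
freqs_b = np.clip(freqs_a + rng.normal(0.0, 0.1, size=200), 0.05, 0.95)
raw = np.vstack([
    rng.binomial(2, freqs_a, size=(50, 200)),
    rng.binomial(2, freqs_b, size=(50, 200)),
])

G, p = standardize_genotypes(raw)
lam, V, U = top_pcs(G, L=3)
```

Because the columns of `G` are centered, the sample variance of the subjects' scores on each PC equals the corresponding eigenvalue. Forming the M × M matrix explicitly is only practical for a modest number of markers; with many markers, a singular value decomposition of `G` gives the same directions without building `sigma`.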