Chunk #5 — METHODS — LOW-DIMENSIONAL EMBEDDING BY EIGEN-ANALYSIS

Source
Discovering genetic ancestry using spectral graph theory.
Embedded
yes

Text

Record the minor allele count $y_{ij}$ for the $i$th subject and the $j$th SNP in a matrix $Y$, for $i = 1, \dots, N$ and $j = 1, \dots, L$. Center and scale the allele counts by subtracting the column mean and dividing by the standard deviation of each entry, then update $Y$ with the result. The $i$th row of $Y$ holds the genetic information for subject $i$, $y_i = (y_{i1}, \dots, y_{iL})$. The similarity between individuals $i$ and $j$ (here $j$ indexes a second subject, not a SNP) is computed using the scaled inner product $h_{ij} = L^{-1} \langle y_i, y_j \rangle$. The corresponding inner product matrix $H = L^{-1} Y Y^{t}$, the one associated with standard PCA, is positive semi-definite. Traditionally, in PC maps, the ancestry vectors are estimated by embedding the data in a low-dimensional space using the eigenvectors of $H$, the kernel of the PC map. To find the embedding, compute the eigenvectors $(u_1, u_2, \dots, u_N)$ and eigenvalues $(\lambda_1 \ge \lambda_2 \ge \cdots \ge \lambda_N)$ of $H$. Typically, the eigenvectors with large eigenvalues reveal the important dimensions of ancestry.
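
As a minimal sketch of the steps described above, the NumPy code below standardizes an allele-count matrix, forms the inner product matrix H, and extracts its leading eigenvectors. The function name `pc_embedding`, its `n_components` parameter, and the simulated genotypes are illustrative assumptions, not taken from the source. The sketch also assumes that "standard deviation" means the per-SNP (column) sample standard deviation and that SNPs with zero variance are dropped, neither of which the text specifies.

```python
import numpy as np


def pc_embedding(counts, n_components=2):
    """Return the top eigenvalues and eigenvectors of H = Y Y^t / L.

    counts: (N subjects x L SNPs) array of minor allele counts.
    Hypothetical helper for illustration only.
    """
    Y = np.asarray(counts, dtype=float)

    # Assumption: drop SNPs with no variation, which cannot be scaled.
    sd = Y.std(axis=0)
    Y = Y[:, sd > 0]

    # Center each column (SNP) and scale it by its standard deviation.
    Y = (Y - Y.mean(axis=0)) / Y.std(axis=0)
    n_snps = Y.shape[1]

    # N x N inner product matrix; symmetric and positive semi-definite.
    H = Y @ Y.T / n_snps

    # eigh is meant for symmetric matrices and returns eigenvalues in
    # ascending order, so reverse to get lambda_1 >= lambda_2 >= ...
    eigvals, eigvecs = np.linalg.eigh(H)
    order = np.argsort(eigvals)[::-1]
    eigvals, eigvecs = eigvals[order], eigvecs[:, order]

    return eigvals[:n_components], eigvecs[:, :n_components]


# Usage on simulated data: two groups of 10 subjects whose SNPs have
# different (made-up) minor allele frequencies.
rng = np.random.default_rng(0)
freqs_a = rng.uniform(0.1, 0.5, size=200)
freqs_b = rng.uniform(0.1, 0.5, size=200)
counts = np.vstack([
    rng.binomial(2, freqs_a, size=(10, 200)),
    rng.binomial(2, freqs_b, size=(10, 200)),
])
vals, vecs = pc_embedding(counts, n_components=2)
# Row i of vecs holds the coordinates of subject i in the 2-D embedding.
```

Because every column is centered, the constant vector lies in the null space of H, so its smallest eigenvalue is zero up to rounding; only the leading eigenvectors carry ancestry information.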