Chunk #5 — Methods — Existing methods — Principal component analysis (PCA)

Source: An efficient genome-wide association test for multivariate phenotypes based on the Fisher combination function.
Embedded: yes

Text

The principal component analysis (PCA) [9, 10] is another classical statistical method for multivariate analysis. The primary objective of PCA is to find a small set of linear combinations of the original variables (i.e. principal components) that account for the most variability in the original variables. Thus, it can be employed to reduce the dimension of multivariate phenotypes. The PCA has been used in gene-based studies to increase the power of statistical testing [11, 12]. Furthermore, He et al. [13] has used PCA to combine four highly correlated obesity phenotypes for a whole genome linkage scan. When the phenotypes are highly correlated, the first principal component (corresponding to the largest eigenvalue) contains most information about the phenotype data. Thus, testing the association between a SNP and the first principal component is a commonly adopted approach to effectively change the multivariate setting associated with multiple phenotypes in GWAS to the univariate setting (e.g. Zhang et al. [14] and Karasik et al. [15]). In this study, we investigate the statistical properties of this approach.