Chunk #11 — Statistical analysis

Source: Using Patterns of Genetic Association to Elucidate Shared Genetic Etiologies Across Psychiatric Disorders.
Embedded: yes

Text

To examine the consistency of the LPA results, we compared them with results from subsets of the data and from alternative methods. First, we fit LPA models with an increasing number of classes to a pruned subset of SNPs. Because SNPs in high linkage disequilibrium (LD) are correlated, they may distort the classification by producing classes of correlated SNPs that are driven by LD structure. To exclude this possibility, we created a subset by pruning SNPs with pairwise r² > 0.1 and fit LPA models to this pruned subset. Pruning was done in PLINK version 1.07 (Purcell et al. 2007). Second, we applied k-means clustering to the −log10(p-values) from the GWAS. Unlike LPA, k-means clustering is a non-parametric clustering method that does not assume within-class normality. K-means clustering was applied with varying k to examine how the number of clusters and the class profiles from a non-parametric model compare with the LPA results. Results of k-means clustering depend more on the choice of distance metric and on the scaling of the variables than LPA results do (Magidson and Vermunt
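
The k-means comparison described above can be illustrated with a minimal sketch. It is not the authors' implementation: the source does not name the software, the distance metric, or the range of k used. The sketch assumes Euclidean distance and Lloyd's algorithm, and the SNP p-values, the number of disorders (three), and the helper names (`neg_log10`, `kmeans`) are all hypothetical.

```python
import math
import random


def neg_log10(p_values):
    """Transform one SNP's p-values (one per disorder) to -log10(p)."""
    return [-math.log10(p) for p in p_values]


def squared_distance(a, b):
    """Squared Euclidean distance (assumed metric; not stated in the source)."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def kmeans(points, k, n_iter=100, seed=0):
    """Lloyd's algorithm. Returns (centroids, labels, within-cluster sum of squares)."""
    rng = random.Random(seed)
    centroids = [list(p) for p in rng.sample(points, k)]
    labels = None
    for _ in range(n_iter):
        # Assignment step: each SNP goes to its nearest centroid.
        new_labels = [
            min(range(k), key=lambda j: squared_distance(p, centroids[j]))
            for p in points
        ]
        # Update step: each centroid becomes the mean profile of its members.
        for j in range(k):
            members = [p for p, lab in zip(points, new_labels) if lab == j]
            if members:  # an empty cluster keeps its previous centroid
                centroids[j] = [sum(col) / len(members) for col in zip(*members)]
        if new_labels == labels:  # assignments stable: converged
            break
        labels = new_labels
    wss = sum(squared_distance(p, centroids[lab]) for p, lab in zip(points, labels))
    return centroids, labels, wss


# Hypothetical data: 150 null SNPs and 50 SNPs associated with all 3 disorders.
rng = random.Random(1)
snps = [neg_log10([rng.uniform(1e-8, 1.0) for _ in range(3)]) for _ in range(150)]
snps += [neg_log10([rng.uniform(1e-8, 1e-3) for _ in range(3)]) for _ in range(50)]

# Fit k-means for several values of k and report the fit of each solution.
for k in range(1, 5):
    centroids, labels, wss = kmeans(snps, k)
    print(k, round(wss, 2))
```

Because the centroids are plain means under Euclidean distance, rescaling any one column of −log10(p-values) can change the cluster assignments, which is the scale dependence noted in the text.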