Chunk #12 — Statistical analysis

Source: Using Patterns of Genetic Association to Elucidate Shared Genetic Etiologies Across Psychiatric Disorders.
Embedded: yes

Text

to examine how the number of clusters and the class profiles from a non-parametric model compare with the LPA results. The results of k-means clustering depend more heavily than those of LPA on the choice of distance metric and its scale (Magidson and Vermunt 2002). For our analysis, we used −log10(p), the class indicator used in LPA, as the distance metric for k-means clustering, because our goal was to examine whether the results from different methods using the same indicators converge. We used the HPCLUS and FASTCLUS procedures in SAS/STAT™ software version 9.4 (SAS Institute, Cary, NC) for k-means clustering. Lastly, we fit LPA models with different numbers of classes to random split-half subsets of our sample to examine whether the results from the full sample replicate at smaller sample sizes. To create the split-half samples, we randomly divided the full sample, without replacement, into two subsets of equal size. We then ran a separate GWAS in each subset and fit LPAs using the resulting p-values.
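
The two sensitivity checks described above can be illustrated with a minimal Python sketch. It is not the SAS code used in the study: the function names, the toy p-values, and the plain Lloyd's-algorithm k-means with squared Euclidean distance are all assumptions made for illustration, and the GWAS and LPA steps are omitted.

```python
import math
import random


def neg_log10_p(p_values):
    """Transform p-values to -log10(p), the indicator scale named in the text."""
    return [-math.log10(p) for p in p_values]


def split_half(sample_ids, seed=0):
    """Randomly divide IDs, without replacement, into two equal-sized halves.

    If the number of IDs is odd, one ID is left out so the halves match in size.
    """
    rng = random.Random(seed)
    shuffled = list(sample_ids)
    rng.shuffle(shuffled)
    half = len(shuffled) // 2
    return shuffled[:half], shuffled[half:2 * half]


def kmeans(points, k, n_iter=100, seed=0):
    """Plain Lloyd's k-means using squared Euclidean distance (illustrative only)."""
    rng = random.Random(seed)
    centers = [list(p) for p in rng.sample(points, k)]
    labels = [0] * len(points)
    for _ in range(n_iter):
        # Assignment step: attach each point to its nearest center.
        new_labels = [
            min(range(k),
                key=lambda j: sum((x - c) ** 2 for x, c in zip(p, centers[j])))
            for p in points
        ]
        # Update step: move each center to the mean of its members.
        for j in range(k):
            members = [p for p, lab in zip(points, new_labels) if lab == j]
            if members:
                centers[j] = [sum(col) / len(members) for col in zip(*members)]
        if new_labels == labels:
            break  # assignments no longer change
        labels = new_labels
    return labels, centers


# Hypothetical p-values: rows are SNPs, columns are two disorders.
p_matrix = [
    [1e-8, 1e-7], [1e-9, 1e-8], [1e-8, 1e-9],  # strong signal in both
    [0.5, 0.4], [0.3, 0.6], [0.8, 0.7],        # little signal in either
]
points = [neg_log10_p(row) for row in p_matrix]
labels, centers = kmeans(points, k=2)

# Hypothetical participant IDs split into two non-overlapping halves.
half_a, half_b = split_half(range(10))
```

In this toy example the first three SNPs fall in one cluster and the last three in the other. Which cluster receives label 0 depends on the random initialization, so cluster numbering should not be compared directly across runs or across methods.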