Chunk #13 — Materials and methods — Unsupervised learning to determine POU clustering

Source: Genome-wide association study of problematic opioid prescription use in 132,113 23andMe research participants of European ancestry.
Embedded: yes

Text

Previous studies have shown that consumption phenotypes and misuse/dependence phenotypes have distinct genetic architectures. To explore whether POU clustered more closely with consumption or with misuse/dependence phenotypes, we used a data-driven unsupervised machine learning method, agglomerative hierarchical clustering analysis (HCA) [33]. HCA forms clusters iteratively by creating groups and successively joining or splitting them according to a prespecified algorithm [33]. Agglomerative nesting (AGNES) is the bottom-up variant: each trait starts as its own cluster, and the most similar clusters are merged step by step until a single hierarchy remains. We chose agglomerative clustering because it allowed us to compare different linkage algorithms and select the one that maximized the dissimilarity on each branch; Ward’s minimum variance method performed best. All models were fit in R using the cluster package [33].
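The bottom-up procedure and the comparison of linkage algorithms can be illustrated with a minimal sketch. This is not the authors' R code: it is a plain-Python illustration under stated assumptions. The function name `agglomerate`, the toy dissimilarity matrix, and the variable names are hypothetical. The comparison criterion used here is the agglomerative coefficient, which the `agnes` function of the R cluster package reports; the text does not name the criterion that was actually used, so treating it as the agglomerative coefficient is an assumption.

```python
from itertools import combinations


def agglomerate(dist, method="ward"):
    """Bottom-up clustering of a symmetric dissimilarity matrix.

    Returns (merges, ac). merges is a list of (members_a, members_b, height);
    ac is the agglomerative coefficient (values closer to 1 suggest stronger
    clustering structure).
    """
    n = len(dist)
    members = {i: (i,) for i in range(n)}  # active cluster id -> item indices
    d = {(i, j): float(dist[i][j]) for i, j in combinations(range(n), 2)}
    first_height = [None] * n  # height at which each item is first merged
    merges = []
    next_id = n
    while len(members) > 1:
        # Pick the closest pair of active clusters.
        (a, b), h = min(d.items(), key=lambda kv: kv[1])
        na, nb = len(members[a]), len(members[b])
        for i in members[a] + members[b]:
            if first_height[i] is None:
                first_height[i] = h
        # Lance-Williams update: dissimilarity from the merged cluster
        # to every other active cluster k.
        updated = {}
        for k in members:
            if k in (a, b):
                continue
            nk = len(members[k])
            dka = d[tuple(sorted((k, a)))]
            dkb = d[tuple(sorted((k, b)))]
            if method == "single":
                v = min(dka, dkb)
            elif method == "complete":
                v = max(dka, dkb)
            elif method == "average":
                v = (na * dka + nb * dkb) / (na + nb)
            elif method == "ward":
                sq = ((na + nk) * dka ** 2 + (nb + nk) * dkb ** 2
                      - nk * h ** 2) / (na + nb + nk)
                v = max(0.0, sq) ** 0.5  # guard against non-Euclidean input
            else:
                raise ValueError(f"unknown method: {method}")
            updated[k] = v
        merges.append((members[a], members[b], h))
        d = {p: v for p, v in d.items() if a not in p and b not in p}
        merged = members.pop(a) + members.pop(b)
        for k, v in updated.items():
            d[(k, next_id)] = v  # k is always smaller than next_id
        members[next_id] = merged
        next_id += 1
    final_height = merges[-1][2]
    ac = sum(1 - h / final_height for h in first_height) / n
    return merges, ac


# Hypothetical toy data: four "traits" placed on a line, forming two pairs.
coords = [0.0, 1.0, 5.0, 6.0]
dist = [[abs(x - y) for y in coords] for x in coords]

# Compare linkage algorithms by their agglomerative coefficient.
acs = {m: agglomerate(dist, m)[1]
       for m in ("single", "complete", "average", "ward")}
best = max(acs, key=acs.get)
merges, _ = agglomerate(dist, best)
print(acs, best)
```

In this toy example every linkage recovers the same two pairs, and the methods differ only in how strongly the final merge height separates them. On real data the linkages can produce different trees, which is why comparing them before choosing one is useful.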