Chunk #11 — MATERIALS AND METHODS — Fuzzy Clustering Analysis

Source: Autosomal linkage scan for loci predisposing to comorbid dependence on multiple substances.
Embedded: yes

Text

Rather than analyzing each categorical phenotype alone or applying typical “hard” clustering methods, we used fuzzy clustering analysis [Kaufman and Rousseeuw, 1990] to reduce the phenotype dimensions and to obtain more detailed information about the data structure. The five SD traits (AD, CD, CanD, OD and ND) for the 1,758 study subjects were the input data for the fuzzy clustering analysis. In a typical (non-fuzzy) partition, each subject is assigned to only one cluster. This approach is referred to as “hard clustering” because a clear-cut decision is made on each subject’s cluster membership. In contrast, fuzzy clustering allows for some ambiguity in cluster membership. In this approach, coefficients of cluster membership (probabilities ranging from 0 to 1) are derived for each subject. Given the substantial comorbidity among SDs and their possible shared genetic liability, fuzzy clustering analysis has the advantage of providing correlated data structures that reflect this inherent comorbidity and/or underlying shared genetic liability, and it provides a way to reduce and mine the data. The membership coefficients can be considered more homogeneous quantitative traits for