Chunk #25 — Methods — Analytical methods — SMA models

Source: The CHRNA5/A3/B4 gene cluster and tobacco, alcohol, cannabis, inhalants and other substance use initiation: replication and new findings using mixture analyses.
Embedded: yes

Text

All three models are also fitted as mixture models with two classes. One class is fitted as in the single-class models above, whereas the other class groups subjects who are “long-term survivors”, that is, subjects who never use. To fit the 2-class models, we created so-called “training variables”, which aid in distinguishing between the classes. Training variables are binary variables that indicate whether class membership is known or must be estimated. Subjects who have initiated use cannot belong to the non-user, long-term survivor class. For these subjects the training variable is zero, reflecting that class membership does not need to be estimated. All other subjects might still engage in substance use in the future, so it must be estimated from the data whether they are more likely to belong to the user class or the non-user class. For these subjects the training variable equals one, indicating that class membership needs to be estimated. Potential future users and true long-term survivors are distinguished during model fitting based
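The coding of the training variable described in this section can be illustrated with a minimal sketch. It is not the authors' code: the function names, the class labels `USER_CLASS` and `NON_USER_CLASS`, and the example subjects are hypothetical and chosen only for illustration. The sketch assumes each subject has a single indicator of whether use has been initiated.

```python
from typing import List, Tuple

# Hypothetical class labels, used for illustration only.
USER_CLASS = 0       # class fitted as in the single-class models
NON_USER_CLASS = 1   # "long-term survivor" class (never use)


def training_variable(initiated: bool) -> int:
    """Return 0 when class membership is known (subject has initiated use),
    and 1 when membership must be estimated from the data."""
    return 0 if initiated else 1


def allowed_classes(initiated: bool) -> Tuple[int, ...]:
    """Classes a subject may belong to during model fitting."""
    if initiated:
        # Initiators cannot belong to the non-user class.
        return (USER_CLASS,)
    # Non-initiators may be future users or true long-term survivors.
    return (USER_CLASS, NON_USER_CLASS)


# Made-up example subjects: (identifier, has initiated use).
subjects: List[Tuple[str, bool]] = [
    ("s1", True),
    ("s2", False),
    ("s3", False),
    ("s4", True),
]

training = [training_variable(init) for _, init in subjects]
candidates = [allowed_classes(init) for _, init in subjects]

for (sid, _), t, c in zip(subjects, training, candidates):
    print(sid, "training =", t, "allowed classes =", c)
```

In this sketch only the subjects with a training variable of one contribute to the estimation of class membership; subjects with a training variable of zero are fixed in the user class.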