Chunk #16 — Some general methodological considerations — Sample selection

Source: Does nature have joints worth carving? A discussion of taxometrics, model-based clustering and latent variable mixture modeling.

Text

Several aspects of sample selection can affect the desired inference. As might be expected, sample size is directly related to the statistical power to detect subgroups within the population (Lubke & Neale, 2006, 2008). Larger samples provide a smoother approximation of the population than smaller samples, and sampling fluctuation (i.e., the variability that would be observed if multiple samples were drawn) has less impact. Sampling fluctuation in smaller samples can lead to erroneous decisions in favor of either categories or dimensions. Figure 3 shows two draws from the same two-cluster distribution. Figure 3a shows a sample of n = 100, which can look like a single cluster (the red dots are subjects from the second cluster but are not really distinguishable from the black dots). Figure 3b shows a draw from the same distribution with n = 300, which provides a clearer picture. Fitting a mixture model to the data in Figure 3b permits two classes to be distinguished. In addition to sample size, it is necessary to consider how the sample is drawn. Any oversampling of subgroups that exists within the population
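
The effect of sample size on sampling fluctuation can be illustrated with a small simulation. The sketch below is not taken from the paper: the mixture parameters (`MINORITY_PROPORTION`, `CLUSTER_MEANS`, `CLUSTER_SD`), the use of a single observed variable, and the helper names are all assumptions chosen for illustration. Sample skewness serves only as a simple stand-in for a feature of the data that hints at a second, smaller cluster. The code draws repeated samples of size n = 100 and n = 300 from the same two-cluster population and reports how much the statistic varies from draw to draw.

```python
import random
import statistics

# Hypothetical two-cluster population (illustrative values, not from the paper).
MINORITY_PROPORTION = 0.3   # assumed share of subjects in the second cluster
CLUSTER_MEANS = (0.0, 2.0)  # assumed cluster means on one observed variable
CLUSTER_SD = 1.0            # assumed common within-cluster standard deviation


def draw_sample(n, rng):
    """Draw n observations from the two-cluster mixture."""
    sample = []
    for _ in range(n):
        # Assign the subject to a cluster, then draw its observed score.
        cluster = 1 if rng.random() < MINORITY_PROPORTION else 0
        sample.append(rng.gauss(CLUSTER_MEANS[cluster], CLUSTER_SD))
    return sample


def skewness(values):
    """Sample skewness: the mean of the cubed standardized scores."""
    mean = statistics.fmean(values)
    sd = statistics.pstdev(values)
    return sum(((v - mean) / sd) ** 3 for v in values) / len(values)


def sampling_fluctuation(n, replications=500, seed=1):
    """Standard deviation of the sample skewness across repeated draws of size n."""
    rng = random.Random(seed)
    estimates = [skewness(draw_sample(n, rng)) for _ in range(replications)]
    return statistics.stdev(estimates)


if __name__ == "__main__":
    for n in (100, 300):
        spread = sampling_fluctuation(n)
        print(f"n={n}: SD of skewness across draws = {spread:.3f}")
```

Under these assumptions the spread of the statistic across draws should be noticeably smaller at n = 300 than at n = 100, which is the sense in which a larger sample gives a smoother approximation of the population. The same loop could wrap a mixture-model fit in place of `skewness` to examine how often two classes are recovered at each sample size.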