The relationship between our PCA and a cluster-based analysis is much closer than is at first apparent. Consider a model of genetic structure with K populations that have recently diverged from a common ancestral population, and fix a marker and a variant allele. Suppose that the frequency of the variant allele is P in the ancestral population and p_i in population i. Conditional on P, assume that p = (p_1, p_2, …, p_K) has mean (P, P, …, P) and covariance matrix P(1 − P)B for some matrix B. Much past work in genetics uses this paradigm, with variations on the distribution of B and on the detailed distribution of p conditional on P; examples include Nicholson et al. [18], STRUCTURE [9] in “correlated frequency mode,” and the “F-model” of [10]. This setup is nearly inevitable when considering allele frequencies in populations that have diverged only slightly. For the case of a diagonal matrix B, it is shown in [18, p. 700] that the diagonal term B_ii can be interpreted as the “time on the diffusion time-scale” (inversely proportional to effective population size) during which population i has undergone genetic drift.
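The moment conditions above can be checked numerically. The following is a minimal sketch, not the method of any of the cited papers: it assumes a diagonal B with entries B_ii = F_i and, purely for illustration, draws each p_i from a beta distribution (often called the Balding-Nichols form) whose mean is P and whose variance is F_i P(1 − P). The function name `sample_population_frequencies` and the values chosen for P and F are hypothetical.

```python
import numpy as np


def sample_population_frequencies(P, F, n_draws, rng):
    """Draw allele frequencies for K populations given ancestral frequency P.

    Assumption (not from the text): B is diagonal with B_ii = F[i], and
    p_i | P ~ Beta(P(1-F_i)/F_i, (1-P)(1-F_i)/F_i), which has mean P and
    variance F_i * P * (1 - P).
    """
    F = np.asarray(F, dtype=float)
    a = P * (1.0 - F) / F
    b = (1.0 - P) * (1.0 - F) / F
    # Each row is one independent draw of the vector (p_1, ..., p_K).
    return rng.beta(a, b, size=(n_draws, F.size))


rng = np.random.default_rng(0)
P = 0.3                    # hypothetical ancestral allele frequency
F = [0.01, 0.05, 0.10]     # hypothetical drift parameters, one per population

draws = sample_population_frequencies(P, F, 200_000, rng)

emp_mean = draws.mean(axis=0)            # should be close to (P, P, P)
emp_cov = np.cov(draws, rowvar=False)    # should be close to P(1-P)B
target_cov = P * (1 - P) * np.diag(F)

print("empirical mean:", emp_mean)
print("empirical covariance:\n", emp_cov)
print("target covariance P(1-P)B:\n", target_cov)
```

With a diagonal B the populations drift independently given P, so the off-diagonal entries of the empirical covariance should be near zero; a non-diagonal B would require a different sampling scheme from the one sketched here.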