Chunk #43 — Results — Applying Eigenanalysis to Datasets with Linked Markers

Source: Population structure and eigenanalysis.
Embedded: yes

Text

We want to distinguish carefully here between n, the actual number of columns of our data array, and n′, a theoretical statistical parameter that models the approximate Wishart distribution of the square matrix X. We originally fit σ and n′ by maximum likelihood. The likelihood, as a function of the two parameters, has two sufficient statistics, $\sum_i \lambda_i$ and $\sum_i \log \lambda_i$, where the $\lambda_i$ are the nonzero eigenvalues of X. Maximum likelihood did not always work well in our genetic applications, probably because $\sum_i \log \lambda_i$ is sensitive to small eigenvalues, while we are interested only in the large ones. We recommend a moments estimator:

$$n' = \frac{(m+1)\left(\sum_i \lambda_i\right)^2}{(m-1)\sum_i \lambda_i^2 - \left(\sum_i \lambda_i\right)^2}$$

which is justified later. Note that n′ is invariant to scaling of the matrix M, as it should be. We estimate σ by: