Chunk #43 — Results — Applying Eigenanalysis to Datasets with Linked Markers

Source
Population structure and eigenanalysis.

Text

We want to carefully distinguish here between n, the actual number of columns of our data array, and n′, a theoretical statistical parameter modeling the approximate Wishart distribution of the square matrix X. We originally fit σ and n′ by maximum likelihood. The likelihood, as a function of the two parameters, has two sufficient statistics, which are $\sum_i \lambda_i$ and $\sum_i \log \lambda_i$. Maximum likelihood did not always work well in our genetic applications, probably because $\sum_i \log \lambda_i$ is sensitive to small eigenvalues, while we are only interested in large ones. We recommend a moments estimator:

$$n' = \frac{(m+1)\left(\sum_i \lambda_i\right)^2}{(m-1)\sum_i \lambda_i^2 - \left(\sum_i \lambda_i\right)^2}$$

which is justified later. Here the $\lambda_i$ are the m − 1 nonzero eigenvalues of the m × m matrix X. Note that n′ is invariant to scaling of the matrix M, as it should be. We estimate σ by:

$$\hat{\sigma}^2 = \frac{\sum_i \lambda_i}{(m-1)\,n'}$$
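As a rough illustration, both moment estimates can be computed directly from the nonzero eigenvalues. This is a minimal sketch, not the authors' code. It assumes the estimator forms n′ = (m+1)(Σλ)² / ((m−1)Σλ² − (Σλ)²) and σ² = Σλ / ((m−1)n′), with m the number of rows of the data array and the sums taken over the m − 1 nonzero eigenvalues. The function name `moments_estimates` and its arguments are hypothetical.

```python
def moments_estimates(eigenvalues, m):
    """Return (n_eff, sigma2) from the nonzero eigenvalues of an m x m matrix.

    Assumed forms (see the lead-in): the sums run over the m - 1 nonzero
    eigenvalues, n_eff plays the role of n', and sigma2 estimates sigma^2.
    """
    s1 = sum(eigenvalues)                    # sum of eigenvalues
    s2 = sum(x * x for x in eigenvalues)     # sum of squared eigenvalues
    denom = (m - 1) * s2 - s1 * s1
    if denom <= 0:
        # Happens only when all eigenvalues are (numerically) equal.
        raise ValueError("eigenvalues show no spread; n' is unbounded")
    n_eff = (m + 1) * s1 * s1 / denom
    sigma2 = s1 / ((m - 1) * n_eff)
    return n_eff, sigma2


# Toy eigenvalues, for illustration only (not real data).
n_a, s_a = moments_estimates([3.0, 2.0, 1.0], m=4)
# Rescaling every eigenvalue by 10 leaves n' unchanged and scales sigma^2 by 10.
n_b, s_b = moments_estimates([30.0, 20.0, 10.0], m=4)
print(n_a, n_b)
```

Because the numerator and denominator of n′ are both quadratic in the eigenvalues, any rescaling of the matrix cancels. The two calls above show this: they return the same n′, while σ² absorbs the change of scale.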