Chunk #42 — Methods — Principal component analysis

Source
The UK Biobank resource with deep phenotyping and genomic data.

Text

We computed principal components using an algorithm (fastPCA; ref. 38) that performs well on datasets with hundreds of thousands of samples by approximating only the top n principal components, those that explain the most variation, where n is specified in advance. We computed the top 40 principal components using a set of 407,219 unrelated, high-quality samples and 147,604 high-quality markers pruned to minimise linkage disequilibrium (ref. 39). We then computed the corresponding principal component loadings and projected all samples onto the principal components, thus forming a set of principal component scores for all samples in the cohort (Supplementary Information).
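
The fit-then-project procedure described above can be sketched in a few lines of NumPy. This is a minimal illustration under stated assumptions, not the fastPCA implementation: the approximation step is replaced by a generic randomized range finder with power iterations, and the function names (`fit_top_pcs`, `project_scores`), the oversampling and iteration defaults, the mean-centring-only preprocessing, and the simulated genotype matrix are all hypothetical choices made for the example.

```python
import numpy as np


def fit_top_pcs(X_ref, n_components, n_oversamples=10, n_iter=4, seed=0):
    """Approximate the top PCs of a reference matrix (samples x markers).

    Stand-in for fastPCA: a generic randomized range finder with power
    iterations. Names and defaults here are illustrative assumptions.
    """
    rng = np.random.default_rng(seed)
    mean = X_ref.mean(axis=0)
    Xc = X_ref - mean  # centre each marker using the reference samples only

    # Work in a subspace slightly larger than the number of PCs requested.
    k = min(n_components + n_oversamples, min(Xc.shape))
    omega = rng.standard_normal((Xc.shape[0], k))
    Q, _ = np.linalg.qr(Xc.T @ omega)  # markers x k, orthonormal columns

    # Power iterations sharpen the subspace towards the leading directions.
    for _ in range(n_iter):
        Q, _ = np.linalg.qr(Xc @ Q)    # samples x k
        Q, _ = np.linalg.qr(Xc.T @ Q)  # markers x k

    # Exact SVD of the small projected matrix, mapped back to marker space.
    B = Xc @ Q
    _, s, Vt = np.linalg.svd(B, full_matrices=False)
    loadings = (Q @ Vt.T)[:, :n_components]  # markers x n_components
    return mean, loadings, s[:n_components]


def project_scores(X, mean, loadings):
    """Project any samples (rows of X) onto previously fitted loadings."""
    return (X - mean) @ loadings


# Toy data: simulated allele counts (0/1/2), purely for illustration.
rng = np.random.default_rng(1)
genotypes = rng.binomial(2, 0.3, size=(500, 200)).astype(float)
unrelated = rng.random(500) < 0.8  # hypothetical "unrelated, high-quality" mask

# Fit on the unrelated subset only, then project every sample in the cohort.
mean, loadings, singular_values = fit_top_pcs(genotypes[unrelated], n_components=5)
scores = project_scores(genotypes, mean, loadings)
print(scores.shape)  # (500, 5)
```

Fitting on the unrelated subset and projecting everyone afterwards mirrors the two-step design in the text: the loadings are estimated from the reference samples only, while every sample in the cohort still receives a score.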