
Chunk #43 — Methods — White British ancestry subset

Source
The UK Biobank resource with deep phenotyping and genomic data.

Text

Researchers may want to analyse only a set of individuals with relatively homogeneous ancestry, to reduce the risk of confounding due to differences in ancestral background. Although the UK Biobank cohort includes a large number of participants from a wide range of ethnic backgrounds, such an analysis is feasible without sacrificing much sample size, because most participants (88.26%) report their ethnic background as ‘British’, within the broader-level group ‘white’. Our PCA revealed population structure even within this category (Supplementary Fig. 8), so we used a combination of self-reported ethnic background and genetic information to identify a subset of 409,728 individuals (84%) who self-report as ‘British’ and who have very similar ancestral backgrounds based on the results of the PCA (Supplementary Information). Fine-scale population structure is known to exist within the UK, but the methods for detecting such subtle structure (ref. 40) that were available at the time of analysis are not feasible to apply at the scale of the UK Biobank. The white British ancestry subset may therefore still contain subtle structure present at sub-national scales.
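
The general idea of combining a self-report filter with a genetic filter can be illustrated with a minimal sketch. This is not the procedure used for the UK Biobank subset, which is described in the Supplementary Information and is not reproduced here. The sketch assumes a simple stand-in rule: keep individuals who self-report as ‘British’ and whose principal-component scores lie within a fixed number of median absolute deviations (MADs) of the group median on every component. The record fields (`id`, `self_report`, `pcs`), the function name and the `n_mads` threshold are all hypothetical.

```python
from statistics import median


def homogeneous_subset(samples, n_mads=6.0):
    """Return IDs of self-reported 'British' samples that are not PC outliers.

    Illustrative stand-in only: `samples` is a list of dicts with the
    hypothetical keys 'id', 'self_report' and 'pcs' (a list of PC scores).
    """
    # Step 1: filter on self-reported ethnic background.
    british = [s for s in samples if s["self_report"] == "British"]
    if not british:
        return []

    keep = {s["id"] for s in british}
    n_pcs = len(british[0]["pcs"])

    # Step 2: on each PC, drop samples far from the group median.
    for k in range(n_pcs):
        values = [s["pcs"][k] for s in british]
        centre = median(values)
        mad = median([abs(v - centre) for v in values])
        if mad == 0:
            # No spread on this component, so it cannot define outliers.
            continue
        for s in british:
            if abs(s["pcs"][k] - centre) > n_mads * mad:
                keep.discard(s["id"])

    # Preserve the input order of the retained samples.
    return [s["id"] for s in british if s["id"] in keep]
```

A per-component threshold like this is deliberately crude: it ignores correlation between components and would not detect the fine-scale, sub-national structure that the text notes may remain in the subset.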