Chunk #32 — Methods — Genome-wide genotyping array data sets used for evaluation of imputation quality and/or phenotype association analysis — UK Biobank

Source: Use of >100,000 NHLBI Trans-Omics for Precision Medicine (TOPMed) Consortium whole genome sequences improves imputation quality and detection of rare variant associations in admixed African and Hispanic/Latino populations.
Embedded: yes

Text

UK Biobank [46] recruited 500,000 people aged 40–69 years in 2006–2010, establishing a prospective biobank study to understand the risk factors for common diseases such as cancer, heart disease, stroke, diabetes, and dementia. Participants are being followed up through routine medical and other health-related records from the UK National Health Service. UK Biobank has genotype data on all enrolled participants, as well as extensive baseline questionnaire and physical measures and stored blood and urine samples. Hematological traits were assayed as previously described [14]. Genotyping on custom Axiom arrays and subsequent quality control have been previously described [47]. Samples were included in our analyses if self-reported ancestry was “Black Caribbean”, “Black African”, “Black or Black British”, “White and Black Caribbean”, “White and Black African”, or “Any Other Black Background”. Variants were selected based on call rate exceeding 95%, HWE p-value exceeding 10⁻⁸, and MAF exceeding 0.5%. Subsequently, variants in approximate linkage equilibrium were used to generate ten principal components. Samples were excluded if the first principal component exceeded 0.1 and the second principal component exceeded 0.2, thereby removing individuals who did not cluster with most African ancestry individuals. In total, 6,820 AA participants with blood cell traits were included in the analysis.
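
The variant and sample filters described above can be illustrated with a minimal Python sketch. It is not the authors' pipeline: the names `VariantStats`, `passes_variant_qc`, and `is_pc_outlier` are hypothetical, the toy records are invented for illustration, and the per-variant statistics and principal component scores are assumed to have been computed beforehand by a genotype QC tool. Only the thresholds (call rate > 95%, HWE p-value > 10⁻⁸, MAF > 0.5%, PC1 > 0.1 and PC2 > 0.2) come from the text, and the exclusion rule is encoded exactly as worded there.

```python
from dataclasses import dataclass

# Thresholds taken from the text; everything else here is illustrative.
MIN_CALL_RATE = 0.95   # call rate must exceed 95%
MIN_HWE_P = 1e-8       # HWE p-value must exceed 1e-8
MIN_MAF = 0.005        # minor allele frequency must exceed 0.5%
PC1_MAX = 0.1
PC2_MAX = 0.2


@dataclass
class VariantStats:
    """Per-variant summary statistics (hypothetical container)."""
    variant_id: str
    call_rate: float
    hwe_p: float
    maf: float


def passes_variant_qc(v: VariantStats) -> bool:
    """Keep a variant only if it strictly exceeds all three thresholds."""
    return (
        v.call_rate > MIN_CALL_RATE
        and v.hwe_p > MIN_HWE_P
        and v.maf > MIN_MAF
    )


def is_pc_outlier(pc1: float, pc2: float) -> bool:
    """Exclusion rule as worded in the text: PC1 > 0.1 and PC2 > 0.2."""
    return pc1 > PC1_MAX and pc2 > PC2_MAX


# Invented toy data, not UK Biobank values.
variants = [
    VariantStats("varA", 0.99, 0.5, 0.12),
    VariantStats("varB", 0.93, 0.5, 0.12),    # fails call rate
    VariantStats("varC", 0.99, 1e-12, 0.12),  # fails HWE
    VariantStats("varD", 0.99, 0.5, 0.001),   # fails MAF
]
kept_variants = [v.variant_id for v in variants if passes_variant_qc(v)]

# Sample ID -> (PC1, PC2) scores, assumed precomputed.
samples = {"s1": (0.01, 0.02), "s2": (0.15, 0.25), "s3": (0.15, 0.05)}
kept_samples = [
    sid for sid, (pc1, pc2) in samples.items() if not is_pc_outlier(pc1, pc2)
]

print(kept_variants)
print(kept_samples)
```

Note that principal component scores depend on the software and scaling used to compute them, so the 0.1 and 0.2 cutoffs are meaningful only relative to the components generated in this analysis.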