Chunk #35 — Methods — Marker-based quality control

Source
The UK Biobank resource with deep phenotyping and genomic data.

Text

We identified poor-quality markers using statistical tests designed primarily to check for consistency of genotype calling across experimental factors. Specifically, we tested for batch effects, plate effects, departures from Hardy–Weinberg equilibrium, sex effects, array effects, and discordance across control replicates. See Supplementary Information for the details of each test, and Supplementary Fig. 3 for examples of affected markers. For markers that failed at least one test in a given batch, we set the genotype calls in that batch to missing. We also provide a flag in the data release that indicates whether the calls for a marker have been set to missing in a given batch. If there was evidence that a marker was not reliable across all batches, we excluded the marker from the data altogether. To attenuate population structure effects, we applied all marker-based quality control tests using a subset of 463,844 individuals with estimated European ancestry. We identified these individuals from the genotype data before conducting any quality control by projecting all the UK Biobank samples onto the two major principal components of four 1000
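The per-batch masking described in the text above can be illustrated with a minimal Python sketch. It is not the authors' pipeline. The function names, the data layout (a dictionary of per-marker genotype lists plus a parallel list of batch labels) and the set of failed (marker, batch) pairs are all hypothetical. In particular, the exclusion rule used here (drop a marker only if it failed in every batch) is a placeholder assumption, since the text defers the actual criteria to the Supplementary Information.

```python
from typing import Dict, List, Optional, Set, Tuple

# Hypothetical encoding: 0, 1 or 2 copies of an allele; None means missing.
Genotype = Optional[int]
MarkerBatch = Tuple[str, str]


def mask_failed_batches(
    calls: Dict[str, List[Genotype]],
    sample_batch: List[str],
    failed: Set[MarkerBatch],
) -> Tuple[Dict[str, List[Genotype]], Dict[MarkerBatch, bool]]:
    """Set calls to missing for every (marker, batch) pair listed in `failed`.

    Returns the masked calls and a flag table keyed by (marker, batch) that
    records whether the calls for that pair were set to missing.
    """
    batches = sorted(set(sample_batch))
    masked: Dict[str, List[Genotype]] = {}
    flags: Dict[MarkerBatch, bool] = {}
    for marker, genos in calls.items():
        # Blank out a sample's call when its batch failed QC for this marker.
        masked[marker] = [
            None if (marker, batch) in failed else g
            for g, batch in zip(genos, sample_batch)
        ]
        for batch in batches:
            flags[(marker, batch)] = (marker, batch) in failed
    return masked, flags


def drop_markers_failing_everywhere(
    masked: Dict[str, List[Genotype]],
    flags: Dict[MarkerBatch, bool],
) -> Dict[str, List[Genotype]]:
    """Placeholder exclusion rule (an assumption): drop markers flagged in all batches."""
    kept: Dict[str, List[Genotype]] = {}
    for marker, genos in masked.items():
        marker_flags = [f for (m, _), f in flags.items() if m == marker]
        if marker_flags and all(marker_flags):
            continue  # flagged in every batch, so exclude the marker entirely
        kept[marker] = genos
    return kept


# Toy input: two markers, four samples spread over two batches.
calls = {"rs1": [0, 1, 2, 1], "rs2": [2, 2, 0, 1]}
sample_batch = ["b1", "b1", "b2", "b2"]
failed = {("rs1", "b1"), ("rs2", "b1"), ("rs2", "b2")}

masked, flags = mask_failed_batches(calls, sample_batch, failed)
kept = drop_markers_failing_everywhere(masked, flags)
```

In this toy input, the marker labelled `rs1` keeps its calls in batch `b2` and loses them in `b1`, while `rs2` is flagged in both batches and is dropped under the placeholder rule. Keeping the flag table separate from the masked calls mirrors the text's distinction between setting calls to missing and reporting that this was done.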