Chunk #12 — Whole-genome genotyping

Source
The UK Biobank resource with deep phenotyping and genomic data.

Text

The application of our quality control pipeline resulted in the released dataset of 488,377 samples and 805,426 markers from both arrays with the properties shown in Fig. 2a–c. A set of 588 pairs of experimental duplicates show very high genotype concordance, with mean 99.87% and minimum 99.39% of genotypes identical (Supplementary Fig. 13). We compared allele frequencies among UK Biobank participants with European ancestry to those estimated from an independent source, the Exome Aggregation Consortium (ExAC) database¹⁸ at a set of 91,298 overlapping markers. We do not expect allele frequencies in the two studies to match exactly owing to subtle differences in the ancestral backgrounds of the individuals in each study, as well as differences in the sensitivity and specificity of the two technologies (exome sequencing and genotyping arrays). A small number of markers (around 300) have very different allele frequencies (see Supplementary Information section 2.4). This could be due to non-working probesets on the UK Biobank arrays or possibly annotation error on the UK Biobank arrays or in ExAC, or mapping errors in the sequence data in regions of more complex variation. Despite this, overall the allele frequencies are encouragingly similar (r² = 0.93) (Fig. 2c; Supplementary Fig. 4).
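
The two summary statistics quoted above (the share of identical genotypes in a duplicate pair, and r² between two sets of allele frequencies) can be illustrated with a minimal sketch. This is not the authors' pipeline: the function names are hypothetical, genotypes are assumed to be coded as 0/1/2 allele counts with None for a missing call, missing calls are assumed to be excluded from the comparison, and r² is assumed to mean the squared Pearson correlation. The text does not specify any of these details.

```python
def genotype_concordance(calls_a, calls_b):
    """Fraction of markers with identical calls in two runs of the same sample.

    Assumption: genotypes are 0/1/2 allele counts and None marks a missing
    call; markers missing in either run are skipped.
    """
    if len(calls_a) != len(calls_b):
        raise ValueError("both runs must cover the same markers")
    called = [(a, b) for a, b in zip(calls_a, calls_b)
              if a is not None and b is not None]
    if not called:
        raise ValueError("no marker was called in both runs")
    return sum(a == b for a, b in called) / len(called)


def r_squared(freqs_x, freqs_y):
    """Squared Pearson correlation between two lists of allele frequencies."""
    n = len(freqs_x)
    if n != len(freqs_y) or n < 2:
        raise ValueError("need two equal-length lists with at least 2 markers")
    mean_x = sum(freqs_x) / n
    mean_y = sum(freqs_y) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(freqs_x, freqs_y))
    sxx = sum((x - mean_x) ** 2 for x in freqs_x)
    syy = sum((y - mean_y) ** 2 for y in freqs_y)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined for constant input")
    return (sxy * sxy) / (sxx * syy)


# Toy values invented for illustration; they are not UK Biobank or ExAC data.
duplicate_run_1 = [0, 1, 2, 2, None, 0]
duplicate_run_2 = [0, 1, 2, 1, 2, 0]
array_freqs = [0.05, 0.20, 0.35, 0.50]
exome_freqs = [0.06, 0.18, 0.37, 0.49]

print(genotype_concordance(duplicate_run_1, duplicate_run_2))
print(r_squared(array_freqs, exome_freqs))
```

In this framing, markers with very different allele frequencies in the two sources would appear as points far from the diagonal of a frequency scatter plot, and they lower r² without necessarily indicating a problem with the remaining markers.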