Chunk #21 — Haplotype estimation and genotype imputation

Source: The UK Biobank resource with deep phenotyping and genomic data.
Embedded: yes

Text

We also imputed the UK Biobank data using the merged UK10K and 1000 Genomes phase 3 reference panel (ref. 27), which has 87,696,888 bi-allelic markers. We combined these imputed data with those from the HRC panel, using the HRC imputation when a SNP was present in both panels. Imputation was carried out with the IMPUTE4 program (https://jmarchini.org/software/), which is a re-coded version of the haploid imputation functionality implemented in IMPUTE2 (ref. 23; see Methods). The result of the imputation process is a dataset with 93,095,623 autosomal SNPs, short indels and large structural variants in 487,442 individuals. We imputed an additional 3,963,705 markers on the X chromosome (Methods). Reference SNP (rs) IDs from the SNP database (dbSNP) were assigned to as many markers as possible using the reference SNP ID lists available from the UCSC genome annotation database for the GRCh37 assembly of the human genome (http://hgdownload.cse.ucsc.edu/goldenpath/hg19/database/).
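
The precedence rule for combining the two imputations, and the best-effort assignment of rs IDs, can be illustrated with a minimal Python sketch. It is not the pipeline used for the UK Biobank release. The `Variant` key (chromosome, position, alleles), the use of a single dosage value per marker, the lookup of rs IDs by chromosome and position, and all names and values below are assumptions made for illustration only.

```python
from typing import Dict, Iterable, NamedTuple, Optional, Tuple


class Variant(NamedTuple):
    """Hypothetical marker key; the text does not say how markers were matched."""
    chrom: str
    pos: int
    ref: str
    alt: str


def merge_imputed(
    hrc: Dict[Variant, float],
    uk10k_1kg: Dict[Variant, float],
) -> Dict[Variant, float]:
    """Combine two imputed call sets, preferring HRC where both have a marker."""
    merged = dict(uk10k_1kg)  # start from the UK10K + 1000 Genomes imputation
    merged.update(hrc)        # HRC values overwrite any shared markers
    return merged


def assign_rsids(
    variants: Iterable[Variant],
    rsid_by_site: Dict[Tuple[str, int], str],
) -> Dict[Variant, Optional[str]]:
    """Attach an rs ID where the lookup has one; leave None otherwise."""
    return {v: rsid_by_site.get((v.chrom, v.pos)) for v in variants}


# Toy data (made up): one marker shared by both panels, one unique to UK10K + 1000G.
shared = Variant("1", 1000, "A", "G")
unique = Variant("1", 2000, "C", "T")
hrc_calls = {shared: 0.12}
uk10k_calls = {shared: 0.30, unique: 1.95}

merged = merge_imputed(hrc_calls, uk10k_calls)
rsids = assign_rsids(merged, {("1", 1000): "rs_example_1"})  # placeholder ID

print(merged[shared])  # 0.12, the HRC value wins for the shared marker
print(rsids[unique])   # None, no rs ID available for this site
```

Markers without a match in the lookup keep `None` rather than a guessed identifier, which mirrors the "as many markers as possible" wording above.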