Chunk #51 — Methods — Haplotype estimation and genotype imputation on the X chromosome

Source: The UK Biobank resource with deep phenotyping and genomic data.
Embedded: yes

Text

For haplotype estimation on the X chromosome genotype data, we applied the same filtering steps as for the autosomal genotype data, with some additional filters. For both the sex-specific region and the pseudo-autosomal regions (PAR), we excluded samples identified as having a likely sex chromosome aneuploidy (see above). For the PAR, we additionally excluded samples with a missing rate of >5% among markers in the PAR. For the sex-specific region of chromosome X, this resulted in a dataset of 16,601 markers and 486,790 samples. For the PAR, this resulted in a dataset of 1,239 markers and 486,476 samples. Haplotype estimation and genotype imputation were carried out separately on the two pseudo-autosomal regions and the non-pseudo-autosomal region, using the same methods and reference datasets as for the autosomes.
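
The two X-specific sample filters described above can be illustrated with a minimal sketch. It is not the pipeline used for the resource: the function names, the dictionary-based genotype layout, and the toy data are all hypothetical, and the autosomal filtering steps that are also applied are not reproduced here. Only the 5% missingness threshold and the aneuploidy exclusion come from the text.

```python
from typing import Dict, List, Optional, Set

# Threshold taken from the text: PAR samples with >5% missing calls are excluded.
PAR_MISSING_RATE_MAX = 0.05

# Hypothetical layout: sample ID -> genotype calls, with None marking a missing call.
Genotypes = Dict[str, List[Optional[int]]]


def missing_rate(calls: List[Optional[int]]) -> float:
    """Return the fraction of markers without a genotype call."""
    if not calls:
        return 0.0
    return sum(call is None for call in calls) / len(calls)


def select_samples(genotypes: Genotypes,
                   aneuploid_ids: Set[str],
                   is_par: bool) -> List[str]:
    """Return the IDs of samples that pass the X-specific filters.

    Samples flagged as likely sex chromosome aneuploidies are dropped in
    every region; the missingness filter is applied only when is_par is True.
    """
    kept = []
    for sample_id, calls in genotypes.items():
        if sample_id in aneuploid_ids:
            continue  # excluded in both the sex-specific region and the PAR
        if is_par and missing_rate(calls) > PAR_MISSING_RATE_MAX:
            continue  # PAR-only filter on per-sample missingness
        kept.append(sample_id)
    return kept


# Toy data (invented for illustration). In practice the PAR and the
# sex-specific region contain different markers; one table is reused
# here only to show how the two filter settings differ.
toy_calls: Genotypes = {
    "s1": [0] * 40,                # complete data
    "s2": [0] * 36 + [None] * 4,   # 10% missing
    "s3": [1] * 40,                # flagged as likely aneuploid below
    "s4": [2] * 39 + [None],       # 2.5% missing
}
toy_aneuploid = {"s3"}

par_kept = select_samples(toy_calls, toy_aneuploid, is_par=True)
sex_specific_kept = select_samples(toy_calls, toy_aneuploid, is_par=False)
```

Under these assumptions, a sample with high PAR missingness is dropped from the PAR dataset but can remain in the sex-specific dataset, which is consistent with the PAR sample count reported above being the smaller of the two.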