Chunk #22 — Haplotype estimation and genotype imputation

Source: The UK Biobank resource with deep phenotyping and genomic data.
Embedded: yes

Text

Extended Data Fig. 4 shows the distribution of information scores across all markers in the imputed dataset. An information score of α in a sample of M individuals indicates that the amount of data at the imputed marker is approximately equivalent to a set of perfectly observed genotype data in a sample of size αM. The figure illustrates that most markers with a frequency above 0.1% have high information scores. Previous GWAS have tended to filter at an information score of around 0.3, which in a cohort of this size (roughly 500,000 participants) corresponds to an effective sample size of approximately 150,000. Because that effective sample size is still large, it may be possible to lower the information score threshold and still obtain good power to detect associations.
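
The αM relationship above can be made concrete with a few lines of arithmetic. The following is a minimal sketch, not code from the paper: the function names are hypothetical, and the cohort size M = 500,000 is an assumed round figure, chosen because it reproduces the effective sample size of approximately 150,000 quoted for a threshold of 0.3.

```python
def effective_sample_size(info_score: float, n_samples: int) -> float:
    """Approximate effective sample size alpha * M for an imputed marker."""
    if not 0.0 <= info_score <= 1.0:
        raise ValueError("info_score is expected to lie in [0, 1]")
    return info_score * n_samples


def info_threshold_for(target_effective_n: float, n_samples: int) -> float:
    """Information score needed to reach a target effective sample size."""
    if n_samples <= 0:
        raise ValueError("n_samples must be positive")
    # A score cannot exceed 1, so cap the result.
    return min(1.0, target_effective_n / n_samples)


# Assumed cohort size (illustrative round number, not taken from the text).
M = 500_000

# Effective sample size at the conventional 0.3 filter.
n_eff = effective_sample_size(0.3, M)
print(f"info 0.3 with M={M}: effective N of about {n_eff:,.0f}")

# Threshold that would still leave a hypothetical effective N of 50,000.
alpha = info_threshold_for(50_000, M)
print(f"effective N of 50,000 needs info of about {alpha:.2f}")
```

Under these assumptions, a study that would be adequately powered with a far smaller perfectly genotyped sample could in principle accept markers with information scores well below 0.3. What counts as adequate power depends on effect size and allele frequency, which this sketch does not model.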