Chunk #10 — Results

Source
Integrating sequence and array data to create an improved 1000 Genomes Project haplotype reference panel.

Text

Figure 1a shows the genotype discordance at CG1 SNPs. We measure discordance in two ways: using only the validation genotypes that contain at least one copy of the non-reference allele (ALT), and using all validation genotypes (ALL). The three haplotype sets produced by SHAPEIT2 (blue bars) have lower discordance than the Beagle haplotypes (green) and the 1000GP haplotypes (orange). For example, the CG1 ALT discordance is 1.03% for the SHAPEIT2 haplotypes made using the Omni2.5 scaffold and 1.38% for the 1000GP haplotypes. In addition, the Omni2.5 scaffold produced better results than the 1M scaffold, which in turn was better than using no scaffold. Figure 2a-b shows the genotype discordance at CG2 SNPs and indels, where we observe the same pattern of performance between methods. This pattern also holds across different ancestries (Supplementary Fig. 1). Discordance is higher at indels than at SNPs (Figure 2c). One possible reason is that sequencing reads containing indels are more challenging to map, so the GLs for indels may be less informative than the GLs for SNPs.
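
The two discordance measures described above can be illustrated with a minimal sketch. It assumes that genotypes are coded as counts of the non-reference allele (0, 1 or 2) and that a genotype is discordant whenever the called value differs from the validation value. The function name `genotype_discordance` and the toy data are hypothetical and are not taken from the paper, whose exact computation may differ.

```python
from typing import Sequence, Tuple


def genotype_discordance(
    called: Sequence[int], validation: Sequence[int]
) -> Tuple[float, float]:
    """Return (ALT, ALL) discordance as fractions.

    Assumption: genotypes are non-reference allele counts (0, 1 or 2),
    and any difference between called and validation counts is a mismatch.
    """
    if len(called) != len(validation):
        raise ValueError("called and validation must have the same length")

    all_total = all_mismatch = 0
    alt_total = alt_mismatch = 0
    for c, v in zip(called, validation):
        mismatch = int(c != v)
        # ALL: every validation genotype contributes.
        all_total += 1
        all_mismatch += mismatch
        # ALT: only validation genotypes with at least one non-reference allele.
        if v > 0:
            alt_total += 1
            alt_mismatch += mismatch

    alt = alt_mismatch / alt_total if alt_total else float("nan")
    all_ = all_mismatch / all_total if all_total else float("nan")
    return alt, all_


# Toy data (made up for illustration only).
called = [0, 0, 1, 2, 1, 0, 0, 0]
validation = [0, 0, 1, 1, 2, 0, 0, 0]
alt_disc, all_disc = genotype_discordance(called, validation)
print(f"ALT discordance: {alt_disc:.2%}, ALL discordance: {all_disc:.2%}")
```

In this toy example the ALT measure is larger than the ALL measure because the many concordant homozygous-reference genotypes are excluded from its denominator.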