Chunk #4 — TOPMed WGS quality assessment

Source
Sequencing of 53,831 diverse genomes from the NHLBI TOPMed Program.

Text

Stringent variant and sample quality filters were applied, and the resulting genotype call sets were evaluated in three ways (Supplementary Information 1.2.2, 1.3, 1.4). First, we compared genotypes for samples sequenced in duplicate: the mean alternative allele concordance was 0.9995 for single-nucleotide variants (SNVs) and 0.9930 for insertions or deletions (indels). Second, we compared genotypes to those from previous whole-exome sequencing datasets in protein-coding regions from GENCODE15: 80% of variants were found with both approaches, and overlapping variant calls had a concordance of 0.9993 for SNVs and 0.9974 for indels (Supplementary Tables 1–3). Third, we compared genotypes to those obtained using alternative informatics tools: compared to GATK v.4.1.3, the TOPMed call set showed lower Mendelian inconsistency rates and minimized batch effects (Supplementary Table 4). These reproducibility estimates indicate the high quality of the genotype calls and the effectiveness of the machine-learning-based quality filters.
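
The duplicate-sample comparison can be illustrated with a minimal sketch. The passage does not define "alternative allele concordance", so the code assumes one common definition: among sites where at least one replicate carries a non-reference genotype, the fraction of sites where both replicates have the same genotype. The function name `alt_allele_concordance`, the 0/1/2 genotype encoding, and the toy genotype lists are all hypothetical and are not taken from the TOPMed pipeline or data.

```python
from typing import Optional, Sequence


def alt_allele_concordance(
    calls_a: Sequence[Optional[int]],
    calls_b: Sequence[Optional[int]],
) -> Optional[float]:
    """Concordance over sites where either replicate is non-reference.

    Genotypes are alternate-allele counts (0, 1 or 2); None marks a
    missing call. This definition is an assumption made for illustration.
    """
    if len(calls_a) != len(calls_b):
        raise ValueError("replicates must cover the same sites")

    compared = 0
    matching = 0
    for a, b in zip(calls_a, calls_b):
        if a is None or b is None:
            continue  # skip sites with a missing call in either replicate
        if a == 0 and b == 0:
            continue  # homozygous-reference in both: not counted
        compared += 1
        if a == b:
            matching += 1

    # No informative sites means the statistic is undefined.
    return matching / compared if compared else None


# Made-up genotypes for one sample sequenced twice (not real data).
rep1 = [0, 1, 2, 1, 0, None, 2, 1]
rep2 = [0, 1, 2, 0, 0, 1, 2, 1]
print(alt_allele_concordance(rep1, rep2))  # 0.8
```

Excluding sites that are homozygous-reference in both replicates keeps the statistic from being dominated by the large majority of sites where no variant is present. The same function could be applied separately to SNV and indel sites to obtain per-class values like those reported above.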