
Chunk #36 — Online Methods — Sample filtering

Source
Analysis of protein-coding genetic variation in 60,706 humans.

Text

The 91,796 samples were filtered based on two criteria. First, samples that were outliers for key metrics were removed (Extended Data Figure 5b). Second, to generate allele frequencies based on independent observations without enrichment of Mendelian disease alleles, we restricted the final release data set to unrelated adults with high-quality sequence data and without severe pediatric disease. After filtering, only 60,706 samples remained, consisting of ~77% of Agilent (33 Mb target) and ~12% of Illumina (37.7 Mb target) exome captures. Full details of the filtering process are described in Supplementary Information Section 1.7.
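
To make the scale of the filtering concrete, the arithmetic implied by the two sample counts above can be sketched as follows. This is only an illustration and not part of the authors' pipeline: the function name `filtering_summary` and its return keys are hypothetical, and only the two counts (91,796 and 60,706) come from the text.

```python
# Sample counts quoted in the text; everything else here is illustrative.
TOTAL_SAMPLES = 91_796     # samples before filtering
RETAINED_SAMPLES = 60_706  # samples remaining after filtering


def filtering_summary(total: int, retained: int) -> dict:
    """Return counts and fractions of samples removed and retained.

    Hypothetical helper: it only does the subtraction and division
    implied by the two counts and knows nothing about the actual
    filtering criteria.
    """
    if total <= 0 or not 0 <= retained <= total:
        raise ValueError("expected 0 <= retained <= total and total > 0")
    removed = total - retained
    return {
        "removed": removed,
        "retained": retained,
        "removed_fraction": removed / total,
        "retained_fraction": retained / total,
    }


summary = filtering_summary(TOTAL_SAMPLES, RETAINED_SAMPLES)
print(f"removed:  {summary['removed']:,} ({summary['removed_fraction']:.1%})")
print(f"retained: {summary['retained']:,} ({summary['retained_fraction']:.1%})")
# prints:
# removed:  31,090 (33.9%)
# retained: 60,706 (66.1%)
```

Under these counts, roughly one third of the initial samples were excluded by the two criteria combined. The text does not break the exclusions down by criterion, so the sketch does not attempt to.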