
Chunk #12 — The distribution of genetic variation

Source
Sequencing of 53,831 diverse genomes from the NHLBI TOPMed Program.
Embedded
yes

Text

We examined the distribution of variant sites across the genome by counting variants across ordered 1-megabase (Mb) concatenations of contiguous sequence with a similar conservation level (indicated by the combined annotation-dependent depletion (CADD) score [21]), and in segments categorized by coding versus noncoding status (Fig. 1 and Extended Data Fig. 2). As expected, the vast majority of human genomic variation is rare (minor allele frequency (MAF) < 0.5%) [10,11] and located in putatively neutral, noncoding regions of the genome (Fig. 1). Although coding regions have lower average levels of both common (MAF ≥ 0.5%) and rare variation, we identified some ultra-conserved noncoding regions with even lower levels of genetic variation [22] (Fig. 1 and Supplementary Fig. 15).
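
The counting scheme described above can be illustrated with a minimal Python sketch. It is a simplification under stated assumptions, not the authors' pipeline: the `(offset, maf, is_coding)` tuple layout, the function names, and the example values are all hypothetical, and `offset` is assumed to be a variant's 0-based position in the concatenated sequence after bases have already been ordered by conservation level. Only the 1 Mb segment size and the 0.5% MAF cut-off come from the text.

```python
from collections import Counter

RARE_MAF_THRESHOLD = 0.005  # 0.5%, the rare/common cut-off stated in the text
SEGMENT_SIZE = 1_000_000    # 1 Mb segments, as stated in the text


def frequency_class(maf):
    """Label a variant 'rare' (MAF < 0.5%) or 'common' (MAF >= 0.5%)."""
    return "rare" if maf < RARE_MAF_THRESHOLD else "common"


def count_variants(variants, segment_size=SEGMENT_SIZE):
    """Count variants per (segment index, coding status, frequency class).

    Each variant is a hypothetical (offset, maf, is_coding) tuple; offset is
    assumed to index the conservation-ordered concatenated sequence.
    """
    counts = Counter()
    for offset, maf, is_coding in variants:
        segment = offset // segment_size  # which 1 Mb segment the site falls in
        status = "coding" if is_coding else "noncoding"
        counts[(segment, status, frequency_class(maf))] += 1
    return counts


# Made-up example values, for illustration only.
example = [
    (120, 0.0001, False),      # segment 0, noncoding, rare
    (999_999, 0.02, False),    # segment 0, noncoding, common
    (1_000_000, 0.004, True),  # segment 1, coding, rare
    (1_500_000, 0.005, True),  # segment 1, coding, common (MAF equals the cut-off)
]
counts = count_variants(example)
```

Keying the counter on segment, coding status, and frequency class keeps rare and common tallies separate, so the per-segment levels of each can be compared directly.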