Chunk #4 — The vast majority of coding variation is rare and previously unknown

Source
Evolution and functional impact of rare coding variation from deep sequencing of human exomes.

Text

Of the total SNVs, 57% (285,857) were singletons, and SNVs with three or fewer minor alleles accounted for 72% of all variants (fig. S9D). The proportion of singletons observed in African Americans (AAs; 49.3%, n = 140,818) was lower than that observed in European Americans (EAs; 50.7%, n = 144,821), but the overall site frequency spectrum (SFS) and the separate SFS for AAs and EAs were all highly skewed, exhibiting a large excess of rare variants relative to the standard neutral model (19) (fig. S9D). The skew of the SFS was greater for EAs than AAs, and, at equal sample sizes, the odds that a SNV was a singleton were 1.7 times greater for EAs than AAs. Consistent with these observations, Tajima's D was highly negative for both EAs (−2.12) and AAs (−2.10) and dropped precipitously as sample size increased (fig. S9E), highlighting that sequencing a large number of individuals provides unique information about recent demographic history (13, 20, 21).
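
The link between an excess of rare variants and a negative Tajima's D can be illustrated with a short sketch. It uses the standard textbook form of Tajima's estimator computed from a site frequency spectrum; it is not the authors' pipeline. The function names `tajimas_d` and `singleton_fraction` and the toy counts are assumptions made for illustration and are not data from the paper.

```python
import math


def singleton_fraction(sfs):
    """Fraction of segregating sites whose minor allele is seen exactly once.

    sfs[i - 1] is the number of sites at which the allele appears i times.
    """
    total = sum(sfs)
    if total == 0:
        raise ValueError("SFS contains no segregating sites")
    return sfs[0] / total


def tajimas_d(sfs, n):
    """Tajima's D from an SFS over n sampled chromosomes (standard formula).

    sfs[i - 1] is the number of sites at which the allele appears i times,
    for i = 1 .. n - 1. A folded SFS also works, because the pairwise
    difference term i * (n - i) is symmetric in i.
    """
    if n < 4:
        raise ValueError("need at least 4 chromosomes")
    S = sum(sfs)  # number of segregating sites
    if S == 0:
        raise ValueError("SFS contains no segregating sites")

    # Mean pairwise differences (pi), summed over allele-count classes.
    pairs = n * (n - 1) / 2
    pi = sum(count * i * (n - i) for i, count in enumerate(sfs, start=1)) / pairs

    # Normalising constants for a sample of size n.
    a1 = sum(1 / i for i in range(1, n))
    a2 = sum(1 / i**2 for i in range(1, n))
    b1 = (n + 1) / (3 * (n - 1))
    b2 = 2 * (n**2 + n + 3) / (9 * n * (n - 1))
    c1 = b1 - 1 / a1
    c2 = b2 - (n + 2) / (a1 * n) + a2 / a1**2
    e1 = c1 / a1
    e2 = c2 / (a1**2 + a2)

    # D compares pi with Watterson's estimator S / a1.
    return (pi - S / a1) / math.sqrt(e1 * S + e2 * S * (S - 1))


# Toy SFS for n = 10 chromosomes (hypothetical counts, singleton-heavy).
toy_sfs = [90, 5, 3, 1, 1, 0, 0, 0, 0]
print(singleton_fraction(toy_sfs))  # share of sites that are singletons
print(tajimas_d(toy_sfs, n=10))     # negative when rare variants are in excess
```

Under the standard neutral model the expected count in class i is proportional to 1/i, which makes the two diversity estimators agree and D equal zero. Shifting mass toward singletons raises S faster than it raises the pairwise difference term, which is why a skewed SFS produces a negative D.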