Chunk #2 — The vast majority of coding variation is rare and previously unknown

Source
Evolution and functional impact of rare coding variation from deep sequencing of human exomes.
Embedded
yes

Text

We observed a total of 503,481 SNVs and 117 fixed non-reference sites, of which 325,843 and 268,903 were found in AAs and EAs, respectively (fig. S9A). Excluding singletons, ∼58% of SNVs were population-specific (93,278 and 32,552 variants were uniquely observed in AAs and EAs, respectively), and the vast majority of these variants were rare (fig. S9B). Most SNVs (292,125 or 58%) were nonsynonymous, including 285,960 missense variants and 6,165 nonsense variants (fig. S9C). Synonymous variants accounted for 38% (188,975) of all SNVs (fig. S9C), with the remaining 4% of SNVs (22,381) located in either splice sites or targeted noncoding regions. The majority of SNVs (411,084; 82%) were previously unknown, with more novel SNVs observed in AAs (240,341) than in EAs (204,415), although the proportion of SNVs that were novel was higher in EAs compared with AAs (76.0% versus 73.8%; χ² = 398.3, df = 1, P < 10⁻¹⁶). About 98% (402,813) of novel SNVs were rare, and 48.9% of all novel, rare SNVs were nonsynonymous.
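
The proportions and the test statistic quoted above can be rechecked from the counts alone. The following is a minimal sketch, not the authors' analysis code: the names (`CATEGORY_COUNTS`, `chi_square_2x2`, `novelty_test`) are hypothetical, the use of a Yates continuity correction is an assumption the passage does not state, and treating the AA and EA counts as an independent 2×2 table ignores that some variants are seen in both groups.

```python
import math

# Counts quoted in the passage above.
TOTAL_SNVS = 503_481
CATEGORY_COUNTS = {
    "missense": 285_960,
    "nonsense": 6_165,
    "synonymous": 188_975,
    "splice site or noncoding": 22_381,
}
NOVEL_SNVS = 411_084
NOVEL_RARE_SNVS = 402_813
# (all SNVs, novel SNVs) per population; shared variants are counted in both.
PER_POPULATION = {"AA": (325_843, 240_341), "EA": (268_903, 204_415)}


def percent(part, whole):
    """Return part as a percentage of whole."""
    return 100.0 * part / whole


def chi_square_2x2(a, b, c, d, yates=True):
    """Chi-square statistic (df = 1) for the 2x2 table [[a, b], [c, d]].

    yates=True applies the continuity correction; whether the authors
    used it is an assumption, not something the passage states.
    """
    n = a + b + c + d
    diff = abs(a * d - b * c)
    if yates:
        diff = max(0.0, diff - n / 2)
    return n * diff**2 / ((a + b) * (c + d) * (a + c) * (b + d))


def p_value_df1(chi2):
    """Upper-tail p-value of a chi-square statistic with one degree of freedom."""
    return math.erfc(math.sqrt(chi2 / 2.0))


def novelty_test():
    """Compare the novel fraction of SNVs between AAs and EAs."""
    aa_all, aa_novel = PER_POPULATION["AA"]
    ea_all, ea_novel = PER_POPULATION["EA"]
    chi2 = chi_square_2x2(aa_novel, aa_all - aa_novel,
                          ea_novel, ea_all - ea_novel)
    return chi2, p_value_df1(chi2)


if __name__ == "__main__":
    # Functional categories as a share of all SNVs.
    for name, count in CATEGORY_COUNTS.items():
        print(f"{name}: {percent(count, TOTAL_SNVS):.1f}%")
    print(f"novel: {percent(NOVEL_SNVS, TOTAL_SNVS):.1f}%")
    print(f"rare among novel: {percent(NOVEL_RARE_SNVS, NOVEL_SNVS):.1f}%")
    # Novel fraction within each population.
    for pop, (all_snvs, novel) in PER_POPULATION.items():
        print(f"novel in {pop}: {percent(novel, all_snvs):.1f}%")
    chi2, p = novelty_test()
    print(f"chi-square = {chi2:.1f}, df = 1, P = {p:.3g}")
```

The four functional categories sum exactly to 503,481, and with the continuity correction the statistic comes out close to the reported 398.3. Passing `yates=False` gives a marginally larger value, which is why the correction is the assumed variant here.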