Chunk #2 — Regional sequencing

Source
Integrating common and rare genetic variation in diverse human populations.
Embedded
yes

Text

Unlike SNPs present on microarray platforms, which are intentionally biased towards high frequency by the discovery and selection process, the SNPs discovered by sequencing provide a direct estimate of the underlying allele frequency spectrum in each population. As in previous surveys, common (MAF ≥5%) and low-frequency (MAF = 0.5–5%) variants account for the vast majority of the heterozygosity in each sample, but we also observed a large number of rare (MAF = 0.05–0.5%) and private (singletons and MAF <0.05%) variants (see Supplementary Table 2 for definitions of variant frequency classes). Each population had 42–66% of sites with a MAF <5%, compared to 10–13% in the genotyping data; 37% of SNPs with a MAF <0.5% were observed in only one population. In total, 77% of the discovered SNPs were new (that is, not in the SNP database (dbSNP) build 129) and 99% of those had a MAF <5%.
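
The frequency classes quoted above can be written as a small binning function. This is a minimal sketch, not the paper's procedure. The names `classify_maf` and `fraction_below` are hypothetical, and the handling of values that fall exactly on a boundary (for example MAF = 0.5%) is an assumption, since the authoritative definitions are in Supplementary Table 2. MAF is given as a fraction, so 5% is 0.05.

```python
from typing import Iterable, Optional


def classify_maf(maf: float, minor_allele_count: Optional[int] = None) -> str:
    """Assign a variant to a frequency class from its minor allele frequency.

    Thresholds follow the text: common (>= 5%), low-frequency (0.5-5%),
    rare (0.05-0.5%), private (singletons and < 0.05%).
    Assumption: each lower bound is inclusive, each upper bound exclusive.
    """
    if not 0.0 <= maf <= 0.5:
        raise ValueError("MAF must lie between 0 and 0.5")
    # Singletons count as private regardless of the sample-level frequency.
    if minor_allele_count == 1 or maf < 0.0005:
        return "private"
    if maf < 0.005:
        return "rare"
    if maf < 0.05:
        return "low-frequency"
    return "common"


def fraction_below(mafs: Iterable[float], threshold: float) -> float:
    """Fraction of sites whose MAF is strictly below the threshold."""
    values = list(mafs)
    if not values:
        raise ValueError("no sites supplied")
    return sum(1 for m in values if m < threshold) / len(values)
```

Under these assumptions, `fraction_below(mafs, 0.05)` applied to one population's site list gives the kind of quantity the text reports as 42–66% for sequencing data and 10–13% for genotyping data.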