Chunk #21 — Discussion

Source
An integrated map of genetic variation from 1,092 human genomes.

Text

The utility and cost-effectiveness of collecting multiple data types (low-coverage whole-genome sequence, targeted exome data, SNP genotype data) for finding variants and reconstructing haplotypes are demonstrated here. Exome capture provides private and rare variants that are missed by low-coverage data (approximately 60% of the singleton variants in the sample were detected only from exome data, compared with 5% detected only from low-coverage data; Fig. S15). However, whole-genome data enable characterisation of functional non-coding variation and accurate haplotype estimation, which are essential for the analysis of cis-effects around genes, for example those arising from variation in upstream regulatory regions [38]. There are also benefits from integrating SNP array data, for example to improve genotype estimation [39] and to aid haplotype estimation where array data have been collected on additional family members. In principle, any source of genotype information (e.g., from array CGH) could be integrated using the statistical methods developed here.
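
The comparison of singletons detected only from exome data with those detected only from low-coverage data can be illustrated with simple set arithmetic. The following is a minimal sketch, not the authors' pipeline: the function name `exclusive_fractions`, the variant identifiers, and the choice of denominator (the union of all calls across sources) are assumptions made for illustration, and the toy values are invented rather than taken from the paper.

```python
from typing import Dict, Set


def exclusive_fractions(calls: Dict[str, Set[str]]) -> Dict[str, float]:
    """For each data source, return the fraction of all detected variants
    that were detected by that source and by no other source.

    Assumption: the denominator is the union of calls across all sources.
    """
    all_variants: Set[str] = set().union(*calls.values())
    if not all_variants:
        return {name: 0.0 for name in calls}

    result: Dict[str, float] = {}
    for name, variants in calls.items():
        # Variants seen by any source other than the current one.
        others: Set[str] = set().union(
            *(v for n, v in calls.items() if n != name)
        )
        result[name] = len(variants - others) / len(all_variants)
    return result


# Hypothetical singleton identifiers, invented for illustration only.
toy_calls = {
    "exome": {"s1", "s2", "s3", "s4"},
    "low_coverage": {"s4", "s5"},
}
fractions = exclusive_fractions(toy_calls)
```

In this toy input, three of the five distinct singletons appear only in the exome call set and one appears only in the low-coverage call set. Real call sets would first need to be restricted to comparable regions, such as the exome targets, before the fractions are meaningful.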