Chunk #38 — Online Methods — Virtual tumor benchmarking approach

Source
Sensitive detection of somatic point mutations in impure and heterogeneous cancer samples.

Text

To measure sensitivity, we turned to additional sequencing data from a second individual (Supplementary Table 5). We chose NA12891, which was also sequenced to 60x as part of the 1000 Genomes Project. Using the published high-confidence SNP genotypes for these samples from the 1000 Genomes Project, we identified a set of sites that are heterozygous in NA12891 and homozygous for the reference allele in NA12878. We then used a second utility, SomaticSpike, which is part of the MuTect software package, to perform an in silico mixing experiment. At each selected site, this utility attempts to replace a number of reads in the NA12878 data, drawn from a binomial distribution with a specified allelic fraction, with reads from the NA12891 data, thereby simulating a somatic mutation of known location, type and expected allelic fraction. If NA12891 does not have enough reads to replace the desired number of reads in NA12878, the site is skipped. The output of this process is a virtual tumor BAM containing the in silico variants, together with the set of locations of those variants. Sensitivity is then estimated by attempting to detect mutations at these sites.
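
The spike-in procedure described above can be illustrated with a minimal Python sketch. It is not the SomaticSpike implementation: reads are plain strings rather than BAM records, and the function names (`select_sites`, `spike_site`, `sensitivity`), the genotype encoding ("0/0", "0/1", "1/1") and the data layout are all hypothetical. The sketch also assumes that the donor reads passed in are the ones chosen to carry the variant, and that sensitivity is taken as the fraction of spiked sites that a caller reports. Neither detail is spelled out in the text.

```python
import random


def select_sites(donor_genotypes, base_genotypes):
    """Sites heterozygous in the donor and homozygous reference in the base.

    Both arguments are hypothetical dicts mapping a site to a genotype
    string: "0/0" (hom-ref), "0/1" (het) or "1/1" (hom-alt).
    """
    return sorted(
        site
        for site, genotype in donor_genotypes.items()
        if genotype == "0/1" and base_genotypes.get(site) == "0/0"
    )


def spike_site(base_reads, donor_reads, allele_fraction, rng):
    """Replace a binomially drawn number of base reads with donor reads.

    Returns the mixed read list, or None when the donor sample has too
    few reads and the site must be skipped.
    """
    depth = len(base_reads)
    # Binomial(depth, allele_fraction) draw, as a sum of Bernoulli trials.
    n_replace = sum(rng.random() < allele_fraction for _ in range(depth))
    if n_replace > len(donor_reads):
        return None  # not enough donor reads: skip this site
    # Pick which base reads to drop and which donor reads to bring in.
    slots = set(rng.sample(range(depth), n_replace))
    incoming = iter(rng.sample(donor_reads, n_replace))
    return [
        next(incoming) if i in slots else read
        for i, read in enumerate(base_reads)
    ]


def sensitivity(spiked_sites, called_sites):
    """Fraction of spiked sites that appear among the called sites."""
    spiked = set(spiked_sites)
    if not spiked:
        raise ValueError("no spiked sites to evaluate")
    return len(spiked & set(called_sites)) / len(spiked)
```

In this sketch, `spike_site` would be run once per site returned by `select_sites`. Sites that come back as `None` are left out of the truth set, and the remaining sites are compared against the caller's output with `sensitivity`.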