
Chunk #29 — Results and Discussion — SNP-calling

Source
GemSIM: general, error-model based simulator of next-generation sequencing data.

Text

Any inaccurate SNP calls within ±1% of a known haplotype frequency were classed as false positives. For Illumina v5 and Roche/454, all false positives had a frequency under 1% (±1%). Illumina v4 showed a drastically higher false positive rate, as expected given the higher average error rate of this run. Even so, false positives were still restricted to under 3% (±1%) population frequency. As M.A.Q. was increased, however, some false positives with a true frequency of 1% (±1%) were assigned a frequency of 3% (±1%) by VarScan. These appear as spikes in the frequency = 3% graph between M.A.Q. 30 and M.A.Q. 40. The effect follows from how the M.A.Q. parameter works. The VarScan manual defines M.A.Q. as the 'minimum base quality at a position to count a read' [23], so VarScan uses M.A.Q. to select the subset of reads from which it makes a call. Increasing M.A.Q. therefore reduces the number of reads sampled, which in turn reduces the accuracy of the SNP frequency calculation when frequencies are
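
The sampling effect described above can be illustrated with a minimal sketch. It is not VarScan's implementation: the function names, the uniform quality distribution (2 to 40), the depth of 1000 and the 1% variant frequency are all assumptions chosen for illustration. The sketch only shows that a stricter minimum base quality leaves fewer reads at a position, and that the estimated variant frequency then scatters more widely around its true value.

```python
import random


def estimate_frequency(bases, quals, alt, min_qual):
    """Return (variant frequency, reads kept) using only reads with quality >= min_qual."""
    kept = [b for b, q in zip(bases, quals) if q >= min_qual]
    if not kept:
        return 0.0, 0
    return sum(b == alt for b in kept) / len(kept), len(kept)


def simulate_site(depth, true_freq, rng):
    """Hypothetical pileup column: variant base 'G' at true_freq, reference 'A' otherwise.

    Base qualities are drawn uniformly from 2..40, an arbitrary assumption.
    """
    bases = ["G" if rng.random() < true_freq else "A" for _ in range(depth)]
    quals = [rng.randint(2, 40) for _ in range(depth)]
    return bases, quals


def frequency_spread(min_qual, trials=200, depth=1000, true_freq=0.01, seed=0):
    """Return (mean reads kept, standard deviation of the estimated frequency)."""
    rng = random.Random(seed)
    estimates, depths = [], []
    for _ in range(trials):
        bases, quals = simulate_site(depth, true_freq, rng)
        freq, kept = estimate_frequency(bases, quals, "G", min_qual)
        estimates.append(freq)
        depths.append(kept)
    mean = sum(estimates) / trials
    sd = (sum((f - mean) ** 2 for f in estimates) / trials) ** 0.5
    return sum(depths) / trials, sd


if __name__ == "__main__":
    for threshold in (15, 38):
        kept, sd = frequency_spread(threshold)
        print(f"min quality {threshold}: mean reads kept {kept:.0f}, frequency s.d. {sd:.4f}")
```

Under these assumptions, the higher threshold keeps roughly a tenth of the reads that the lower threshold keeps, so a variant at 1% is estimated from far fewer observations and its called frequency can drift by a percentage point or more.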