
Chunk #3 — Background

Source
Characterizing and measuring bias in sequence data.

Text

Sequencing technologies are vulnerable to multiple sources of bias. Methods based on bacterial cloning and Sanger-chemistry sequencing [8] were subject to many coverage-reducing biases, notably at GC extremes, palindromes, inverted repeats, and sequences toxic to the bacterial host [9-17]. Illumina sequencing [18] has been shown to lose coverage in regions of high or low GC [19-22], a phenomenon also seen in other 'next-generation' technologies [3,6]. PCR amplification during library construction is a known source of undercoverage of GC-extreme regions [20,21] and similar biases may also be introduced during bridge PCR for cluster amplification on the Illumina flowcell [23]. Illumina strand-specific errors can lead to coverage biases by impairing aligner performance [24]. Ion Torrent [25], like 454 [26], utilizes a terminator-free chemistry that may limit its ability to accurately sequence long homopolymers [4,27,28], and may also be sensitive to coverage biases introduced by emulsion PCR in library construction. Complete Genomics [29] also uses amplification along with a complex library construction process. The Pacific Biosciences [30] process is amplification-free; therefore, one might expect it to exhibit lower levels of coverage bias than the other technologies.
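
One simple way to make the GC-dependent coverage loss described above concrete is to bin fixed-size reference windows by GC fraction and compare the mean read depth in each bin with the overall mean. The sketch below is illustrative only and is not taken from the source: the function names, the window size, the number of bins, and the toy input are all assumptions, and per-base depth is assumed to be already available as a list of numbers.

```python
from collections import defaultdict


def gc_counts(seq):
    """Return (GC bases, unambiguous A/C/G/T bases) for a sequence."""
    seq = seq.upper()
    gc = seq.count("G") + seq.count("C")
    at = seq.count("A") + seq.count("T")
    return gc, gc + at


def relative_coverage_by_gc(reference, depth, window=100, n_bins=10):
    """Map GC bin index -> mean window depth / overall mean window depth.

    Bin i covers GC fractions in [i/n_bins, (i+1)/n_bins); a GC fraction
    of exactly 1.0 is placed in the last bin. (Hypothetical helper.)
    """
    if len(reference) != len(depth):
        raise ValueError("reference and depth must have the same length")

    depth_sum = defaultdict(float)   # summed mean depth per GC bin
    window_count = defaultdict(int)  # number of windows per GC bin

    # Non-overlapping windows; a trailing partial window is ignored.
    for start in range(0, len(reference) - window + 1, window):
        gc, called = gc_counts(reference[start:start + window])
        if called == 0:
            continue  # window of ambiguous bases only (e.g. all N)
        # Integer arithmetic avoids floating-point edge cases at bin borders.
        bin_index = min(gc * n_bins // called, n_bins - 1)
        depth_sum[bin_index] += sum(depth[start:start + window]) / window
        window_count[bin_index] += 1

    total_windows = sum(window_count.values())
    if total_windows == 0:
        return {}
    overall_mean = sum(depth_sum.values()) / total_windows
    if overall_mean == 0:
        return {}
    return {
        b: (depth_sum[b] / window_count[b]) / overall_mean
        for b in sorted(depth_sum)
    }


# Toy input (assumed): an AT-only window at depth 10, a GC-only window at depth 2.
reference = "AT" * 10 + "GC" * 10
depth = [10] * 20 + [2] * 20
profile = relative_coverage_by_gc(reference, depth, window=20)
for gc_bin, rel in profile.items():
    print(f"GC bin {gc_bin}: relative coverage {rel:.2f}")
```

In this representation a value near 1 means a GC bin is covered about as well as the average window, while values well below 1 at the lowest or highest bins would correspond to the undercoverage of GC-extreme regions discussed in the text.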