Chunk #9 — Results — Datasets

Source: Shining a light on dark sequencing: characterising errors in Ion Torrent PGM data.
Embedded: yes

Text

The distribution of read lengths (Figure S1) shows that runs using both the 100 bp and 200 bp One Touch kits were bi-modal, with a smaller secondary peak ∼100 bp greater than the expected length; this peak was more prominent in runs using the 314 chip. These longer reads did not deviate in mean G+C content or homopolymer composition from the reads in the dominant read length peak for their respective runs. However, the majority of these longer reads did not map to the reference genome, and those that could be mapped had an error rate double the mean base-error rate across all datasets. Given this substantially higher error rate, we recommend removing unexpectedly long reads prior to analysis.
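The recommended length filter could be sketched as follows. This is a minimal illustration, not the procedure used in the study: the function names `parse_fastq` and `drop_long_reads` are hypothetical, the input is assumed to be simple four-line FASTQ records, and the `margin` of 50 bp above the expected length is an assumed cutoff chosen only because it falls between the dominant peak and the secondary peak ∼100 bp beyond it. In practice the cutoff should be chosen from the observed read-length distribution of each run.

```python
from typing import Iterable, Iterator, Tuple

# (header, sequence, quality) for one read
FastqRecord = Tuple[str, str, str]


def parse_fastq(lines: Iterable[str]) -> Iterator[FastqRecord]:
    """Yield records from simple four-line FASTQ input (no wrapped sequences)."""
    it = iter(lines)
    for header in it:
        header = header.rstrip("\n")
        if not header:
            continue  # skip blank lines between records
        seq = next(it, "").rstrip("\n")
        next(it, "")  # '+' separator line, ignored
        qual = next(it, "").rstrip("\n")
        yield header, seq, qual


def drop_long_reads(
    records: Iterable[FastqRecord],
    expected_length: int,
    margin: int = 50,  # assumed cutoff margin, not a value from the paper
) -> Iterator[FastqRecord]:
    """Keep only reads no longer than expected_length + margin."""
    max_length = expected_length + margin
    for record in records:
        if len(record[1]) <= max_length:
            yield record


# Example: one read at the expected length, one in the secondary peak.
example = [
    "@read_expected\n", "A" * 100 + "\n", "+\n", "I" * 100 + "\n",
    "@read_long\n", "C" * 210 + "\n", "+\n", "I" * 210 + "\n",
]
kept = list(drop_long_reads(parse_fastq(example), expected_length=100))
```

With a 100 bp kit and the assumed margin, reads longer than 150 bp are discarded, so only `@read_expected` remains in `kept`. For real data, a file handle opened on the FASTQ file can be passed to `parse_fastq` in place of the list.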