
Chunk #0 — Background

Source
GemSIM: general, error-model based simulator of next-generation sequencing data.

Text

Next-generation sequencing (NGS) technologies, such as Illumina's Genome Analyzer [1] and Roche/454's GS FLX [2], produce massive volumes of data. For instance, Illumina's Genome Analyzer IIx can produce up to 640 million 150 bp paired-end reads in a single run [3]. The increasing availability of high-volume data is opening new possibilities for researchers. These include assessment of rare variants in viral populations via deep sequencing, metagenomic sequencing of bacterial communities, and pooled resequencing of human chromosomes. Extracting meaningful information from these kinds of sequencing projects is often difficult, however, because of the error rates associated with NGS: separating true variants from sequencing errors remains challenging. Furthermore, analysis is complicated by an ever-increasing variety of downstream software and a lack of clear standards [4]. Both selecting the most appropriate sequencing technology and choosing the appropriate software package and parameter values for data analysis are typically done via a 'hit and miss' approach, a costly exercise even in the world of 'cheap' NGS.