Chunk #18 — Results — Application of ZINB PSEM on Illumina MiSeq data

Source: Statistical modeling for sensitive detection of low-frequency single nucleotide variants.
Embedded: yes

Text

Although a similar statistical modeling scheme can be readily generalized to the Illumina MiSeq data set, Illumina MiSeq and Ion Proton sequencers differ significantly in sequencing chemistry. The former is based on sequencing by synthesis (SBS), which relies on high-resolution optical systems, whereas the latter is based on ion semiconductor sequencing, which requires neither modified nucleotides nor optics. These differences in sequencing mechanism make Ion Proton sequencers faster but more prone to homopolymer-related errors. When the NB GLM regression coefficients are compared across the two data sets (Additional file 6), the homopolymer-related features that are significant in the Ion Proton regression are, in the Illumina MiSeq regression, either insignificant (hmer_len, hmer_dist) or show the opposite effect on the error rate (hmer_op, hmer_den). The same trend was observed in the ZIP and ZINB models when comparing Ion Proton with Illumina MiSeq (Additional files 7, 8, 9 and 10).
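The cross-platform comparison described above can be sketched as a simple rule: for each feature that is significant in one fit, check whether it is insignificant or has the opposite sign in the other fit. The following is only an illustrative sketch, not the authors' analysis code. The function name `compare_effects`, the 0.05 threshold, the feature names `feature_x`, `feature_y`, `feature_z`, and all numeric values are hypothetical placeholders and are not taken from the Additional files.

```python
from typing import Dict, Tuple

# Each fit maps feature name -> (regression coefficient, p-value).
Coefs = Dict[str, Tuple[float, float]]


def compare_effects(reference: Coefs, other: Coefs, alpha: float = 0.05) -> Dict[str, str]:
    """Label every feature that is significant in `reference` by its behaviour in `other`.

    `alpha` is an assumed significance threshold, not one stated in the text.
    """
    labels: Dict[str, str] = {}
    for name, (ref_coef, ref_p) in reference.items():
        if ref_p >= alpha:
            continue  # only features significant in the reference fit are compared
        if name not in other:
            labels[name] = "missing"
            continue
        other_coef, other_p = other[name]
        if other_p >= alpha:
            labels[name] = "insignificant"
        elif (ref_coef > 0) != (other_coef > 0):
            labels[name] = "opposite effect"
        else:
            labels[name] = "same direction"
    return labels


# Made-up placeholder values, NOT results from the paper.
platform_a = {"feature_x": (0.30, 0.001), "feature_y": (0.20, 0.010), "feature_z": (0.10, 0.400)}
platform_b = {"feature_x": (0.02, 0.600), "feature_y": (-0.15, 0.020), "feature_z": (0.10, 0.010)}

print(compare_effects(platform_a, platform_b))
```

In this sketch, `feature_z` is skipped because it is not significant in the reference fit, which mirrors the text's focus on features that are significant in the Ion Proton regression.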