Chunk #5 — Results

Source: Statistical modeling for sensitive detection of low-frequency single nucleotide variants.
Embedded: yes

Text

In the Result section, we first briefly summarized the benchmark datasets used for PSEM. Then we compared the testing benchmark dataset from Ion Proton with Illumina MiSeq sequencing data in terms of allele frequency composition and depth distribution. Utilizing count data visualization plots and tabulation, we selected the candidate distributions that may fit the data. Since the Ion Proton dataset contains 3 times of the number of benchmark SNVs from Illumina MiSeq and also is enriched with SNVs of ≤ 1 % allele frequency, we mainly focused on Ion Proton data set for model development and evaluation. To test the generalizability of the PSEM, we further trained and evaluated it on Illumina MiSeq dataset.