Chunk #27 — Methods — Overall workflow

Source: Statistical modeling for sensitive detection of low-frequency single nucleotide variants.
Embedded: yes

Text

To train the position-specific error model (PSEM), we used the invariant loci from the training benchmark. Genomic sequence context features were extracted for each locus and fed to generalized linear models with four different distributions. The paired tumor and normal sequencing data from the testing benchmark were then passed through the PSEM to derive candidate SNVs. Additional file 11 provides a diagram illustrating this procedure. In the following Methods sections, we first introduce the benchmark datasets from the Ion Proton and Illumina MiSeq platforms. We then describe the application of generalized linear models to the PSEM. Finally, we describe the performance evaluation metrics.
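
The train-then-call workflow can be illustrated with a minimal sketch. Everything specific in it is an assumption made for illustration, not the procedure used in this study: a Poisson family with a log link and a depth offset stands in for one possible distribution choice, a single hypothetical context feature (homopolymer length) stands in for the full feature set, candidate calling is reduced to a Poisson upper-tail test on the tumor alternate-allele count, and all counts are invented.

```python
import math

def fit_poisson_glm(x, depth, errors, n_iter=25):
    """Fit log(E[errors]) = log(depth) + b0 + b1 * x by Newton's method.

    Assumes at least one non-zero error count and a non-constant feature x.
    """
    b0 = math.log(sum(errors) / sum(depth))  # start at the pooled error rate
    b1 = 0.0
    for _ in range(n_iter):
        g0 = g1 = h00 = h01 = h11 = 0.0
        for xi, di, yi in zip(x, depth, errors):
            mu = di * math.exp(b0 + b1 * xi)   # expected error count
            g0 += yi - mu                      # score for b0
            g1 += (yi - mu) * xi               # score for b1
            h00 += mu                          # information matrix entries
            h01 += mu * xi
            h11 += mu * xi * xi
        det = h00 * h11 - h01 * h01
        # Newton update: beta += I^{-1} * score (2x2 inverse written out)
        b0 += (h11 * g0 - h01 * g1) / det
        b1 += (h00 * g1 - h01 * g0) / det
    return b0, b1

def poisson_sf(k, mu):
    """Upper tail P(X >= k) for X ~ Poisson(mu), summed term by term."""
    if k <= 0:
        return 1.0
    term = math.exp(-mu + k * math.log(mu) - math.lgamma(k + 1))
    total = 0.0
    i = k
    while term > total * 1e-16:
        total += term
        i += 1
        term *= mu / i
    return min(1.0, total)

def expected_errors(b0, b1, x, depth):
    """Model-predicted number of error reads at a locus."""
    return depth * math.exp(b0 + b1 * x)

# Training step: invariant loci, where every non-reference read is an error.
# Hypothetical values: feature x, read depth, observed error reads.
train_x = [1, 2, 3, 4, 5]
train_depth = [1000, 1000, 1000, 1000, 1000]
train_errors = [2, 3, 5, 8, 13]
b0, b1 = fit_poisson_glm(train_x, train_depth, train_errors)

# Testing step: a tumor locus is a candidate SNV when its alternate-allele
# count is unlikely under the fitted error model.
mu_test = expected_errors(b0, b1, x=3, depth=1000)
p_low = poisson_sf(6, mu_test)    # alt count close to the expected error
p_high = poisson_sf(30, mu_test)  # alt count far above the expected error
```

In this sketch a small `p_high` would flag the locus as a candidate, while `p_low` would not. A fuller treatment would also test the matched normal sample at the same locus and correct for multiple testing, neither of which is shown here.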