Chunk #22 — Experimental design — Setting tag family size by PCR amplification

Source: Detecting ultralow-frequency mutations by Duplex Sequencing.
Embedded: yes

Text

The sequencing library is PCR-amplified to create multiple copies of each strand of a double-stranded DNA molecule. The number of DNA fragments used in the PCR and the fraction of a sequencing lane dedicated to a particular sample are the primary adjustable variables that dictate the number of sequencing reads sharing the same tag sequence (i.e., the tag family size), which strongly influences the final number of DCSs formed. If too few reads share the same tag sequence (i.e., small tag family size), a consensus sequence cannot be calculated; conversely, too many reads with the same tag sequence (i.e., large tag family size) waste sequencing capacity without appreciably improving data yield. Because the number of reads varies between tag families and follows a distribution (Fig. 4), we use ‘peak family size’ as our preferred metric for the tag family sizes generated under a given set of conditions. This distribution arises during PCR amplification and results from the different amplification efficiencies of the DNA molecules present in the library. By