Sequencing libraries were quantified using quantitative PCR (Kapa Biosystems, Woburn, MA, USA), normalized to 2 nM, and denatured using 0.1 N NaOH prior to sequencing. Flowcell cluster amplification and sequencing were performed according to the manufacturer's protocols (Illumina, CA, USA) using HiSeq 2000 v2 (data sets 10 to 12 and A2), HiSeq 2000 v3 (data sets 13, 14, A1, and A3), HiSeq 2500 v1 (data sets A10 and A11), MiSeq v1 (data sets 1, 4, and 7), or MiSeq v2 (data sets A4 to A9) cluster chemistry and flowcells. HiSeq data were analyzed using Illumina RTA v1.10.15 or v1.12.4.2. MiSeq data were analyzed using RTA v1.13 or v1.14.23. Read lengths were 2 × 251 bases for MiSeq data sets 1, 4, and 7; 2 × 150 bases for MiSeq data sets A4 to A9; 2 × 101 bases for HiSeq data sets 10 to 14 and A1 to A3; and 2 × 250 bases for HiSeq 2500 data sets A10 and A11. Data were further processed using the Picard data-processing pipeline [42] to generate BAM files. Alignment was performed