Chunk #21 — Results and discussion — Motif bias

Source
Characterizing and measuring bias in sequence data.
Embedded
yes

Text

The 'special' motifs (AT)15 and G|C ≥ 80% are included because anecdotal evidence suggests that contig breaks in assemblies are frequently associated with them. The extent of each motif in the reference genomes studied in this paper is presented in Table 1. The decision to focus on regions of 100 to 200 bases was an empirical one, influenced by considerations such as the distribution of fragment sizes in our Illumina libraries. Computing our statistics over larger or smaller regions might reveal different biases, depending on the properties of the data set being assayed.
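
As an illustration only, the following Python sketch shows one way such motif regions could be located in a reference sequence. It is not the authors' implementation. It assumes that (AT)15 denotes a run of at least 15 consecutive AT dinucleotides and that G|C ≥ 80% denotes a fixed-size window in which at least 80% of the bases are G or C. The function names, the default 100-base window (taken from the lower end of the 100 to 200 base range above), and the half-open coordinate convention are all hypothetical choices made for this example.

```python
import re


def at_repeat_regions(seq, min_repeats=15):
    """Return half-open (start, end) intervals of runs of >= min_repeats 'AT' units.

    Assumption: (AT)15 is read as 15 or more consecutive AT dinucleotides.
    """
    pattern = re.compile(r"(?:AT){%d,}" % min_repeats)
    return [(m.start(), m.end()) for m in pattern.finditer(seq.upper())]


def high_gc_windows(seq, window=100, threshold=0.80):
    """Return start positions of windows whose G+C fraction is >= threshold.

    Assumption: G|C >= 80% is evaluated over fixed-size sliding windows.
    """
    seq = seq.upper()
    if len(seq) < window:
        return []
    # G+C count of the first window, then updated incrementally as it slides.
    gc = sum(1 for base in seq[:window] if base in "GC")
    starts = []
    for start in range(len(seq) - window + 1):
        if start > 0:
            gc -= seq[start - 1] in "GC"           # base leaving the window
            gc += seq[start + window - 1] in "GC"  # base entering the window
        if gc >= threshold * window:
            starts.append(start)
    return starts


if __name__ == "__main__":
    toy = "C" * 20 + "AT" * 15 + "G" * 20
    print(at_repeat_regions(toy))
    print(high_gc_windows("GC" * 5 + "AT" * 5, window=10))
```

Changing `window` in this sketch is a simple way to explore the point made above: the regions flagged at 100 bases can differ from those flagged at larger or smaller sizes.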