If a particular sequence-context word does not occur at least a minimum number of times x (default x = 4) in the reference genome, the model is updated using the longest nucleotide sequence that does occur at least x times in the reference. For instance, if AACTG is missing from the reference but ACTG occurs five times, then the matrix entry in which T is the current position, G is the following position, and AAC are the three bases preceding the current position is updated by considering all other matrix entries in which the two bases preceding the current position are AC, the current position is T, and the following position is G.
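
The fallback described above can be illustrated with a minimal sketch. It assumes, following the AACTG to ACTG example, that a context word is shortened by dropping upstream bases one at a time from its left end, and that the shortest admissible word is the current base plus the following base. The function names `count_occurrences` and `longest_supported_context` and the parameters `min_count` and `min_length` are hypothetical and are not taken from the original implementation; counting overlapping occurrences is likewise an assumption.

```python
from typing import Optional


def count_occurrences(reference: str, word: str) -> int:
    """Count occurrences of `word` in `reference`, including overlapping ones."""
    count = 0
    start = reference.find(word)
    while start != -1:
        count += 1
        start = reference.find(word, start + 1)
    return count


def longest_supported_context(
    reference: str,
    word: str,
    min_count: int = 4,   # the threshold x from the text (default 4)
    min_length: int = 2,  # assumption: keep at least current + following base
) -> Optional[str]:
    """Return the longest suffix of `word` found at least `min_count` times.

    Upstream bases are dropped from the left, one per step, so the current
    and following positions at the right end of the word are preserved.
    Returns None if no suffix of length >= `min_length` is frequent enough.
    """
    for drop in range(len(word) - min_length + 1):
        candidate = word[drop:]
        if count_occurrences(reference, candidate) >= min_count:
            return candidate
    return None


# Toy reference: ACTG occurs five times, AACTG never occurs.
reference = "CACTGT" * 5
print(longest_supported_context(reference, "AACTG"))  # ACTG
```

In this toy case the entry for AACTG would be updated from the entries whose context ends in ACTG, matching the example in the text. How those entries are combined is not specified here and is left out of the sketch.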