Instead of constructing statistics from the frequencies of individual or collapsed variants, one can construct statistics that reflect the similarity of the DNA sequences carried by different individuals. Such statistics have their roots in the assessment of cross-species orthology, protein family determination, phylogeny construction and other molecular genetic analyses based on DNA sequence similarity, and they are largely agnostic to the frequencies of the variants being considered [20, 51]. The main motivation for similarity-based approaches to assessing rare variant associations is that the nucleotide background, or context, within which a rare variant influences a phenotype may be important. Such approaches therefore assume some form of interaction among variants, or at least that gene function is shaped by the overall balance of variants an individual possesses.
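
To make the idea of a sequence-similarity summary concrete, the following is a minimal sketch rather than any specific published statistic. It assumes each individual is represented by one pre-aligned sequence of equal length, and it uses the simple fraction of matching positions as the similarity measure. The function names and the toy sequences are hypothetical and chosen only for illustration. Note that the measure depends only on which positions match, not on how common each variant is in the sample.

```python
from itertools import combinations


def sequence_similarity(seq_a, seq_b):
    """Fraction of aligned positions at which two sequences carry the same base."""
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    if not seq_a:
        raise ValueError("sequences must be non-empty")
    matches = sum(a == b for a, b in zip(seq_a, seq_b))
    return matches / len(seq_a)


def similarity_matrix(sequences):
    """Symmetric matrix of pairwise similarities between individuals."""
    n = len(sequences)
    # Every sequence is identical to itself, so the diagonal is 1.0.
    sim = [[1.0] * n for _ in range(n)]
    for i, j in combinations(range(n), 2):
        s = sequence_similarity(sequences[i], sequences[j])
        sim[i][j] = sim[j][i] = s
    return sim


# Hypothetical toy data: three individuals, one aligned sequence each.
sequences = ["ACGTACGT", "ACGTACGA", "ACCTACGA"]
sim = similarity_matrix(sequences)
for row in sim:
    print(row)
```

A matrix of this kind could then be related to phenotypic similarity between the same pairs of individuals. How that comparison is carried out, and which similarity measure is appropriate, depends on the particular method and is not specified here.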