the right justification, alternative coding schemes can be introduced. For example, if the literature on a specific SNP strongly suggests that the mere presence of a particular allele confers risk, the genotype may be coded to reflect the dominant influence of that allele (i.e., absence versus presence, irrespective of the number of copies). By contrast, researchers interested in determining the extent to which an allele operates in a non-additive manner could pair the additive term described above with a term differentiating heterozygous from homozygous genotypes, though this model would require an additional degree of freedom. What's most important to remember is that these decisions should be based on prior research findings and biologically based theory, and should be made in consultation with individuals who have formal training in genetics. Moreover, regardless of how the data are coded, the degree of association between the outcome of interest and a set of genetic variants should roughly map onto the pattern of LD across those variants. That is to say, if you are testing a group of highly correlated SNPs, you would expect a similar association pattern across them. In contrast, if SNPs are located in different LD blocks (i.e., have a low correlation), it would not