sample-wide genotype call rates of less than 95% can also be eliminated. A high frequency of missing data is generally interpreted as an indicator of a poor-quality marker. If a dataset contains both cases and controls, differences in missing rates between cases and controls can also be used as a criterion for marker elimination. Finally, although this step has become less common with increasing interest in rare variants, SNPs with a minor allele frequency below a chosen threshold are eliminated in the interest of power. At the per-subject level, individuals for whom more than 5% of the markers assessed fail to be genotyped (i.e., cannot be called or are otherwise missing) are also eliminated, as this is likely due to poor DNA quality. In addition, subjects whose genetic sex is inconsistent with their reported sex (e.g., heterozygous X chromosome markers in a subject purported to be male) are eliminated. Lastly, the availability of large-scale genetic data allows for the estimation of identity-by-state between individuals in a dataset. Identity-by-state can be used as a proxy for identity-by-descent (the degree of allele sharing due to common ancestry), a common measure of relatedness. Obviously, in samples where families