Chunk #0 — Quality Control

Source
Critical Issues in the Inclusion of Genetic and Epigenetic Information in Prevention and Intervention Trials.
Embedded
yes

Text

Maximizing genotype accuracy is a key step in increasing the power to detect true genotype-phenotype relationships. Marker-level and subject-level checks are performed to ensure data precision. These steps are described more thoroughly elsewhere (Anderson et al., 2010) but are reviewed here briefly. First, at the per-marker level, Hardy-Weinberg equilibrium (HWE) is tested, and markers that exhibit large deviations from the expected distribution of genotypes, given the observed allele frequencies, are removed. The rationale for this step is to eliminate markers that show evidence of systematic genotyping error (e.g., excess failure of a particular allele). It is important to recognize that minor deviations from HWE will be observed by chance when testing such a large number of markers. Thus, it is typical to use a stringent criterion for dropping markers exhibiting Hardy-Weinberg disequilibrium (p < 0.001). In addition, markers with sample-wide genotype call rates of less than 95% can also be eliminated. A high frequency of missing data is generally interpreted as an indicator of a poor-quality marker. If cases and controls are present in a dataset then differences
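
The two per-marker filters described above (an HWE test with a p < 0.001 cutoff and a 95% call-rate threshold) can be illustrated with a minimal Python sketch. This is not the authors' pipeline. It assumes a biallelic marker summarized as genotype counts and uses a simple 1-degree-of-freedom chi-square goodness-of-fit test, whereas dedicated genetics software often uses an exact test instead. The function names (`hwe_chi2_pvalue`, `call_rate`, `passes_marker_qc`) and parameter names are hypothetical and chosen only for illustration.

```python
import math


def hwe_chi2_pvalue(n_aa, n_ab, n_bb):
    """P-value of a 1-df chi-square test for Hardy-Weinberg equilibrium.

    Inputs are the observed counts of the three genotypes (AA, AB, BB)
    among samples with a called genotype at this marker.
    """
    n = n_aa + n_ab + n_bb
    if n == 0:
        return 1.0  # nothing to test
    p = (2 * n_aa + n_ab) / (2 * n)  # observed frequency of allele A
    q = 1.0 - p
    if p == 0.0 or q == 0.0:
        return 1.0  # monomorphic marker: no deviation can be tested
    # Genotype counts expected under HWE given the observed allele frequencies
    expected = (n * p * p, 2 * n * p * q, n * q * q)
    observed = (n_aa, n_ab, n_bb)
    chi2 = sum((o - e) ** 2 / e for o, e in zip(observed, expected))
    # Survival function of a chi-square variable with 1 degree of freedom
    return math.erfc(math.sqrt(chi2 / 2.0))


def call_rate(n_called, n_samples):
    """Fraction of samples with a non-missing genotype at this marker."""
    return n_called / n_samples if n_samples else 0.0


def passes_marker_qc(n_aa, n_ab, n_bb, n_missing,
                     hwe_alpha=0.001, min_call_rate=0.95):
    """Apply the call-rate and HWE filters with the thresholds from the text."""
    n_called = n_aa + n_ab + n_bb
    n_samples = n_called + n_missing
    if call_rate(n_called, n_samples) < min_call_rate:
        return False  # too much missing data: treated as a poor-quality marker
    return hwe_chi2_pvalue(n_aa, n_ab, n_bb) >= hwe_alpha
```

For example, under these assumptions `passes_marker_qc(250, 500, 250, n_missing=0)` keeps a marker whose genotype counts match HWE exactly, while `passes_marker_qc(500, 0, 500, n_missing=0)` drops a marker with no observed heterozygotes. The defaults mirror the thresholds quoted in the text and would normally be adjusted per study.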