the biology of the disease based on biochemical and physiological studies. For cancer, for example, the most obvious candidates are genes that are somatically mutated, or whose expression is epigenetically altered, in a significant proportion of cancers. Case groups should be chosen to be enriched for rare variants. Generally these will include cases who have one or more affected close relatives but are not clearly familial and, especially for cancer, cases with an early age of onset. Control populations should ideally consist of individuals known to be free of the disease. Selecting large numbers of controls of known provenance will help to minimize population stratification effects.