The human genome consists of over 3 billion base pairs and was sequenced as part of the Human Genome Project, which started in 1990. The ‘final’ version of the sequence was published in 2003 and was estimated at the time to be 92% complete [1]. It showed that the human genome contains ~22,000 genes (see Box 1 for a glossary of terms), and that human individuals are identical across ~99.5% of their sequence, with the small remaining part variable to differing extents. Because this variation could play an important role in explaining differences in genetic susceptibility to disease, comparing variation between diseased (case) and healthy (control) individuals from the same population may elucidate which genetic pathways are involved in disease onset and progression [2].