The International HapMap Project was initiated to characterise LD across the human genome. In Phase I of the study, ~1 million common SNPs were genotyped in four sample sets: 90 Yoruba in Ibadan, Nigeria (YRI); 45 Japanese in Tokyo, Japan (JPT); 45 Han Chinese in Beijing, China (CHB); and 60 CEPH individuals (Utah residents with Northern and Western European ancestry; CEU) 7. The study confirmed that alleles of neighbouring SNPs are often highly correlated within a population of unrelated individuals, and that specific ‘tagSNPs’ 8 can be selected to serve as proxies for other SNPs in high LD with them, substantially reducing the number of SNPs that need to be genotyped to recover most of the information about common variation. The HapMap Project provided a map of LD structure across the genome, together with general marker statistics such as location and allele frequency, which allows investigators to select the tagSNPs that best capture the common genetic variation in a particular region or across the genome. HapMap Phase II increased the number of SNPs genotyped in the four