information on Gene ID, gene names, and their start and end positions on a chromosome. To identify SNPs located within genes, we mapped all SNPs to the genes, as defined by their start and end positions, using database techniques. The resulting output file lists the SNPs on chromosomes 1 to 22 together with the genes in which they lie. From the chromosome reports data, only reference sequence entries were used; entries for the ‘Celera’ sequence were ignored. Likewise, from the seq_gene.md file, only reference sequence entries for genes with Taxonomic ID 9606 were used. Entries for the ‘Celera’ sequence and entries of gene types such as ‘PSEUDO’, ‘CDS’, ‘RNA’ and ‘UTR’ were also excluded from this file.
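The mapping of SNPs to genes by start and end position can be illustrated with a range join. The following is only a minimal sketch, not the pipeline actually used: the text does not name the database system, so SQLite is assumed here, and the table names, column names, and the function `map_snps_to_genes` are hypothetical. It also assumes that gene boundaries are inclusive and that the filtering described above (reference sequence only, Taxonomic ID 9606, excluded gene types) has already been applied to the input rows.

```python
import sqlite3


def map_snps_to_genes(snps, genes):
    """Return sorted (snp_id, gene_id) pairs for SNPs inside a gene's span.

    snps:  iterable of (snp_id, chrom, pos)
    genes: iterable of (gene_id, chrom, start_pos, end_pos)
    Table and column names are illustrative assumptions.
    """
    con = sqlite3.connect(":memory:")
    con.execute("CREATE TABLE snp (snp_id TEXT, chrom TEXT, pos INTEGER)")
    con.execute(
        "CREATE TABLE gene (gene_id TEXT, chrom TEXT, "
        "start_pos INTEGER, end_pos INTEGER)"
    )
    con.executemany("INSERT INTO snp VALUES (?, ?, ?)", snps)
    con.executemany("INSERT INTO gene VALUES (?, ?, ?, ?)", genes)

    # Range join: same chromosome, SNP position within [start_pos, end_pos].
    # Inclusive boundaries are an assumption; the text does not specify them.
    rows = con.execute(
        """
        SELECT s.snp_id, g.gene_id
        FROM snp AS s
        JOIN gene AS g
          ON s.chrom = g.chrom
         AND s.pos BETWEEN g.start_pos AND g.end_pos
        ORDER BY s.snp_id, g.gene_id
        """
    ).fetchall()
    con.close()
    return rows


# Made-up toy records, for illustration only (not real SNPs or genes).
toy_snps = [("snpA", "1", 150), ("snpB", "1", 500), ("snpC", "2", 150)]
toy_genes = [
    ("geneX", "1", 100, 200),
    ("geneY", "2", 100, 200),
    ("geneZ", "1", 140, 160),
]
pairs = map_snps_to_genes(toy_snps, toy_genes)
```

In this sketch a SNP that falls inside overlapping genes is reported once per gene, and a SNP outside every gene span yields no row. Whether the original analysis handled overlapping genes this way is not stated in the text.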