Similarly, because GWA studies necessarily involve a very large number of statistical tests, stringent significance thresholds are required, which may hinder the replication of results. At a nominal p-value threshold of 0.05, a GWA study examining 500,000 SNPs would be expected to yield about 25,000 false positives if none of the SNPs were truly associated with the phenotype. For this reason, a genome-wide significance threshold of p < 1 × 10⁻⁷ is typically used; however, very large sample sizes are then required to achieve sufficient statistical power. Thus, to address the potentially high number of false positive results, it is particularly vital to replicate early results in independent samples, and replication is now often included in the same GWA study as part of a multi-stage design. However, replication of initial findings, with similar magnitude and direction of effect in the same or a similar phenotype and population, is often not observed, which complicates interpretation of results (11). It is notable, however, that some associations can be quite robust and replicable, such as the association of the CHRNA5-A3-B4 gene cluster with smoking phenotypes (12).
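The arithmetic behind these figures can be sketched in a few lines of Python. This is only an illustration, not a method taken from the studies cited: the function names are hypothetical, the calculation assumes that every SNP is null and that the tests are independent, and it treats the genome-wide threshold as a Bonferroni-style correction of a 0.05 family-wise error rate. Because neighbouring SNPs are correlated through linkage disequilibrium, the independence assumption makes the correction conservative.

```python
import math


def expected_false_positives(n_tests: int, alpha: float) -> float:
    """Expected number of false positives if all n_tests null hypotheses are true."""
    return n_tests * alpha


def bonferroni_threshold(n_tests: int, family_alpha: float = 0.05) -> float:
    """Per-test threshold that keeps the family-wise error rate at or below family_alpha."""
    return family_alpha / n_tests


def family_wise_error_rate(n_tests: int, alpha: float) -> float:
    """Probability of at least one false positive, ASSUMING independent tests."""
    # 1 - (1 - alpha)^n, computed with log1p/expm1 to stay accurate for tiny alpha.
    return -math.expm1(n_tests * math.log1p(-alpha))


n_snps = 500_000  # number of SNPs in the example from the text

# Nominal threshold: roughly 25,000 false positives expected.
print(expected_false_positives(n_snps, 0.05))

# Bonferroni-style per-SNP threshold: roughly 1e-7.
per_snp_alpha = bonferroni_threshold(n_snps)
print(per_snp_alpha)

# At that threshold the chance of any false positive is just under 0.05.
print(family_wise_error_rate(n_snps, per_snp_alpha))
```

Under these assumptions the per-SNP threshold is five orders of magnitude stricter than the nominal 0.05, which is why small effects can only be detected with very large samples.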