One possible cause of unstable results in genetic association studies is genetic heterogeneity, in which different variants may account for disease status or trait level in different patients. To address this problem, several investigators have hypothesized that results from testing gene sets, rather than individual markers, would be more stable across different samples from the population and thus easier to replicate [31,32,51,70]. More studies are needed to evaluate this hypothesis, although it has already been validated in gene expression studies [71]. Note that replication and stability assessments are most meaningful when the type I error rate of a method is preserved: applying a method that yields severely downward-biased P-values to two datasets would not constitute a valid replication [72].
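
To illustrate why preserved type I error matters for replication, the following minimal sketch estimates how often a null gene set would appear "replicated" in two independent datasets. It is an illustration under stated assumptions, not a method from the cited studies: the bias model (null P-values generated as u^k with u uniform on [0, 1], so that k = 1 is a valid test and k > 1 produces downward-biased P-values), the function names, and the threshold of 0.05 are all hypothetical choices made for this example.

```python
import random


def false_replication_rate(alpha, k, n_sim=200_000, seed=0):
    """Monte Carlo estimate of P(significant in both datasets) under the null.

    Null P-values are drawn as u**k with u ~ Uniform(0, 1). This is an assumed,
    purely illustrative bias model: k = 1 gives valid P-values, k > 1 shifts
    them toward zero (downward bias).
    """
    rng = random.Random(seed)
    hits = 0
    for _ in range(n_sim):
        p_first = rng.random() ** k   # null P-value in the discovery dataset
        p_second = rng.random() ** k  # null P-value in the replication dataset
        if p_first < alpha and p_second < alpha:
            hits += 1
    return hits / n_sim


def exact_false_replication_rate(alpha, k):
    """Closed form under the same assumed model.

    P(u**k < alpha) = alpha**(1/k) for one dataset; independence of the two
    datasets squares this probability.
    """
    return alpha ** (2.0 / k)


valid_rate = false_replication_rate(0.05, k=1)
biased_rate = false_replication_rate(0.05, k=2)
print(f"valid test:  simulated {valid_rate:.4f}, exact {exact_false_replication_rate(0.05, 1):.4f}")
print(f"biased test: simulated {biased_rate:.4f}, exact {exact_false_replication_rate(0.05, 2):.4f}")
```

Under this assumed model, a valid test falsely replicates a null gene set with probability 0.05^2 = 0.0025, whereas the biased test with k = 2 does so with probability 0.05, a 20-fold increase. Apparent agreement between two datasets therefore carries little evidential weight when the method itself is anti-conservative.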