A commonly used measure of heterogeneity is Cochran’s Q, which tests whether there is statistically significant heterogeneity. It is calculated as the weighted sum of squared differences between the individual study effects and the summary effect across studies, with each study typically weighted by the inverse of its variance. Under the null hypothesis of homogeneity, Q follows a chi-square distribution with k-1 degrees of freedom, where k is the number of studies. When the number of studies combined is small, the test has low power to detect heterogeneity even when it is present. Conversely, when the number of studies is large, the test is likely to detect significant heterogeneity even if the absolute magnitude of the variability is unimportant. As with any underpowered statistical test, a significant result does not guarantee that heterogeneity is truly present, and a non-significant result does not exclude the possibility that heterogeneity exists. Acknowledging these caveats, the traditional threshold for claiming significant heterogeneity based on Q is p<0.10. It is not appropriate to adjust this threshold for the number of SNPs tested, as is done for association statistics. With such an improper adjustment, the power to detect significant heterogeneity vanishes.
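The calculation of Q described above can be illustrated with a minimal Python sketch. It assumes inverse-variance weights and a fixed-effect summary estimate, which is the usual choice but is not specified in the text. The function names `cochran_q` and `chi2_sf` are hypothetical, and the effect sizes and standard errors in the example are made-up numbers chosen only to keep the arithmetic easy to follow.

```python
import math


def chi2_sf(x, df):
    """Upper-tail probability of a chi-square variable with integer df.

    Uses the recurrence Q(a + 1, h) = Q(a, h) + h**a * exp(-h) / Gamma(a + 1)
    for the regularized upper incomplete gamma function, with h = x / 2.
    """
    if df < 1:
        raise ValueError("df must be a positive integer")
    if x <= 0:
        return 1.0
    h = x / 2.0
    if df % 2:  # odd df: start from Q(1/2, h) = erfc(sqrt(h))
        a, sf = 0.5, math.erfc(math.sqrt(h))
    else:       # even df: start from Q(1, h) = exp(-h)
        a, sf = 1.0, math.exp(-h)
    while a < df / 2.0 - 1e-9:
        sf += math.exp(a * math.log(h) - h - math.lgamma(a + 1.0))
        a += 1.0
    return sf


def cochran_q(effects, std_errors):
    """Return (Q, degrees of freedom, p-value) for k study estimates.

    Assumption: each study is weighted by the inverse of its variance.
    """
    if len(effects) != len(std_errors) or len(effects) < 2:
        raise ValueError("need at least two studies with matching inputs")
    weights = [1.0 / se ** 2 for se in std_errors]
    # Inverse-variance weighted (fixed-effect) summary estimate.
    summary = sum(w * e for w, e in zip(weights, effects)) / sum(weights)
    # Weighted sum of squared deviations from the summary effect.
    q = sum(w * (e - summary) ** 2 for w, e in zip(weights, effects))
    df = len(effects) - 1
    return q, df, chi2_sf(q, df)


if __name__ == "__main__":
    # Illustrative inputs only (e.g. per-study log odds ratios for one SNP).
    q, df, p = cochran_q([0.10, 0.30, 0.20], [0.1, 0.1, 0.1])
    print(f"Q = {q:.3f}, df = {df}, p = {p:.3f}")
    # Unadjusted threshold of p < 0.10, as discussed in the text.
    print("significant heterogeneity" if p < 0.10 else "no significant heterogeneity")
```

With three equally precise studies the summary effect is 0.20, so Q = 100 × (0.01 + 0.01 + 0) = 2.0 on 2 degrees of freedom, giving p ≈ 0.37, which is above the 0.10 threshold. The p-value helper is written out by hand only to keep the sketch dependency-free; where SciPy is available, `scipy.stats.chi2.sf(q, df)` computes the same tail probability.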