Specific implementation issues can be ruled out as the cause of the power difference since the PLINK and VEGAS analyses yield results highly similar to their matched MAGMA models (S9 Fig), and using the pruning option in PLINK also has little effect on the overall results. This means that the difference must be due to the difference in the methods and test-statistics themselves. Comparing the MAGMA implementations of these models in Fig 1, the mean χ 2 and top χ 2 approaches are shown to produce very similar p-values. Moreover, the plots reveal that the superior power of the MAGMA-main model does not arise from consistently lower gene p-values, but rather from a small set of genes with low p-values for MAGMA-main that are simply not picked up by the other approaches. This is likely to be related to the way LD between SNPs is handled, as that is one of the key differences between the multiple regression model of MAGMA-main and all the others. A post-hoc power simulation indeed indicates that multi-marker effects with weak marginals are the most probable explanation (see ‘Supplemental Methods—Simulation Studies’).