Users of our method still have to make two choices: how to convert SNP p-values into gene scores (max or sum gene scores), and how to transform gene scores into pathway scores (empirical or chi-squared). We see no evidence that one gene scoring method systematically outperforms the other when combined with our chi-squared pathway scoring method, whereas the sum gene score appears to perform better with the empirical approach (S8 Fig). To investigate this difference, we winsorized the p-values (i.e. p-values below 10⁻¹² were set to 10⁻¹²) and found that the max gene score combined with empirical sampling then suffered far less performance loss (S9 Fig). We therefore conclude that the power loss is due to outlier gene scores. In high-powered studies, the max gene score can become very large. In the extreme case, a single gene may reach a score so high that, under the empirical sampling strategy, it precludes detection of pathways that do not contain that gene.
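The winsorization step described above can be illustrated with a minimal sketch. The function name `winsorize_pvalues`, the example p-values, and the use of -log10 of the smallest p-value as a stand-in for an extreme gene-level signal are assumptions made for illustration only; they are not taken from our software and do not reproduce the actual max or sum gene score computation.

```python
import numpy as np

def winsorize_pvalues(pvalues, floor=1e-12):
    """Set every p-value below `floor` to `floor`; leave the others unchanged."""
    p = np.asarray(pvalues, dtype=float)
    return np.maximum(p, floor)

# Hypothetical SNP p-values for one gene in a high-powered study.
snp_p = np.array([3e-40, 2e-15, 1e-12, 4e-8, 0.03, 0.7])
capped = winsorize_pvalues(snp_p)

# Illustrative proxy only: strength of the single most significant SNP,
# before and after capping. The outlier no longer dominates after capping.
strongest_before = -np.log10(snp_p.min())
strongest_after = -np.log10(capped.min())
print(strongest_before, strongest_after)
```

Capping affects only the two p-values below the threshold, so any score driven by the single most significant SNP is bounded, while p-values at or above 10⁻¹² are left untouched.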