Chunk #53 — Materials and Methods — Algorithmic details for gene-score calculations — Max-score

Source: Fast and Rigorous Computation of Gene and Pathway Scores from SNP-Based Summary Statistics.
Embedded: yes

Text

The algorithm first tries to use Monte Carlo simulation to derive p-values. Should the p-value be too small to be estimated within a few Monte Carlo draws, the procedure makes use of an algorithm for rectangular multivariate normal integration[41]. The implementation of the integration algorithm that is used is suitable to estimate p-values larger than 10−15. In addition, this implementation is limited to correlation matrices of size below 1000 due to numerical stability concerns. Therefore, SNPs that are in very high LD (r2 > 0.98) are pruned to lower the size of the correlation matrix. If more than 1000 SNPs fall into the gene or the gene-wise p-value is below 10−15, we approximate the gene score by multiplying the minimal SNP-wise p-value in the gene region by the effective number of tests. The effective number of tests is calculated as the minimum number of principal components needed to explain 99.5% of total variance[46].