Chunk #11 — Methods — Adaptive Rank Truncated Product of SNP Association

Source: SNP-based pathway enrichment analysis for genome-wide association studies.
Embedded: yes

Text

Selecting representative SNPs for genes is computationally demanding: it involves hundreds of permutations of thousands of subjects, recalculating the test statistic in each permutation for about half a million SNPs, and testing multiple values of the cutoff (i.e., truncation) point K. One way to limit the computational effort is to set an upper limit, Kupper, of 10 on the truncation point K. To further reduce the computational cost, we discard SNPs with large nominal P-values. On the other hand, if too few SNPs are selected, we might miss SNPs that have low or moderate individual effects but jointly show a moderate or large effect. To strike a balance, we set a generous nominal threshold, say 0.05; i.e., only SNPs with P-values less than or equal to 0.05 are selected. However, if none of the SNPs for a gene passes the threshold, the SNP with the smallest P-value is selected, so that the gene is not dropped from the pathway analysis. Both Kupper and the P-value threshold can be changed in our software, and other values can be used depending on the situation. In our experiments, we found Kupper = 10 and a P-value threshold of 0.05 to be useful choices.
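
The selection rule described above can be illustrated with a minimal Python sketch. It is not the authors' software: the function name `select_snps`, its parameters, and the SNP identifiers are hypothetical, and the sketch assumes that the candidate truncation points are every integer from 1 up to the smaller of Kupper and the number of selected SNPs, which the text does not state explicitly.

```python
from typing import Dict, List, Tuple


def select_snps(
    gene_pvalues: Dict[str, float],
    p_threshold: float = 0.05,  # nominal P-value threshold from the text
    k_upper: int = 10,          # upper limit Kupper on the truncation point K
) -> Tuple[List[Tuple[str, float]], List[int]]:
    """Return (selected SNPs sorted by ascending P-value, candidate K values).

    Hypothetical helper for illustration only.
    """
    if not gene_pvalues:
        return [], []

    # Rank the gene's SNPs by nominal P-value, smallest first.
    ranked = sorted(gene_pvalues.items(), key=lambda item: item[1])

    # Keep only SNPs with P-value less than or equal to the threshold.
    selected = [(snp, p) for snp, p in ranked if p <= p_threshold]

    # Fallback: if no SNP passes, keep the one with the smallest P-value
    # so the gene is not lost from the pathway analysis.
    if not selected:
        selected = [ranked[0]]

    # Assumption: K ranges over 1..min(Kupper, number of selected SNPs).
    candidate_ks = list(range(1, min(k_upper, len(selected)) + 1))
    return selected, candidate_ks


# Made-up P-values for two hypothetical genes.
gene_a = {"rs1": 0.20, "rs2": 0.03, "rs3": 0.001, "rs4": 0.05}
gene_b = {"rs5": 0.40, "rs6": 0.12}

selected_a, ks_a = select_snps(gene_a)  # three SNPs pass the 0.05 threshold
selected_b, ks_b = select_snps(gene_b)  # none pass, so the fallback applies
```

In this sketch the gene with three passing SNPs yields three candidate truncation points, while the gene with no passing SNP retains a single SNP and a single truncation point (K = 1). Lowering `p_threshold` or `k_upper` reduces the work repeated in every permutation, at the risk of discarding SNPs with modest individual effects.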