Chunk #20 — Methodological issues — Assessing statistical significance

Source
Gene set analysis of genome-wide association studies: methodological issues and perspectives.

Text

First, a typical GWAS measures half a million or more SNPs on hundreds or even thousands of samples. Recalculating a gene set score for each permutation is extremely computationally intensive, especially for competitive tests based on markers from the entire genome. To reduce the amount of computation, several researchers have explored assessing gene set significance by resampling genes [31] or SNPs [32,51]. It has been suggested that, apart from genomic regions that exhibit long-range LD (e.g., the Major Histocompatibility Complex (MHC) region), SNPs located in different genes may have little LD [31,33]. Another recently introduced permutation scheme is restandardization, which combines sample label permutation and gene resampling [36,52]. The idea of restandardization is that, while permuting sample labels preserves the correlation structure between genes, the null distribution based on sample permutation approximates the theoretical null distribution N(0,1) [53]. However, this distribution ignores the empirical mean and standard deviation of the gene set statistic, which can be approximated more closely by resampling genes. Therefore, for each sample permutation, the mean and standard deviation from gene resampling are used to
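
As an illustration of the two resampling ideas described in this chunk, the following is a minimal sketch and not the procedure of any cited method. It rests on several assumptions that do not come from the source: the gene set score is taken to be the mean of per-gene statistics, per-gene statistics under sample label permutation are supplied precomputed as a matrix, and restandardization is read as centering and scaling each set score by the mean and standard deviation obtained from gene resampling. All function and parameter names are hypothetical.

```python
import numpy as np

def set_score(gene_stats, idx):
    """Assumed gene set score: mean per-gene statistic over the indexed genes."""
    return np.asarray(gene_stats)[..., idx].mean(axis=-1)

def random_gene_sets(n_genes, set_size, n_resamples, rng):
    """Draw random gene sets of the tested set's size (no repeats within a set)."""
    return np.stack([rng.choice(n_genes, size=set_size, replace=False)
                     for _ in range(n_resamples)])

def gene_resampling_pvalue(gene_stats, set_idx, n_resamples=1000, seed=0):
    """Gene resampling: compare the observed set score with scores of random sets."""
    rng = np.random.default_rng(seed)
    gene_stats = np.asarray(gene_stats, dtype=float)
    observed = float(set_score(gene_stats, set_idx))
    sets = random_gene_sets(gene_stats.size, len(set_idx), n_resamples, rng)
    null = set_score(gene_stats, sets)            # shape: (n_resamples,)
    # Add-one correction keeps the estimated p-value strictly above zero.
    p = (1 + np.sum(null >= observed)) / (1 + n_resamples)
    return observed, float(p)

def restandardized_pvalue(gene_stats, perm_gene_stats, set_idx,
                          n_resamples=200, seed=0):
    """Assumed form of restandardization.

    perm_gene_stats has shape (n_permutations, n_genes): per-gene statistics
    recomputed after each sample label permutation (computed elsewhere).
    Each set score is standardized by the gene-resampling mean and standard
    deviation from the same data (observed or permuted) before comparison.
    """
    rng = np.random.default_rng(seed)
    gene_stats = np.asarray(gene_stats, dtype=float)
    perm_gene_stats = np.asarray(perm_gene_stats, dtype=float)
    # The same random gene sets are reused for every permutation.
    sets = random_gene_sets(gene_stats.size, len(set_idx), n_resamples, rng)

    def standardize(stats):
        resampled = set_score(stats, sets)        # (..., n_resamples)
        centre = resampled.mean(axis=-1)
        spread = resampled.std(axis=-1, ddof=1)
        return (set_score(stats, set_idx) - centre) / spread

    z_obs = float(standardize(gene_stats))
    z_perm = standardize(perm_gene_stats)         # shape: (n_permutations,)
    p = (1 + np.sum(z_perm >= z_obs)) / (1 + z_perm.size)
    return z_obs, float(p)
```

In this sketch the gene-resampling step only reindexes statistics that have already been computed, which is why it is cheap compared with recomputing genome-wide statistics for every sample permutation. The per-permutation statistics still have to be produced by whatever association test the analysis uses.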