Chunk #17 — Materials and Methods — Gene-set analysis

Source
MAGMA: generalized gene-set analysis of GWAS data.

Text

One complication that arises in this gene-level regression framework is that the standard linear regression model assumes that the error terms have independent normal distributions, i.e. $\varepsilon \sim \mathrm{MVN}(\vec{0}, \sigma^2 I)$. However, due to LD, neighbouring genes will generally be correlated, violating this assumption. This issue can be addressed by using a Generalized Least Squares approach instead, assuming that $\varepsilon \sim \mathrm{MVN}(\vec{0}, \sigma^2 R)$. In MAGMA, the required gene-gene correlation matrix $R$ is approximated by the correlations between the model sum of squares (SSM) of each pair of genes from the gene analysis multiple regression model, under their joint null hypothesis of no association. These correlations are a function of the correlations between the SNPs in each pair of genes and thus provide a good reflection of the LD; because they have a convenient closed-form solution, they are also easy to compute (see also ‘Supplemental Methods: Implementation Details’). Note that for the self-contained analysis, the submatrix $R_s$ corresponding to only the genes in the gene set is used instead of $R$. In addition, since the self-contained null hypothesis guarantees that all $z_g$ have a standard normal distribution, the error variance $\sigma^2$ can be set to 1.
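
The Generalized Least Squares step described above can be illustrated with a minimal NumPy sketch. This is not MAGMA's implementation: the function name `gls_fit`, the toy correlation matrix `R`, the gene Z-scores `z`, and the set membership `in_set` are all made up for illustration, and `R` is simply given rather than derived from SSM correlations. The sketch also assumes that the self-contained analysis can be written as an intercept-only model on the genes in the set, with the error variance fixed to 1.

```python
import numpy as np

def gls_fit(X, z, R, sigma2=None):
    """GLS fit of z = X b + e, assuming e ~ MVN(0, sigma2 * R).

    Hypothetical helper for illustration only. If sigma2 is None it is
    estimated from the GLS residuals; otherwise the given value is used.
    """
    X = np.asarray(X, dtype=float)
    z = np.asarray(z, dtype=float)
    R = np.asarray(R, dtype=float)
    # Apply R^{-1} through linear solves instead of forming the inverse.
    Rinv_X = np.linalg.solve(R, X)
    Rinv_z = np.linalg.solve(R, z)
    XtRinvX = X.T @ Rinv_X
    # b = (X' R^{-1} X)^{-1} X' R^{-1} z
    beta = np.linalg.solve(XtRinvX, X.T @ Rinv_z)
    if sigma2 is None:
        resid = z - X @ beta
        dof = X.shape[0] - X.shape[1]
        sigma2 = float(resid @ np.linalg.solve(R, resid)) / dof
    # Cov(b) = sigma2 * (X' R^{-1} X)^{-1}
    cov_beta = sigma2 * np.linalg.inv(XtRinvX)
    return beta, cov_beta, sigma2

# Toy data (invented): 4 genes, where genes 0 and 1 are correlated through LD.
R = np.array([[1.0, 0.6, 0.0, 0.0],
              [0.6, 1.0, 0.0, 0.0],
              [0.0, 0.0, 1.0, 0.0],
              [0.0, 0.0, 0.0, 1.0]])
z = np.array([2.0, 2.0, 0.5, -0.5])             # gene-level Z-scores
in_set = np.array([True, True, True, False])    # gene-set membership

# Self-contained sketch: only the genes in the set, submatrix R_s, sigma2 = 1.
idx = np.flatnonzero(in_set)
R_s = R[np.ix_(idx, idx)]
X_s = np.ones((idx.size, 1))                    # intercept-only design (assumption)
beta_s, cov_s, _ = gls_fit(X_s, z[idx], R_s, sigma2=1.0)
stat_s = beta_s[0] / np.sqrt(cov_s[0, 0])

# Regression on all genes with full R: intercept plus set indicator,
# with sigma2 estimated from the residuals.
X_c = np.column_stack([np.ones(z.size), in_set.astype(float)])
beta_c, cov_c, sigma2_c = gls_fit(X_c, z, R)

print("self-contained statistic:", stat_s)
print("set-indicator coefficient:", beta_c[1])
```

In this toy example the two correlated genes carry less independent information than two uncorrelated genes would, so the GLS estimate down-weights them relative to an ordinary least squares fit that treats all genes as independent.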