
Chunk #28 — ONLINE METHODS — Regression Weights

Source
LD Score regression distinguishes confounding from polygenicity in genome-wide association studies.
Embedded
yes

Text

The statistically optimal solution to the correlation problem is to perform generalized least squares (GLS) with the variance-covariance matrix of the $\chi^2$ statistics. However, this matrix is intractable under our model. As an approximation, we correct for correlation by weighting variant $j$ by the reciprocal of its LD Score, counting LD only with other SNPs included in the regression. Precisely, if $S$ denotes the set of variants included in the LD Score regression, then the LD Score of variant $j$ counting LD only with other SNPs included in the regression is

$$\ell_j(S) := 1 + \sum_{k \in S} r_{jk}^2.$$

Weighting by $1/\ell_j(S)$ would be equivalent to GLS with the full variance-covariance matrix of the $\chi^2$ statistics if the genome consisted of LD blocks and $r^2$ (in the population) were either zero or one. We estimate $\ell_j(S)$ for the set of variants $S$ described in the section Application to Real Data, using the same procedure we used to estimate the full 1000 Genomes LD Score. Since our estimates $\hat{\ell}_j(S)$ can be negative and regression weights must be positive, we weight by $1/\max(\hat{\ell}_j(S), 1)$.
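
The weighting rule above can be illustrated with a minimal NumPy sketch. This is not the authors' code: the function names `correlation_weights` and `weighted_least_squares`, the toy LD Score values, and the toy regression data are all hypothetical, and the sketch covers only the correlation weights described in this section, assuming the estimated within-set LD Scores are already available as an array.

```python
import numpy as np


def correlation_weights(ld_scores_in_set):
    """Return 1 / max(l_hat_j(S), 1) for each variant j.

    `ld_scores_in_set` holds estimated LD Scores that count LD only with
    variants in the regression set S (hypothetical input). Clamping at 1
    keeps every weight positive even when an estimate is below 1 or negative.
    """
    l_hat = np.asarray(ld_scores_in_set, dtype=float)
    return 1.0 / np.maximum(l_hat, 1.0)


def weighted_least_squares(x, y, w):
    """Fit y ~ intercept + slope * x by weighted least squares.

    Scaling rows by sqrt(w) turns the weighted problem into an ordinary
    least-squares problem that np.linalg.lstsq can solve.
    """
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    sqrt_w = np.sqrt(np.asarray(w, dtype=float))
    design = np.column_stack([np.ones_like(x), x])
    coef, *_ = np.linalg.lstsq(design * sqrt_w[:, None], y * sqrt_w, rcond=None)
    return coef  # array([intercept, slope])


# Toy values (illustrative only): two estimates fall below 1, one is negative.
l_hat_S = np.array([-0.3, 0.8, 1.0, 2.0, 4.0])
weights = correlation_weights(l_hat_S)

# Toy regression: a noiseless line, so any positive weights recover it exactly.
x_toy = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
y_toy = 1.0 + 2.0 * x_toy
intercept, slope = weighted_least_squares(x_toy, y_toy, weights)
```

Variants with an estimated LD Score at or below 1 all receive weight 1, while variants in regions of higher LD are down-weighted in proportion to their LD Score.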