Chunk #1 — INTRODUCTION

Source: LD Score regression distinguishes confounding from polygenicity in genome-wide association studies.
Embedded: yes

Text

Under a polygenic model in which effect sizes are drawn independently from distributions with variance proportional to $(p(1-p))^{-1/2}$, where $p$ is the minor allele frequency (MAF), the expected $\chi^2$ statistic of variant $j$ is

$$E[\chi^2 \mid \ell_j] = \frac{N h^2 \ell_j}{M} + N a + 1, \qquad (1)$$

where $N$ is the sample size; $M$ is the number of SNPs, such that $h^2/M$ is the average heritability explained per SNP; $a$ measures the contribution of confounding biases, such as cryptic relatedness and population stratification; and $\ell_j := \sum_k r_{jk}^2$ is the LD Score of variant $j$, which measures the amount of genetic variation tagged by $j$ (a full derivation of this equation is provided in the Supplementary Note). This relationship holds for meta-analyses and also for ascertained studies of binary phenotypes, in which case $h^2$ is on the observed scale. Consequently, if we regress $\chi^2$ statistics from GWAS against LD Score (LD Score regression), the intercept minus one is an estimator of the mean contribution of confounding bias to the inflation in the test statistics.
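The relationship in equation (1) can be illustrated with a small simulation. The following is a minimal sketch, not the authors' implementation. It assumes plain unweighted least squares via `numpy.polyfit`, a made-up gamma distribution for the LD Scores, and test statistics drawn as the expected value times a 1-degree-of-freedom chi-square variate. The function names and all parameter values are hypothetical and chosen only for illustration.

```python
import numpy as np


def expected_chi2(ld_score, n, h2, m, a):
    """Equation (1): E[chi2 | l_j] = N * h2 * l_j / M + N * a + 1."""
    return n * h2 * ld_score / m + n * a + 1.0


def ld_score_regression(chi2, ld_score, n, m):
    """Regress chi2 on LD Score with unweighted OLS (a simplification).

    Returns (h2_hat, confounding_hat), where confounding_hat is the
    intercept minus one, i.e. an estimate of N * a.
    """
    slope, intercept = np.polyfit(ld_score, chi2, deg=1)
    h2_hat = slope * m / n            # slope equals N * h2 / M
    confounding_hat = intercept - 1.0  # intercept equals N * a + 1
    return h2_hat, confounding_hat


# Hypothetical parameters, chosen only for this illustration.
rng = np.random.default_rng(0)
n, m, h2, a = 100_000, 1_000_000, 0.5, 1e-6

# Made-up LD Score distribution (assumption, not derived from real data).
ld_score = 1.0 + rng.gamma(shape=2.0, scale=50.0, size=m)

# Simplified noise model: a scaled 1-df chi-square with the mean from equation (1).
chi2 = expected_chi2(ld_score, n, h2, m, a) * rng.chisquare(df=1, size=m)

h2_hat, confounding_hat = ld_score_regression(chi2, ld_score, n, m)
print(f"h2 estimate: {h2_hat:.3f} (simulated value {h2})")
print(f"intercept - 1: {confounding_hat:.3f} (simulated N*a = {n * a:.3f})")
```

In this sketch the slope carries the polygenic signal ($N h^2 / M$) and the intercept carries the confounding term ($N a + 1$), which is why the two sources of inflation can be separated by a single regression.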