Chunk #38 — ONLINE METHODS — Clumping algorithm.

Source: Gene discovery and polygenic prediction from a genome-wide association study of educational attainment in 1.1 million individuals.
Embedded: yes

Text

First, the SNP with the smallest P value in the pooled meta-analysis results is identified as the lead SNP of the first clump. Next, all SNPs in LD with the lead SNP are also assigned to this clump. SNPs are defined to be in LD with each other if they are on the same chromosome and the squared correlation of their genotypes is r² > 0.1. To determine the second lead SNP and second clump, the first clump is removed, and the same steps are applied to the remaining SNPs. The process is repeated until no SNPs with P value below 5 × 10⁻⁸ remain. Each locus is defined by a lead SNP and the SNPs assigned to its clump. Hence, each lead SNP maps to exactly one locus, and each locus maps to exactly one lead SNP.
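
The steps above can be sketched in Python as a greedy loop. This is a minimal illustration under stated assumptions, not the authors' implementation: the names Snp, clump, and toy_r2, the toy SNP identifiers, and the pairwise r² values are all hypothetical, and the LD lookup is assumed to be supplied by the caller as a function returning the squared genotype correlation for a pair of SNPs.

```python
from typing import Callable, Dict, List, NamedTuple


class Snp(NamedTuple):
    # Hypothetical record type; field names are illustrative only.
    snp_id: str
    chrom: int
    p_value: float


def clump(
    snps: List[Snp],
    r2: Callable[[str, str], float],
    p_threshold: float = 5e-8,
    r2_threshold: float = 0.1,
) -> Dict[str, List[str]]:
    """Greedy clumping: map each lead SNP ID to the SNP IDs in its clump."""
    # Sort once by P value; filtering below preserves this order,
    # so remaining[0] is always the smallest remaining P value.
    remaining = sorted(snps, key=lambda s: s.p_value)
    clumps: Dict[str, List[str]] = {}

    # Stop when no remaining SNP has a P value below the threshold.
    while remaining and remaining[0].p_value < p_threshold:
        lead = remaining[0]
        # A SNP joins the clump if it is on the lead SNP's chromosome
        # and its r^2 with the lead SNP exceeds the LD threshold.
        members = [
            s.snp_id
            for s in remaining[1:]
            if s.chrom == lead.chrom and r2(lead.snp_id, s.snp_id) > r2_threshold
        ]
        clumps[lead.snp_id] = [lead.snp_id] + members
        # Remove the whole clump before searching for the next lead SNP.
        assigned = set(clumps[lead.snp_id])
        remaining = [s for s in remaining if s.snp_id not in assigned]

    return clumps


# Toy input (made-up values, for illustration only).
toy_snps = [
    Snp("rs1", 1, 1e-12),
    Snp("rs2", 1, 1e-9),
    Snp("rs3", 1, 3e-8),
    Snp("rs4", 2, 1e-10),
    Snp("rs5", 1, 0.2),
]
toy_ld = {
    frozenset(("rs1", "rs2")): 0.5,
    frozenset(("rs1", "rs5")): 0.3,
    frozenset(("rs2", "rs3")): 0.4,
    frozenset(("rs1", "rs3")): 0.05,
}


def toy_r2(a: str, b: str) -> float:
    # Symmetric lookup; pairs not listed are treated as uncorrelated.
    return toy_ld.get(frozenset((a, b)), 0.0)


loci = clump(toy_snps, toy_r2)
# loci == {"rs1": ["rs1", "rs2", "rs5"], "rs4": ["rs4"], "rs3": ["rs3"]}
```

In this toy run, rs3 is in LD with rs2 but not with the lead SNP rs1, so it is not absorbed into the first clump. Because rs2 has already been removed with that clump, rs3 later becomes the lead SNP of its own locus. Every SNP ends up in at most one clump, which is what gives the one-to-one correspondence between lead SNPs and loci.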