Chunk #32 — ONLINE METHODS — Simulations

Source
LD Score regression distinguishes confounding from polygenicity in genome-wide association studies.
Embedded
yes

Text

It is difficult to use real genotypes to simulate ascertained studies of a binary phenotype with low population prevalence: to obtain 1,000 cases with a simulated 1% phenotype, one would need to sample 100,000 genotypes in expectation, which is not feasible. We therefore generated simulated genotypes at 1.1 million SNPs with mean LD Score 110 and a simplified LD structure in which r² between any pair of SNPs is either 0 or 1, and all variants had a minor allele frequency of 50%. We generated phenotypes under the liability threshold model, with all per-normalized-genotype effect sizes (i.e., effects on liability) drawn i.i.d. from a normal distribution, then sampled individuals at random from the simulated population until the desired numbers of cases and controls for the study had been reached. The R script that performs these simulations is available online (URLs).
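The procedure described above can be illustrated with a minimal Python sketch. This is not the authors' R script; it is a small-scale illustration under stated assumptions. The function name `simulate_case_control`, the liability-scale heritability `h2`, the block size, the batch size, and all default values are hypothetical choices made here for illustration. SNPs are grouped into blocks of identical genotypes, so r² is 1 within a block and 0 in expectation between blocks, and allele counts are drawn as Binomial(2, 0.5) to give a 50% minor allele frequency.

```python
import numpy as np
from statistics import NormalDist


def simulate_case_control(n_cases, n_controls, n_blocks=200, block_size=5,
                          h2=0.5, prevalence=0.01, batch=5000, seed=0):
    """Sketch of an ascertained case-control simulation (all defaults are assumptions)."""
    rng = np.random.default_rng(seed)
    m = n_blocks * block_size  # total number of SNPs

    # i.i.d. normal per-normalized-genotype effects on liability
    beta = rng.normal(0.0, np.sqrt(h2 / m), size=m)
    # SNPs within a block are identical (r^2 = 1), so only the block sum matters
    block_beta = beta.reshape(n_blocks, block_size).sum(axis=1)

    # Liability threshold giving the requested population prevalence
    # (approximate: liability variance is 1 only in expectation over beta)
    threshold = NormalDist().inv_cdf(1.0 - prevalence)

    cases, controls = [], []
    n_ca = n_co = 0
    while n_ca < n_cases or n_co < n_controls:
        # One genotype per block; 50% MAF gives mean 1 and variance 0.5
        g = rng.binomial(2, 0.5, size=(batch, n_blocks))
        x = (g - 1.0) / np.sqrt(0.5)  # normalized genotypes
        noise = rng.normal(0.0, np.sqrt(1.0 - h2), size=batch)
        liability = x @ block_beta + noise
        is_case = liability > threshold

        # Keep individuals only until each quota is filled
        ca = g[is_case][: n_cases - n_ca]
        co = g[~is_case][: n_controls - n_co]
        cases.append(ca)
        controls.append(co)
        n_ca += len(ca)
        n_co += len(co)

    # Expand block genotypes to SNP level: each block column is repeated
    geno = np.repeat(np.vstack(cases + controls), block_size, axis=1)
    pheno = np.concatenate([np.ones(n_cases, dtype=int),
                            np.zeros(n_controls, dtype=int)])
    return geno, pheno
```

Calling `simulate_case_control(1000, 1000, prevalence=0.01)` would draw roughly 1000 / 0.01 = 100,000 individuals in expectation before the case quota is filled, which is the cost the text refers to. Simulated genotypes make this cheap because only one column per LD block has to be drawn.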