Chunk #41 — METHODS — Simulations. — Genotypes:

Source
Improving polygenic prediction in ancestrally diverse populations.

Text

1KG reference panels. We note, however, that although HAPGEN2 is highly scalable, the genotypes it simulates may not fully capture the complex population structure within and across ancestry groups. We saved 20,000 samples for each of the three populations as the target dataset, which was evenly split into validation and testing datasets. The remaining samples served as the discovery dataset, which was used to produce GWAS of varying sample sizes. We constrained the simulations to 1,296,253 HapMap3 variants with MAF > 1% in at least one of the EUR, EAS and AFR populations, and removed triallelic and strand-ambiguous variants.
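
The variant filters and the even target split described above can be sketched roughly as follows. This is a minimal illustration, not the authors' pipeline: the function names `keep_variant` and `split_target`, the per-population MAF dictionary, and the seeded random shuffle are all assumptions introduced here. The text states only that the split was even, not how samples were assigned.

```python
import random

# Strand-ambiguous SNPs are allele pairs that equal their own reverse
# complement (A/T and C/G), so the strand cannot be inferred from alleles.
AMBIGUOUS = {frozenset("AT"), frozenset("CG")}


def keep_variant(alleles, maf_by_pop, maf_threshold=0.01):
    """Return True if a variant passes the three filters (hypothetical helper).

    alleles: tuple of observed alleles, e.g. ("A", "G")
    maf_by_pop: dict mapping population label to minor allele frequency
    """
    if len(alleles) != 2:
        # Drop triallelic (more generally, non-biallelic) sites.
        return False
    if frozenset(alleles) in AMBIGUOUS:
        # Drop strand-ambiguous A/T and C/G variants.
        return False
    # Keep the variant if MAF exceeds the threshold in at least one population.
    return any(maf > maf_threshold for maf in maf_by_pop.values())


def split_target(sample_ids, seed=0):
    """Shuffle target samples and split them evenly into validation and testing.

    The seeded shuffle is an assumption; the source only says "evenly split".
    """
    ids = list(sample_ids)
    random.Random(seed).shuffle(ids)
    half = len(ids) // 2
    return ids[:half], ids[half:]
```

For example, `keep_variant(("A", "G"), {"EUR": 0.005, "EAS": 0.02, "AFR": 0.0})` passes because the EAS frequency alone clears the 1% threshold, while any A/T variant is rejected regardless of frequency. Calling `split_target(range(20000))` yields two disjoint halves of 10,000 samples each.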