Chunk #13 — Methods — eMERGE genetic data

Source: Development and validation of a trans-ancestry polygenic risk score for type 2 diabetes in diverse populations.

Text

super-populations (European [EUR], African [AFR], Admixed American [AMR], and East Asian [EAS]) by co-clustering the projected eMERGE samples with the 1KG reference samples. Continental ancestry memberships were verified by visual inspection of the PC plots (Additional File 1: Fig. S1). We further intersected genetically inferred ancestry with self-reported race/ethnicity, namely White and non-Hispanic/Latino, Black or African American, Hispanic or Latino, and Asian, for the four ancestral groups, respectively, and randomly removed one sample from each pair of related individuals (kinship coefficient >0.1), leaving 54,793 European, 12,472 African, 2,374 Hispanic/Latino and 557 East Asian individuals with T2D case and control definitions (Table 1; Additional File 2: Table S1). We did not use Asian samples in subsequent PRS analyses due to the small sample size. Variants with minor allele frequency (MAF) <1% within each population were excluded.

Table 1. Sample characteristics of the evaluation datasets

Dataset                 Age (mean ± SD)   Sex (female %)   N case   N control
eMERGE
  European              59.4 ± 23.2       51.3%            8389     46,404
  African               45.8 ± 22.9       60.0%            2688     9784
  Hispanic/Latino       56.3 ± 20.6       60.5%            868      1506
UAB Black Cohorts
  REGARDS               63.8 ± 9.3        60.5%            1659     5086
  GenHAT                66.1 ± 7.5        55.3%            2776     2722
  HyperGEN              47.0 ± 12.8       63.5%            402      1494
  WPC                   57.5 ± 15.2       57.6%            300      355
Taiwan Biobank
  Batch 1               48.9 ± 11.1       49.3%            1248     23,862
  Batch 2               50.5 ± 10.5       68.6%            2806     51,272
  Batch 3               49.3 ± 10.9       65.7%            516      9862
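
The sample and variant filters described in the text (concordance between inferred ancestry and self-report, removal of one individual per related pair, and the per-population MAF cutoff) can be illustrated with a minimal Python sketch. This is not the authors' pipeline: the function names, the input layout (sample IDs, kinship triples, 0/1/2 dosage lists) and the rule of skipping pairs already broken by an earlier removal are assumptions made for illustration. Only the ancestry-to-self-report mapping and the 0.1 and 1% thresholds come from the text.

```python
import random

# Mapping stated in the text: genetically inferred ancestry -> self-reported group.
ANCESTRY_TO_SELF_REPORT = {
    "EUR": "White and non-Hispanic/Latino",
    "AFR": "Black or African American",
    "AMR": "Hispanic or Latino",
    "EAS": "Asian",
}


def is_concordant(genetic_ancestry, self_report):
    """True when inferred ancestry and self-report match the mapping above."""
    return ANCESTRY_TO_SELF_REPORT.get(genetic_ancestry) == self_report


def prune_related(samples, kinship_pairs, threshold=0.1, seed=0):
    """Randomly drop one sample from each pair with kinship above threshold.

    kinship_pairs is a hypothetical list of (id_a, id_b, kinship) triples.
    Pairs already broken by an earlier removal are skipped (an assumption).
    """
    rng = random.Random(seed)
    kept = set(samples)
    for a, b, phi in kinship_pairs:
        if phi > threshold and a in kept and b in kept:
            kept.discard(rng.choice((a, b)))
    return kept


def minor_allele_frequency(dosages):
    """MAF from 0/1/2 alternate-allele counts; None marks a missing call."""
    called = [d for d in dosages if d is not None]
    if not called:
        return 0.0
    alt_freq = sum(called) / (2 * len(called))
    return min(alt_freq, 1 - alt_freq)


def filter_variants(genotypes, min_maf=0.01):
    """Keep variants with MAF >= min_maf (variants below 1% are excluded).

    genotypes maps a variant ID to the dosage list for ONE population,
    so the filter is meant to be applied separately per population.
    """
    return {
        variant: dosages
        for variant, dosages in genotypes.items()
        if minor_allele_frequency(dosages) >= min_maf
    }
```

In practice these steps are usually run with dedicated genetics tooling on genotype files rather than in pure Python; the sketch only makes the logic of each filter explicit.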