Chunk #19 — Materials and Methods — Training and validation data sets for predicting type 2 diabetes in Latinos: DIAGRAM, SIGMA and UK Biobank

Source: Multiethnic polygenic risk scores improve risk prediction in diverse populations.
Embedded: yes

Text

SIGMA association statistics were computed with adjustment for 2 PCs, as in (SIGMA Type 2 Diabetes Consortium et al., 2014). For the SIGMA data set, we used the top 2 PCs as computed in that study. We restricted our analyses of type 2 diabetes to 776,374 SNPs present in both data sets (with matched reference and variant alleles) after removing A/T and C/G SNPs to eliminate potential strand ambiguity. We also performed an analysis of type 2 diabetes using imputed genotypes from the SIGMA T2D data set (SIGMA Type 2 Diabetes Consortium et al., 2014), restricting to 2,062,617 SNPs present in both data sets, with the same allele matching and A/T and C/G exclusions.
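
The SNP restriction described above can be illustrated with a minimal Python sketch. It is not the authors' pipeline: the function names, the dictionary layout (SNP ID mapped to a reference/variant allele pair), and the toy SNP IDs are all hypothetical, and "matched alleles" is assumed here to mean an exact match of both alleles in the same order.

```python
# Complementary base pairs. A SNP whose two alleles are complements of each
# other (A/T or C/G) reads the same on either strand, so its strand cannot
# be inferred from the alleles alone.
COMPLEMENT = {"A": "T", "T": "A", "C": "G", "G": "C"}


def is_strand_ambiguous(ref, alt):
    """Return True for A/T and C/G SNPs (in either allele order)."""
    return COMPLEMENT.get(ref.upper()) == alt.upper()


def shared_unambiguous_snps(training, validation):
    """Keep SNP IDs present in both data sets with identical (ref, alt)
    alleles, then drop strand-ambiguous SNPs.

    Both arguments are hypothetical dicts: SNP ID -> (ref_allele, alt_allele).
    """
    kept = []
    for snp_id, alleles in training.items():
        if validation.get(snp_id) != alleles:
            continue  # absent from the validation set, or alleles differ
        if is_strand_ambiguous(*alleles):
            continue  # A/T or C/G SNP: removed
        kept.append(snp_id)
    return kept


# Toy data (made up for illustration only).
training = {
    "rs1": ("A", "G"),
    "rs2": ("A", "T"),  # strand-ambiguous
    "rs3": ("C", "T"),
    "rs4": ("G", "C"),  # strand-ambiguous
    "rs5": ("G", "T"),  # missing from validation
}
validation = {
    "rs1": ("A", "G"),
    "rs2": ("A", "T"),
    "rs3": ("T", "C"),  # alleles swapped, so not an exact match
    "rs4": ("G", "C"),
    "rs6": ("A", "C"),
}

print(shared_unambiguous_snps(training, validation))  # ['rs1']
```

In this toy example only one SNP survives both filters. A real analysis might also reconcile swapped or strand-flipped alleles rather than discard them; the text does not say whether that was done, so the sketch takes the stricter reading.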