Chunk #22 — Materials and Methods — Training and validation data sets for predicting height in Africans: UK Biobank and N'Diaye et al.

Source: Multiethnic polygenic risk scores improve risk prediction in diverse populations.
Embedded: yes

Text

Our analyses of height in Africans used European summary association statistics from UK Biobank (see Web Resources), African summary statistics from N'Diaye et al. (2011), and African genotypes and phenotypes from UK Biobank (row 6 of Table 1). European summary statistics from UK Biobank were computed using 113,660 British samples for which height phenotypes were available, with adjustment for 10 PCs (Galinsky, Loh, et al., 2016) estimated using FastPCA (Galinsky, Bhatia, et al., 2016) (see Web Resources). The N'Diaye et al. (2011) data set consists of 20,427 samples of African ancestry with summary association statistics at 3,254,125 imputed SNPs. The UK Biobank data set consists of 1,745 unrelated samples of African ancestry, genotyped at 608,878 SNPs after QC, with the following self-reported ethnicity distribution: 743 African and 1,002 Caribbean. We removed one individual from each pair of relatives with relatedness greater than 20% (n=32). We performed PCA using EIGENSTRAT (Price et al., 2006) (see Web Resources) to identify and remove genetic outliers, but did not identify any. We restricted our analysis to