Ancestry PCs can be estimated from the sample itself, or estimated from an external reference such as the 1KGP and then projected onto the GWAS sample. Using an external reference panel has advantages: relatives and poorly performing samples need not be excluded, and some of the loadings can be interpreted (i.e., which ancestral population each reflects) based on the reference panel populations. Here, 1KGP phase 3 variants (2504 samples, 26 populations) that were also present in the post-QC filtered S4S genotypes were merged with the S4S data. Regions with high LD were excluded [27,28], and the common set of variants was then pruned (r² < 0.1) using PLINK 1.9 [29,30] (--indep-pairwise 1500 150 0.1) to yield 109,259 semi-independent variants for ancestry analyses. SmartPCA from the EIGENSOFT package [27,31] was used to perform PCA on the 1KGP phase 3 reference panel alone to determine SNP weights for each eigenvector. This solution was then projected onto the S4S data to generate 10 PCs.
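The reference-then-project logic described above can be illustrated with a minimal NumPy sketch. It is not the EIGENSOFT implementation and does not reproduce the study's analysis: the function names (`fit_reference_pca`, `project_samples`), the simulated genotypes, and the standardization by sqrt(2p(1-p)) are assumptions chosen for illustration. The point it shows is that SNP weights are derived from the reference panel only, and the study samples are standardized with the reference means and scales before being multiplied by those weights.

```python
import numpy as np


def fit_reference_pca(ref_geno, n_pcs=10):
    """Derive per-variant weights from a reference panel only.

    ref_geno: (n_ref_samples, n_variants) array of 0/1/2 allele counts.
    Returns the per-variant mean, scale, and an (n_variants, n_pcs) weight matrix.
    """
    mean = ref_geno.mean(axis=0)
    p = mean / 2.0                           # reference allele frequency
    scale = np.sqrt(2.0 * p * (1.0 - p))     # assumed standardization
    if np.any(scale == 0):
        raise ValueError("monomorphic variants must be removed first")
    z = (ref_geno - mean) / scale
    # Rows of vt are the eigenvectors of the variant covariance structure.
    _, _, vt = np.linalg.svd(z, full_matrices=False)
    return mean, scale, vt[:n_pcs].T


def project_samples(geno, mean, scale, weights):
    """Project samples onto reference-derived PCs using reference statistics."""
    return ((geno - mean) / scale) @ weights


# Toy data (hypothetical): 100 reference and 20 study samples, 500 variants.
rng = np.random.default_rng(0)
freqs = rng.uniform(0.2, 0.8, size=500)
ref = rng.binomial(2, freqs, size=(100, 500)).astype(float)
study = rng.binomial(2, freqs, size=(20, 500)).astype(float)

mean, scale, weights = fit_reference_pca(ref, n_pcs=10)
ref_pcs = project_samples(ref, mean, scale, weights)
study_pcs = project_samples(study, mean, scale, weights)
print(study_pcs.shape)
```

Because the study samples never enter the decomposition, related or low-quality study samples cannot distort the axes; they are simply placed on axes defined by the reference panel.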