Altogether, these analyses show that increasing the GWAS sample size not only increases prediction accuracy, but also sheds more light on the genomic distribution of causal variants and, at all but the largest sample sizes, on the genes proximal to these variants. By contrast, enrichment of higher-level, broadly defined biological categories, such as gene sets, pathways and functional annotations, can be identified using relatively small sample sizes (n ≈ 0.25 million for height). Of note, we confirm that increased genetic diversity in GWAS discovery samples significantly improves the prediction accuracy of PGSs in under-represented ancestries.