To create the GRSs, we randomly split the UK Biobank (UKB) British white dataset (n = 407,388) into a derivation set (n = 11,995) and a validation set (n = 395,393; “Methods” section, Fig. 1, Table 1). To increase statistical power in the derivation phase, we enriched the derivation set with IS events (n = 888, 7.4%). A schematic of the overall study design is given in Fig. 1.

Table 1 Study characteristics of the UK Biobank validation dataset.

| Baseline characteristic | UK Biobank, N = 395,393 | Male, N = 180,653 (45.7%) | Female, N = 214,740 (54.3%) |
| --- | --- | --- | --- |
| Age, years [mean (sd)] | 56.9 (8.0) | 57.1 (8.1) | 56.7 (7.9) |
| Current smoker, N (%) | 39,804 (10.0%) | 21,261 (11.8%) | 18,543 (8.6%) |
| Systolic blood pressure, mm Hg [mean (sd)] (adjusted for BP medication) | 143.3 (21.7) | 146.9 (20.4) | 140.2 (22.2) |
| Diabetes diagnosed by doctor, N (%) | 18,675 (4.7%) | 11,449 (6.3%) | 7226 (3.4%) |
| Hypertension, N (%) | 211,069 (53.4%) | 110,540 (61.2%) | 100,529 (46.8%) |
| Family history of stroke, 1st degree relative, N (%) | 104,831 (26.5%) | 45,569 (25.2%) | 59,262 (27.6%) |
| High cholesterol, N (%) | 53,141 (13.4%) | 30,670 (17.0%) | 22,471 (10.5%) |
| Prevalent stroke events, N (%), any stroke before age 75 | 4543 (1.1%) | 2679 (1.5%) | 1864 (0.9%) |
| Prevalent stroke events, N (%), ischaemic stroke before age 75 | 1152 (0.3%) | 787 (0.4%) | 365 (0.2%) |
| Incident stroke events, N (%), any stroke before age