Prospectively acquired datasets, such as the UK Biobank and community cohorts like Framingham and Jackson Heart, collect data in a predefined and comprehensive fashion. By contrast, the EHR reflects the range of conditions and follow-up for which patients seek medical attention. As a result, unlike prospective cohorts, the EHR does not provide unambiguous cases and controls. Rather, as outlined above, algorithms that draw on a variety of data sources need to be developed and validated.
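
To make the idea of an EHR phenotyping algorithm concrete, the following is a minimal sketch of a rule-based classifier that combines several data sources and is then checked against manual chart review. Everything in it is an illustrative assumption rather than a published algorithm: the `PatientRecord` fields, the threshold of two coded encounters, the three-way case/control/indeterminate labeling, and the function names are all hypothetical.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional


@dataclass
class PatientRecord:
    """Hypothetical per-patient summary drawn from several EHR data sources."""
    patient_id: str
    dx_code_count: int    # encounters carrying a relevant diagnosis code
    on_medication: bool   # a disease-specific medication was prescribed
    abnormal_lab: bool    # at least one supporting abnormal lab value


def classify(rec: PatientRecord) -> str:
    """Label a patient as 'case', 'control', or 'indeterminate'.

    Assumed rule: a case needs repeated diagnosis codes plus supporting
    evidence from another source; a control needs no evidence at all.
    """
    if rec.dx_code_count >= 2 and (rec.on_medication or rec.abnormal_lab):
        return "case"
    if rec.dx_code_count == 0 and not rec.on_medication and not rec.abnormal_lab:
        return "control"
    # Partial evidence: exclude rather than risk misclassification.
    return "indeterminate"


def positive_predictive_value(
    records: List[PatientRecord],
    chart_review: Dict[str, bool],
) -> Optional[float]:
    """Fraction of algorithm-defined cases confirmed by chart review.

    `chart_review` maps patient_id to True when a reviewer confirmed the
    condition. Only patients who were actually reviewed are counted.
    Returns None when no reviewed patient was flagged as a case.
    """
    flagged = [
        r for r in records
        if classify(r) == "case" and r.patient_id in chart_review
    ]
    if not flagged:
        return None
    confirmed = sum(1 for r in flagged if chart_review[r.patient_id])
    return confirmed / len(flagged)
```

In this sketch, patients with partial evidence are set aside as indeterminate instead of being forced into either group, which is one simple way to handle the ambiguity of EHR-derived cases and controls. The positive predictive value against chart review is only one of several possible validation measures.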