Adverse drug reactions present a specific problem in ascertainment, may require manual curation of cases and controls, and highlight the need for very large datasets to identify what are often rare events. Metrics of drug elimination, such as estimated glomerular filtration rate, are often available and can be used to assess the contribution of renal function to variable outcomes. Similarly, data can be obtained on the effects of co-administering a study drug with known inhibitors of specific drug elimination pathways. Temporal features are especially important for most drug response algorithms. For example, temporal elements, including those extracted with NLP, were important for identifying methotrexate-induced liver injury in an algorithm that performed with a PPV of 59%.14 For such complex, rare phenotypes, an approach that captures all possible cases and then subjects them to manual curation and review by expert clinicians may be required. Kawai et al ascertained 250 cases of bleeding during long-term warfarin therapy and 250 controls receiving long-term warfarin therapy without bleeding in BioVU.32 A candidate gene analysis (CYP2C9, CYP4F2, and VKORC1) identified CYP2C9*3 as a risk