Because MASSIVE includes a fairly sparse set of diagnoses for genetic correlation analyses (not all ICD codes are available), we conducted additional, theoretically relevant PheWASs using the addiction-rf PRS. We used electronic health record (EHR) data for 66,914 genotyped individuals of European ancestry from the Vanderbilt University Medical Center biobank (BioVU)30. BioVU is a repository of leftover blood samples from clinical testing (~240,000 samples) that are sequenced, de-identified, and linked to clinical and demographic data. Genotyping and quality control of this sample have been described elsewhere30. The addiction-rf PRS was used to predict 1,335 diseases in a logistic regression model, controlling for median age on record, reported gender, and the first 10 genetic ancestry principal components (PCs). To be considered a case, an individual was required to have two separate ICD codes for the index phenotype, and each phenotype needed at least 100 cases to be included in the analysis. A Bonferroni-corrected phenome-wide significance threshold of 0.05/1,335 = 3.7E-05 was used73.