We estimated genetic correlations between our five phenotypes and 4,065 UK Biobank phenotypes (both restricted to EUR ancestry) using bivariate LDSC with pre-calculated, 1000 Genomes-based EUR LD scores for HapMap3 variants. We excluded UK Biobank phenotypes that met any of three criteria: (1) a heritability Z-score less than 3, reflecting near-zero heritability; (2) a genetic correlation with our phenotypes less than −0.8 or greater than 0.8, to remove phenotypes approaching redundancy with our target tobacco and alcohol use measures (for example, cigarettes per day versus packs per day); or (3) a genetic correlation that could not be estimated, largely because of negative heritability estimates. This left 1,141 UK Biobank phenotypes. Affinity propagation clustering (ref. 62), an exemplar-based message-passing algorithm that identifies a set of exemplars and their corresponding clusters, was then used to further interpret the pattern of genetic correlations and the multifactorial nature of substance use. A Bonferroni-corrected P value threshold for 1,141 UK Biobank phenotypes was used to identify genetic correlations that were significantly different from zero.
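The exclusion criteria and the multiple-testing threshold described above can be illustrated with a minimal Python sketch. This is not the analysis code used in the study. The function names, the toy records and the significance level of 0.05 are assumptions made for illustration, as is the choice to exclude a phenotype when its genetic correlation with any one of the five target phenotypes falls outside the range −0.8 to 0.8.

```python
import math

def keep_phenotype(h2_z, rg_values, h2_z_min=3.0, rg_abs_max=0.8):
    """Return True if a phenotype passes all three exclusion criteria.

    h2_z      : heritability Z-score of the UK Biobank phenotype
    rg_values : genetic correlations with each target phenotype
                (NaN or None marks a correlation that could not be estimated)
    """
    # Criterion 1: heritability Z-score less than 3 (near-zero heritability)
    if h2_z is None or math.isnan(h2_z) or h2_z < h2_z_min:
        return False
    for rg in rg_values:
        # Criterion 3: genetic correlation could not be estimated
        if rg is None or math.isnan(rg):
            return False
        # Criterion 2: rg < -0.8 or rg > 0.8 (approaching redundancy).
        # Assumption: a single out-of-range value is enough to exclude.
        if abs(rg) > rg_abs_max:
            return False
    return True

def bonferroni_threshold(n_tests, alpha=0.05):
    """Bonferroni-corrected P value threshold; alpha = 0.05 is an assumption."""
    return alpha / n_tests

# Hypothetical toy records (not data from the study):
# name -> (heritability Z-score, rg with each of five target phenotypes)
records = {
    "toy_trait_a": (5.2, [0.30, -0.10, 0.20, 0.05, 0.40]),
    "toy_trait_b": (1.4, [0.10, 0.00, 0.20, 0.10, 0.00]),          # low Z-score
    "toy_trait_c": (8.0, [0.92, 0.30, 0.10, 0.20, 0.10]),          # near-redundant
    "toy_trait_d": (4.0, [float("nan"), 0.10, 0.20, 0.00, 0.10]),  # rg not estimable
    "toy_trait_e": (3.0, [-0.80, 0.80, 0.00, 0.00, 0.00]),         # boundary values
}

kept = [name for name, (z, rgs) in records.items() if keep_phenotype(z, rgs)]
threshold = bonferroni_threshold(1141)
```

Under the assumed significance level of 0.05, the threshold for 1,141 phenotypes is 0.05/1,141, or approximately 4.4 × 10⁻⁵. Boundary values (a Z-score of exactly 3, or a genetic correlation of exactly ±0.8) are retained here because the criteria are stated as strict inequalities.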