Across all models, problematic tobacco use had the lowest loadings on its respective factors (λ ranged from 0.32 to 0.34 across the common factor, EXT-ARF, and BD-SUB models), and cannabis use disorder had the highest (λ ranged from 0.94 to 1.02 across the same models). To assess the influence of the problematic tobacco use indicator, we tested a series of models that dropped this indicator and found that model fit was not substantially affected (see Table S4).