and set the clinical/interview phenotype as the primary phenotype by fixing its factor loading to 1 and its residual variance to 0. This factor model was consistent with the data (χ²(3) = 4.49, p = 0.213); therefore, we could not reject the null hypothesis that a single factor capturing all the variance of the primary method explained the intercorrelations between the other depression phenotypes. Most MD phenotypes had strong positive loadings on the common factor (clinical/interview = 1.0 [reference], EHR = 0.92 ± 0.04, questionnaire = 0.95 ± 0.04), although the loading for self-reported diagnosis was lower (0.85 ± 0.04). One locus showed significant SNP heterogeneity across phenotype definitions (rs12124523, an intronic variant in NEGR1; common factor association p = 8.4 × 10⁻¹⁴, Q heterogeneity p = 2.9 × 10⁻¹⁰, I² = 0.71), with a stronger association in self-reported depression studies (self-report odds ratio [OR] = 1.081, confidence interval [CI] = 1.065–1.098; other cohorts OR = 1.008, CI = 0.999–1.018). We found no evidence of heterogeneity at the remaining 569 of 570 loci, supporting the use of multiple phenotypes in genetic association studies of MD.
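
As an illustrative check, the model-fit p-value quoted above can be recovered from the test statistic alone. The sketch below assumes that the statistic is a chi-square with 3 degrees of freedom; it is not the analysis code used in the study. The helper name `chi2_sf_df3` is hypothetical, and it uses the closed-form upper-tail probability available for 3 degrees of freedom so that only the Python standard library is needed.

```python
import math


def chi2_sf_df3(x: float) -> float:
    """Upper-tail probability P(X > x) for a chi-square variable with 3 df.

    Hypothetical helper for illustration. For 3 degrees of freedom the
    survival function has the closed form
        erfc(sqrt(x / 2)) + sqrt(2 * x / pi) * exp(-x / 2).
    """
    if x <= 0:
        return 1.0
    return math.erfc(math.sqrt(x / 2.0)) + math.sqrt(2.0 * x / math.pi) * math.exp(-x / 2.0)


# Model-fit statistic reported in the text (assumed to have 3 degrees of freedom).
p_fit = chi2_sf_df3(4.49)
print(f"{p_fit:.3f}")  # 0.213
```

A p-value well above 0.05 means that the misfit of the single-factor model is within the range expected by chance, which is why the null hypothesis of a single common factor is not rejected.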