result by conducting bivariate analyses that treat cases and controls from one subset as trait 1 and those from a different subset as trait 2 (Table 2); the two independent subsets are related through the coefficients of genome-wide similarity between individuals, calculated from SNPs (Online Methods equation 2). The estimated correlation coefficients based on SNP genome-wide similarities are less than 1, which is consistent with several explanations. Samples within a subset may be more homogeneous than samples from different subsets, both phenotypically, for example because of similar and consistent diagnostic criteria, and genetically, because linkage disequilibrium (LD) between causal variants and analysed SNPs may be higher within than between subsets. Alternatively, subtle artefacts could generate non-random differences in allele frequencies between sets of cases and sets of controls from the same study. However, our preliminary analyses using genotyped SNPs for ISC and MGS and extreme QC (Supplementary Table 2) suggest that such artefacts are unlikely to be a major contributor. Furthermore, the correlations between data sets from the bivariate analyses are high (~0.8), demonstrating that the same genetic signals can explain variance in schizophrenia liability in different case-control samples. Given that these samples were collected independently, with genotyping conducted at different laboratories, it is difficult to envision artefacts that could generate