In this paper, we use LD scores computed from an out-of-sample reference panel. To evaluate the effect of this choice, we took the summary statistics simulated above and ran stratified LD score regression using a 1000G reference panel rather than in-sample LD. Estimates of total $h_g^2$ and category-specific $h_g^2$ were biased downwards, but estimates of the proportion of $h_g^2$ were approximately unbiased and type 1 error was well calibrated (Supplementary Figure 4).
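
One way to see how a proportion can remain approximately unbiased while the total and category-specific estimates are both biased downwards is sketched below. It rests on an assumption not established in the text: that the reference-panel mismatch attenuates every estimate by roughly the same multiplicative factor. The numbers and the names `attenuation` and `proportion_of_h2` are hypothetical and chosen only for illustration; they are not simulation results.

```python
def proportion_of_h2(category_h2, total_h2):
    """Proportion of heritability attributed to one category."""
    return category_h2 / total_h2


# Hypothetical true values (illustrative only, not taken from the simulations).
true_total = 0.5
true_category = 0.2

# Assumption: a common multiplicative downward bias on all estimates.
attenuation = 0.8

est_total = attenuation * true_total        # biased downwards
est_category = attenuation * true_category  # biased downwards

true_prop = proportion_of_h2(true_category, true_total)
est_prop = proportion_of_h2(est_category, est_total)

# The common factor cancels in the ratio, so the proportion is unchanged
# up to floating-point rounding.
print(f"total: true={true_total}, estimated={est_total:.3f}")
print(f"category: true={true_category}, estimated={est_category:.3f}")
print(f"proportion: true={true_prop:.3f}, estimated={est_prop:.3f}")
```

If the attenuation differed across categories, the cancellation would be only partial, so this sketch should be read as intuition rather than as an account of the simulation results.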