To compare the calibration and power of LCV with those of existing causal inference methods, we performed a wide range of null and causal simulations using simulated summary statistics with no LD. We compared four main methods: LCV, random-effect two-sample MR[5,9] (denoted MR), MR-Egger[7], and Bidirectional MR[11] (see Methods). We also evaluated the weighted median estimator (MR-WME)[8] and the mode-based estimator (MR-MBE)[10]; their performance was roughly similar to that of MR and MR-Egger, respectively, and results for these methods are reported in supplementary tables. We applied each method to simulated GWAS summary statistics (N=100k individuals in each of two non-overlapping cohorts; M=50k independent SNPs[20]) for two heritable traits (h2=0.3), generated under the LCV model. LCV uses LD score regression[19]; for simulations with no LD, we used constrained-intercept LD score regression (simulations with LD are described below). A detailed description of all simulations is provided in the Supplementary Note, and simulation parameters are listed in Supplementary Table 1.
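As an illustration of the no-LD setup described above, the following Python sketch generates summary statistics for two traits from a shared latent component and recovers heritability and genetic correlation by simple moment estimates. It is a minimal sketch under stated assumptions, not the simulation code used in this study: the function names, the parameters `q1` and `q2` (effects of the latent variable on each trait), and the Gaussian effect-size distribution are all hypothetical choices made for brevity, and the actual genetic architectures are those given in the Supplementary Note. With no LD, every SNP has an LD score of 1, so a constrained-intercept regression reduces to the moment estimates shown here.

```python
import numpy as np


def simulate_lcv_sumstats(n=100_000, m=50_000, h2=0.3, q1=0.5, q2=0.5, seed=0):
    """Simulate Z scores for two traits with no LD and non-overlapping cohorts.

    Assumption (illustrative only): all effect-size components are Gaussian.
    """
    rng = np.random.default_rng(seed)
    pi = rng.standard_normal(m)       # SNP effects on the latent variable
    gamma1 = rng.standard_normal(m)   # trait-1-specific SNP effects
    gamma2 = rng.standard_normal(m)   # trait-2-specific SNP effects

    # Normalized effects with unit variance; their correlation is q1 * q2.
    alpha1 = q1 * pi + np.sqrt(1.0 - q1**2) * gamma1
    alpha2 = q2 * pi + np.sqrt(1.0 - q2**2) * gamma2

    # Z = sqrt(N) * beta + noise, with per-SNP beta = sqrt(h2 / M) * alpha.
    # Sampling noise is independent across traits (non-overlapping cohorts)
    # and approximated as unit variance.
    scale = np.sqrt(n * h2 / m)
    z1 = scale * alpha1 + rng.standard_normal(m)
    z2 = scale * alpha2 + rng.standard_normal(m)
    return z1, z2


def estimate_h2(z, n, m):
    """Moment estimate of h2 with the intercept fixed at 1 (all LD scores = 1)."""
    return (np.mean(z**2) - 1.0) * m / n


def estimate_rg(z1, z2, n, m):
    """Moment estimate of genetic correlation with the cross-trait intercept fixed at 0."""
    h2_1 = estimate_h2(z1, n, m)
    h2_2 = estimate_h2(z2, n, m)
    cov_g = np.mean(z1 * z2) * m / n  # genetic covariance (equal N in both cohorts)
    return cov_g / np.sqrt(h2_1 * h2_2)


if __name__ == "__main__":
    n, m = 100_000, 50_000
    z1, z2 = simulate_lcv_sumstats(n=n, m=m, h2=0.3, q1=0.5, q2=0.5, seed=1)
    print("h2 estimate, trait 1:", estimate_h2(z1, n, m))
    print("h2 estimate, trait 2:", estimate_h2(z2, n, m))
    print("genetic correlation estimate:", estimate_rg(z1, z2, n, m))
```

In this sketch the expected genetic correlation is the product `q1 * q2`, and the expected mean chi-square statistic is 1 + N·h2/M = 1.6 for the parameter values stated in the text. Setting `q1` or `q2` to zero removes the shared component and yields uncorrelated traits.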