Chunk #58 — STAR* METHODS — QUANTIFICATION AND STATISTICAL ANALYSIS — Examination of the Impact of Sample Size Imbalance on Genetic Correlations and Genomic SEM Results
To further evaluate whether sample size imbalance across the eight disorders biased the number of pleiotropic signals we observed, we conducted simulation studies using UK Biobank data. In particular, we examined whether the number of pleiotropic loci we identified exceeded chance expectation given the sample sizes and genetic correlations among the eight disorders. We used individual-level data from the full release of 488,377 UK Biobank (UKBB; Sudlow et al., 2015) participants, imputed with the Haplotype Reference Consortium (HRC), UK10K, and 1000 Genomes reference panels (application number 31063). Data were quality-controlled as described on the Neale Lab UK Biobank GWAS webpage (http://www.nealelab.is/uk-biobank/), retaining 361,194 unrelated individuals of Caucasian ancestry and 13.7 million genetic variants (MAF > 0.0001, INFO > 0.8). For the purpose of the simulation, we removed individuals who were in the UKBB interim release to avoid sample overlap with the MD GWAS, in which these subjects were included (Wray et al., 2018), and restricted the analysis to variants present in both the current study (PGC-CDG2) and the UKBB datasets, resulting in 6,691,733 SNPs.
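The filtering logic described above can be illustrated with a minimal sketch. It is not the pipeline used in the study: the `Variant` class, the function names, and the toy identifiers are all hypothetical, and only the MAF and INFO thresholds are taken from the text.

```python
from dataclasses import dataclass

# Thresholds quoted in the text; everything else below is a hypothetical illustration.
MAF_MIN = 0.0001
INFO_MIN = 0.8


@dataclass(frozen=True)
class Variant:
    snp_id: str   # variant identifier (toy values, not real rsIDs)
    maf: float    # minor allele frequency
    info: float   # imputation quality score


def filter_variants(variants, study_snp_ids, maf_min=MAF_MIN, info_min=INFO_MIN):
    """Keep variants that pass the MAF and INFO thresholds and also appear in the study SNP set."""
    return [
        v for v in variants
        if v.maf > maf_min and v.info > info_min and v.snp_id in study_snp_ids
    ]


def filter_samples(sample_ids, interim_release_ids):
    """Drop individuals who were part of the interim release to avoid sample overlap."""
    excluded = set(interim_release_ids)
    return [s for s in sample_ids if s not in excluded]


# Toy inputs for illustration only.
variants = [
    Variant("snpA", 0.25, 0.99),     # passes all filters
    Variant("snpB", 0.00005, 0.95),  # fails the MAF threshold
    Variant("snpC", 0.10, 0.70),     # fails the INFO threshold
    Variant("snpD", 0.30, 0.90),     # passes QC but is absent from the study SNP set
]
study_snp_ids = {"snpA", "snpB", "snpC"}

kept_variants = filter_variants(variants, study_snp_ids)
kept_samples = filter_samples(["id1", "id2", "id3"], ["id2"])

print([v.snp_id for v in kept_variants])
print(kept_samples)
```

In this sketch, the variant filter combines the quality thresholds with the intersection against the study's SNP list in a single pass, while the sample filter is kept separate because it operates on individuals rather than variants.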