Chunk #57 — Method — Standardization and Scaling of Summary Statistics for Multivariate GWAS

Source: Genomic structural equation modelling provides insights into the multivariate genetic architecture of complex traits.
Embedded: yes

Text

very closely approximate logistic regression coefficients and SEs that are amenable for use in Genomic SEM.49 This approximation can be obtained as $Z = b^{**}_{SNP,P} / SE_{b^{**}_{SNP,P}}$, $b^{*}_{\mathrm{logit}\,SNP,P} = Z / \sqrt{\nu(1-\nu)\,N\,\sigma^{2}_{SNP}}$, and $SE_{b^{*}_{\mathrm{logit}\,SNP,P}} = b^{*}_{\mathrm{logit}\,SNP,P} / Z$, where $b^{**}_{SNP,P}$ is the regression coefficient from the linear probability model, $SE_{b^{**}_{SNP,P}}$ is its standard error, $b^{*}_{\mathrm{logit}\,SNP,P}$ is the expected logistic regression coefficient derived from the linear probability model results, $\nu$ is the proportion of cases in the sample, $N$ is the sample size, and $\sigma^{2}_{SNP}$ is the variance of the SNP, computed from its MAF obtained from a reference sample, as per above. To express the derived logistic coefficient relative to unit-variance scaled liability, the coefficient should be divided by $\sqrt{\sigma^{2}_{SNP} \times (b^{*}_{\mathrm{logit}\,SNP,P})^{2} + \pi^{2}/3}$. Lloyd-Jones et al. (2018)49 report that, in a real data analysis of UKB data, the correspondence between the exponentiated regression coefficient (i.e., the odds ratio) obtained directly from a logistic regression-based GWAS and that derived from the linear probability model-based GWAS was nearly perfect ($R^{2} > 98\%$, slope ≈ 1). We have verified this nearly perfect correspondence in our own simulations (Supplemental Figure 28).
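
The conversion described above can be illustrated with a minimal Python sketch. It is not the Genomic SEM implementation. The function and argument names are hypothetical. The SNP variance is assumed to be 2 × MAF × (1 − MAF), the usual Hardy-Weinberg expression for an additively coded genotype. Dividing the SE by the same liability-scale denominator as the coefficient is also an assumption, since the text specifies the rescaling only for the coefficient.

```python
import math


def snp_variance(maf):
    # Assumption: additively coded SNP under Hardy-Weinberg equilibrium.
    return 2.0 * maf * (1.0 - maf)


def lpm_to_logistic(b_lpm, se_lpm, n, case_prop, maf):
    """Approximate a logistic coefficient and SE from linear probability model output.

    b_lpm, se_lpm : coefficient and SE from the linear probability model GWAS
    n             : sample size
    case_prop     : proportion of cases in the sample (nu in the text)
    maf           : minor allele frequency from a reference sample
    """
    z = b_lpm / se_lpm  # Z statistic of the LPM estimate
    if z == 0.0:
        raise ValueError("Z = 0: the SE of the derived coefficient is undefined")
    var_snp = snp_variance(maf)
    # Expected logistic coefficient: Z / sqrt(nu * (1 - nu) * N * var_SNP)
    b_logit = z / math.sqrt(case_prop * (1.0 - case_prop) * n * var_snp)
    # SE of the derived coefficient: b_logit / Z (always positive)
    se_logit = b_logit / z
    return z, b_logit, se_logit


def to_liability_scale(b_logit, se_logit, maf):
    """Rescale a logistic coefficient relative to unit-variance liability."""
    var_snp = snp_variance(maf)
    denom = math.sqrt(var_snp * b_logit ** 2 + math.pi ** 2 / 3.0)
    # Assumption: the SE is divided by the same denominator as the coefficient.
    return b_logit / denom, se_logit / denom


# Illustrative (made-up) inputs, not values from the paper.
z, b_logit, se_logit = lpm_to_logistic(
    b_lpm=0.01, se_lpm=0.002, n=100_000, case_prop=0.5, maf=0.25
)
b_liab, se_liab = to_liability_scale(b_logit, se_logit, maf=0.25)
```

Because the denominator of the liability rescaling is at least $\sqrt{\pi^{2}/3} \approx 1.81$, the rescaled coefficient is always smaller in magnitude than the derived logistic coefficient, while the Z statistic is unchanged under the assumption that the SE is rescaled identically.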