Evaluating the environmental impact of global ancestry-associated DEGs
To evaluate the impact of unknown environmental factors on global ancestry-associated DEGs, we first converted the VMRs to genomic ranges with plyranges and then annotated them using annotate_regions and the hg38 basic genes annotation from the R/Bioconductor package annotatr (v.1.24.0; ref. 102). After annotation, we estimated PST (ref. 18), which is essentially the partial coefficient of determination. As such, we estimated the PST statistic for each gene with equation (11):

$$R_{\mathrm{partial}}^{2}=\frac{\mathrm{SSE}(\mathrm{reduced})-\mathrm{SSE}(\mathrm{full})}{\mathrm{SSE}(\mathrm{reduced})} \tag{11}$$

We calculated the PST statistic for ancestry before and after including the residualized VMRs annotated to an ancestry-associated DEG. The residual was derived from the raw DNA methylation levels of each VMR by regressing out known biological factors (local ancestry, age, sex), as well as potential batch effects and other unknown biological factors captured by the top five principal components of DNA methylation levels. After this, we calculated ΔPST to extract the fraction of change associated with the environment (equation (12)):

$$\Delta P_{\mathrm{ST}}=\frac{P_{\mathrm{ST}}-{P_{\mathrm{ST}}}_{\mathrm{VMR}}}{P_{\mathrm{ST}}} \tag{12}$$

Here, $P_{\mathrm{ST}}$ is the statistic before, and ${P_{\mathrm{ST}}}_{\mathrm{VMR}}$ the statistic after, including the residualized VMRs.
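The calculation described above can be illustrated with a minimal Python sketch. It is not the authors' code (the original analysis was run in R), and several choices are assumptions made only for illustration: ordinary least squares as the fitting method, age and sex as the only covariates of the reduced model, the residualized VMR entering both the reduced and full models as one extra covariate, and fully simulated data. The top five methylation principal components mentioned in the text are omitted for brevity. All function and variable names (`residualize`, `partial_r2`, `delta_pst`, `env`, and so on) are hypothetical.

```python
import numpy as np


def residualize(y, X):
    """Residuals of an OLS fit of y on X plus an intercept."""
    design = np.column_stack([np.ones(len(y)), X])
    beta, *_ = np.linalg.lstsq(design, y, rcond=None)
    return y - design @ beta


def ols_sse(y, X):
    """Sum of squared errors of an OLS fit of y on X plus an intercept."""
    resid = residualize(y, X)
    return float(resid @ resid)


def partial_r2(y, reduced, term):
    """Partial R^2 of `term`: (SSE_reduced - SSE_full) / SSE_reduced."""
    sse_reduced = ols_sse(y, reduced)
    sse_full = ols_sse(y, np.column_stack([reduced, term]))
    return (sse_reduced - sse_full) / sse_reduced


def delta_pst(expr, ancestry, covariates, vmr_resid):
    """Return (PST, PST_VMR, delta PST) for one gene.

    Assumption: the residualized VMR is added to both the reduced and
    the full model when computing PST_VMR.
    """
    pst = partial_r2(expr, covariates, ancestry)
    with_vmr = np.column_stack([covariates, vmr_resid])
    pst_vmr = partial_r2(expr, with_vmr, ancestry)
    return pst, pst_vmr, (pst - pst_vmr) / pst


# Simulated data for a single gene and a single VMR (illustrative only).
rng = np.random.default_rng(0)
n = 300
global_anc = rng.uniform(0, 1, n)
local_anc = np.clip(global_anc + rng.normal(0, 0.2, n), 0, 1)
age = rng.uniform(20, 80, n)
sex = rng.integers(0, 2, n).astype(float)
# Hypothetical unmeasured exposure that correlates with global ancestry.
env = 0.7 * global_anc + rng.normal(0, 0.3, n)

vmr_raw = 0.4 * local_anc + 0.01 * age + 0.5 * env + rng.normal(0, 0.1, n)
expr = 1.0 * global_anc + 0.8 * env + 0.02 * age + rng.normal(0, 0.3, n)

# Regress known factors out of the raw methylation level of the VMR.
vmr_resid = residualize(vmr_raw, np.column_stack([local_anc, age, sex]))

covariates = np.column_stack([age, sex])
pst, pst_vmr, d_pst = delta_pst(expr, global_anc, covariates, vmr_resid)
print(f"PST={pst:.3f}  PST_VMR={pst_vmr:.3f}  delta PST={d_pst:.3f}")
```

In this sketch the residualized VMR still carries the simulated environmental signal because only local ancestry, age and sex were regressed out of it. Whether ΔPST comes out positive or negative for a given gene depends on how that residual relates to both expression and global ancestry.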