Chunk #85 — Methods — Genetic control of ancestry effects on expression — Calculating predicted expression using genetic variants in an elastic net model

Source
Analysis of gene expression in the postmortem brain of neurotypical Black Americans reveals contributions of genetic ancestry.

Text

We selected all genetic variants within ±500 kb of the gene body. We removed variants with missing genotypes and filtered the remaining variants using a minor allele frequency (MAF) threshold of 0.01 and a Hardy–Weinberg equilibrium P value threshold of 1 × 10⁻⁵. We used an elastic net model, which is well suited to relatively small sample sizes. Specifically, we fitted a sparse linear regression model using big_spLinReg from the bigstatsr R package (v.1.5.12; ref. 99). We tuned the alpha parameter over a sequence of 20 values (that is, 0.05–1 with a step size of 0.05). Additionally, we used four sets for the cross-model selection and averaging procedure. We averaged feature weights for genetic variants across k-folds (five folds for each of the caudate nucleus, DLPFC and hippocampus; and three folds for the dentate gyrus). We imputed residualized expression with these feature weights (i) and an individual’s genotype dosage (j) (equation (10)). We calculated the correlation coefficient (r) using Pearson correlation on the test samples for each k-fold.

$$\mathrm{Predicted\ expression}_{i} = \sum_{j} \mathrm{variant\ weight}_{j} \times \mathrm{genotype}_{j} \tag{10}$$
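
The prediction and evaluation steps described above can be illustrated with a minimal sketch. It is written in plain Python rather than R and does not call bigstatsr or fit an elastic net. It only shows the alpha grid, the averaging of per-variant weights across folds, the weighted sum in equation (10), and the Pearson correlation on test samples. All function names and the toy weights, dosages and expression values are hypothetical and chosen only for illustration.

```python
from math import sqrt


def alpha_grid(step=0.05, n=20):
    """Return the 20 mixing values 0.05, 0.10, ..., 1.00 described in the text."""
    return [round(step * k, 2) for k in range(1, n + 1)]


def average_weights(fold_weights):
    """Average each variant's weight across folds (one list of weights per fold)."""
    n_folds = len(fold_weights)
    return [sum(per_variant) / n_folds for per_variant in zip(*fold_weights)]


def predict_expression(weights, dosages):
    """Equation (10): sum over variants of variant weight times genotype dosage."""
    return sum(w * g for w, g in zip(weights, dosages))


def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / sqrt(var_x * var_y)


# Hypothetical weights for three variants from three folds (toy values).
fold_weights = [
    [0.10, 0.00, -0.20],
    [0.14, 0.00, -0.16],
    [0.12, 0.00, -0.24],
]
weights = average_weights(fold_weights)

# Hypothetical genotype dosages (0, 1 or 2 copies) for four test individuals.
test_dosages = [[0, 1, 2], [1, 0, 1], [2, 2, 0], [1, 1, 1]]
# Hypothetical residualized expression observed for the same individuals.
observed = [-0.35, -0.10, 0.30, -0.05]

predicted = [predict_expression(weights, d) for d in test_dosages]
r = pearson_r(predicted, observed)
print(alpha_grid())
print(predicted, r)
```

In practice the weights would come from the fitted sparse regression model for each fold, and r would be computed separately on the held-out samples of each fold, as the text describes.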