RNA alternative splicing impacts the risk for alcohol use disorder.
- Authors: Li, Rudong; Reiter, Jill L; Chen, Andy B; Chen, Steven X; Foroud, Tatiana; Edenberg, Howard J; Lai, Dongbing; Liu, Yunlong
- Year: 2023
- Journal: Molecular Psychiatry
- PMID: 37217680
- DOI: 10.1038/s41380-023-02111-1
- PMCID: PMC10615768
Alcohol use disorder (AUD) is a complex genetic disorder characterized by problems arising from excessive alcohol consumption. Identifying functional genetic variations that contribute to risk for AUD is a major goal. Alternative splicing of RNA mediates the flow of genetic information from DNA to gene expression and expands proteome diversity. We asked whether alternative splicing could be a risk factor for AUD. Herein, we used a Mendelian randomization (MR)-based approach to identify skipped exons (the predominant splicing event in the brain) that contribute to AUD risk. Genotypes and RNA-seq data from the CommonMind Consortium were used as the training dataset to develop predictive models linking individual genotypes to exon skipping in the prefrontal cortex. We applied these models to data from the Collaborative Studies on Genetics of Alcoholism to examine the association between the imputed cis-regulated splicing outcome and the AUD-related traits. We identified 27 exon skipping events that were predicted to affect AUD risk; six of these were replicated in the Australian Twin-family Study of Alcohol Use Disorder. Their host genes are DRC1, ELOVL7, LINC00665, NSUN4, SRRM2 and TBC1D5. The genes downstream of these splicing events are enriched in neuroimmune pathways. The MR-inferred impact of the ELOVL7 skipped exon on AUD risk was further supported in four additional large-scale genome-wide association studies. Additionally, this exon contributed to changes in gray matter volume in multiple brain regions, including the visual cortex, which is known to be involved in AUD. In conclusion, this study provides strong evidence that RNA alternative splicing impacts the susceptibility to AUD and adds new information on AUD-relevant genes and pathways. Our framework is also applicable to other types of splicing events and to other complex genetic disorders.
Predictive modeling for the genetic component of skipped exon (SE) events. A Modeling workflow. CommonMind Consortium (CMC) RNA-seq and genotyping data from dorsolateral prefrontal cortex (DLPFC) were used to derive the splicing outcomes (PSI, $\Psi$) and imputed genotypes (GT), respectively. These data were filtered before training the elastic net (EN) model that was used to compute the genetically determined component of $\Psi$, denoted as $\hat{\Psi}$. The models were evaluated by leave-one-out cross validation, Genome-wide Complex Trait Analysis (GCTA) and a replication RNA-seq dataset from the New South Wales Brain Tissue Resource Center (NSWBTRC). MIS, Michigan Imputation Server. B Quantile-quantile (Q-Q) plot of leave-one-out cross validation. Observed significance ($-\log_{10}$ P value, black dots, n = 6284 SE) of the Pearson's $r(\hat{\Psi}, \Psi)$ against a random null distribution (red line) in CMC. C Example of a highly cis-regulated splicing event. The genetically predicted PSI (y-axis) is plotted versus the total PSI derived from RNA-seq (x-axis) for a specific SE (ENSE00000707111) in NMRK1 using the CMC samples (black dots, n = 380). Pearson's $r$ and its P value, and the $R^2$ (the proportion of PSI variance explained by the model), are provided. Solid red line represents the correlation. Dashed blue line is the identity line. D Example of a lowly cis-regulated splicing event. The genetically predicted PSI (y-axis) is plotted versus the total PSI derived from RNA-seq (x-axis) for a specific SE (ENSE00001930700) in SYNGAP1 using the CMC samples (black dots, n = 380). Pearson's $r$ and its P value, and the $R^2$ are provided. Solid red line represents the correlation. Dashed blue line is the identity line. E Model evaluation by heritability analysis. The finalized elastic net models (1093 SE) were evaluated by the independent heritability analysis approach of GCTA. Results shown are the model prediction ($R^2$, red dots), GCTA evaluation ($h^2$, blue dots in ascending order), and 95% confidence interval (CI) of $h^2$ (gray dashes), for each event. For 87.5% of the splicing events, our predicted $R^2$ lies between the lower and upper bounds of the GCTA estimate of $h^2$, indicating that the model prediction is consistent with genome-wide estimation. F Model validation on replication cohort. The same elastic net models as in (E) were validated on the NSWBTRC RNA-seq cohort, which contains different individuals from CMC. Leave-one-out Pearson's $r$ values from our models are shown (gray dots, ascending order), in which the eligible events for comparison in the replication cohort are highlighted in blue (570 SE). The replication Pearson's $r$ values are shown as purple triangles. Distribution of the replication $r$ is visualized by the marginal histogram, where 75.4% of the events had an $r$ > 0 (yellow horizontal line), indicating the success of replication. G Quantile-quantile (Q-Q) plot based on the replication cohort. Observed significance ($-\log_{10}$ P value, black dots, n = 570 SE) of the Pearson's $r$ against a random null distribution (red line) in the COGA RNA-seq cohort.
LLM interpretation
This figure presents a predictive modeling workflow and evaluation for the genetic component of skipped exon (SE) events. It includes a flow diagram (A) of the data processing and elastic net model, Q-Q plots (B, G) showing observed versus expected significance, and scatter plots (C, D) comparing predicted versus RNA-seq derived PSI for highly and lowly cis-regulated events. Additionally, the figure shows model evaluation via GCTA heritability analysis (E) and validation on a replication cohort (F), where Pearson's $r$ values are compared across datasets.
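The leave-one-out evaluation summarized in panels B to D can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' pipeline: it swaps the elastic net over many cis-variants for a single-variant least-squares fit, and the allele dosages and PSI values are made-up toy data. The function names are hypothetical.

```python
import math


def pearson_r(x, y):
    """Pearson correlation between two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def fit_line(dosage, psi):
    """Least-squares intercept and slope of PSI on allele dosage."""
    n = len(dosage)
    mx, my = sum(dosage) / n, sum(psi) / n
    sxx = sum((d - mx) ** 2 for d in dosage)
    if sxx == 0:  # no genotype variation: fall back to the mean PSI
        return my, 0.0
    slope = sum((d - mx) * (p - my) for d, p in zip(dosage, psi)) / sxx
    return my - slope * mx, slope


def leave_one_out_predictions(dosage, psi):
    """Predict each sample's PSI from a model trained on all other samples."""
    preds = []
    for i in range(len(psi)):
        train_d = dosage[:i] + dosage[i + 1:]
        train_p = psi[:i] + psi[i + 1:]
        intercept, slope = fit_line(train_d, train_p)
        preds.append(intercept + slope * dosage[i])
    return preds


# Toy data (assumed): allele dosage at one cis-variant and observed PSI.
dosage = [0, 0, 1, 1, 1, 2, 2, 0, 1, 2]
psi = [0.20, 0.25, 0.45, 0.50, 0.40, 0.70, 0.75, 0.30, 0.55, 0.65]

psi_hat = leave_one_out_predictions(dosage, psi)
r = pearson_r(psi_hat, psi)
print(f"leave-one-out Pearson r = {r:.3f}, r^2 = {r * r:.3f}")
```

Because every prediction comes from a model that never saw the held-out sample, the resulting correlation between predicted and observed PSI is an out-of-sample measure, which is the quantity the Q-Q plot in panel B tests against a null.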
Mendelian randomization (MR)-based analysis of COGA. A Overview of the MR-based analysis. Genetic variant (X) is the instrumental variable. The intermediate molecular trait (i.e., exposure) is the genetically predicted PSI, $\hat{\Psi}_{(x)}$, for RNA splicing, and the phenotypic variable is the trait (Y). $\hat{\Psi}$ is inferred from X using the elastic net (EN) models and the association between $\hat{\Psi}$ and Y is evaluated by generalized estimating equation (GEE). Splicing events showing significant associations with the trait Y are putatively causal for the trait. This MR pipeline was run in the discovery cohort COGA and repeated in the replication cohort OZ-ALC. The number of subjects for each phenotype is provided. B, C Manhattan plots of significant splicing events. Chromosomal distribution of significance of association for all splicing events ($\hat{\Psi}$, Y), with respect to the DSM-IV (B) and SXCT (C). Blue line, $-\log_{10}$ of the p-value corresponding to FDR = 0.05; green dots, significant events in discovery cohort; red dots, replicated significant events. D, E Effect sizes of the replicated events. Forest plots of the effect sizes of the six replicated events. The estimates of effect sizes (beta) in COGA (D) and OZ-ALC (E) are consistent. The rectangles are the estimates and hashed lines represent the 95% CI.
LLM interpretation
This figure illustrates a Mendelian randomization (MR) pipeline and its results for analyzing the causal relationship between RNA splicing and specific traits. Panel A is a flow diagram showing the process from genotype (X) to genetically predicted PSI ($\hat{\Psi}$) via an elastic net model, and finally to the trait (Y) via a generalized estimating equation. Panels B and C are Manhattan plots showing the chromosomal distribution of p-values for alcohol dependence and symptom count, with green and red dots indicating significant events in the discovery and replication cohorts, respectively. Panels D and E are forest plots displaying the effect sizes (beta) and 95% confidence intervals for six replicated splicing events across the COGA and OZ-ALC cohorts.
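The two steps of the MR-based analysis in panel A can be written compactly. This is a sketch in notation introduced here rather than the authors' exact specification: $\hat{w}_0$ and $\hat{w}_j$ stand for the fitted elastic net intercept and the weight of variant $j$, $X_{ij}$ is the genotype of subject $i$ at variant $j$, $g$ is the GEE link function, and any covariates the authors adjusted for are omitted.

$$\hat{\Psi}_i = \hat{w}_0 + \sum_j \hat{w}_j X_{ij}, \qquad g\big(\mathbb{E}[Y_i]\big) = \alpha + \beta\,\hat{\Psi}_i$$

Under this reading, a splicing event is flagged as putatively causal when the estimate of $\beta$ differs significantly from zero at the FDR threshold shown in the Manhattan plots.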
Functional analysis exemplified by the ELOVL7 splicing event. A Schematic of two ELOVL7 splice variants. Splicing pattern and gene structure were adapted from Ensembl genome browser. The skipped exon (SE) is highlighted. Open and filled boxes represent untranslated and protein coding regions, respectively. B Sample stratification. CMC samples with genetically imputed PSI values greater than the level marked by the red dashed line (n = 200) were labeled as high, and those less than the level marked by the blue dashed line (n = 139) were labeled as low. Intervening samples (n = 41) were unused. C Differentially expressed (DE) genes. Volcano plot shows the $-\log_{10}$ FDR (y-axis) versus the $\log_2$ fold-change (FC, x-axis). Red dots are the differentially expressed genes between the high and low PSI groups in (B) with FDR < 0.05. D Gene Ontology (GO) pathway enrichment of DE genes. Pathways enriched by the DE genes with FDR < 0.05 are shown with the respective gene ratios. The color represents the FDR, i.e., the Benjamini-Hochberg-adjusted p values. The size of dots indicates the gene count. E, F Examples of two neural pathways enriched in GSEA. These pathways were enriched by genes upregulated in samples having high PSI level. The green line is the running enrichment score and the red dashed line marks the maximum score, which corresponds to the leading-edge subset of genes that optimally contribute to the enrichment. Genes (black bars) were ranked high (red) to low (blue) based on $\log_2$ FC between the high and low PSI groups in (B). The normalized enrichment score (NES) and the FDR of enrichment are shown. G–J Results of Generalized Summary data-based Mendelian Randomization (GSMR). Effect sizes of SNV (x) on trait (y), $\beta_{x,y}$, were plotted versus the effect sizes of SNV (x) on splicing ($\Psi$), $\beta_{x,\Psi}$. The estimated slope ($\hat{\beta}_{\Psi,y}$), which is the coefficient of least-square (LS) regression, is shown with the p value. G Psychiatric Genomics Consortium (PGC) GWAS of alcohol dependence (AD, DSM-IV). H Million Veteran Program (MVP), alcohol use disorder (AUD, ICD10). I UK Biobank (UKB), problematic drinking (AUDIT-P). J GWAS and Sequencing Consortium of Alcohol and Nicotine Use (GSCAN), drinks per week (DrnkWk).
LLM interpretation
This figure presents a functional analysis of *ELOVL7* splicing across several panels. It includes a gene structure schematic (A), a histogram of sample stratification by imputed PSI (B), a volcano plot of differentially expressed genes (C), and a dot plot of GO pathway enrichment (D). Additionally, it shows GSEA enrichment plots for glial cell differentiation and neurogenesis (E, F), and four scatter plots (G–J) showing GSMR results with least-square regression slopes and p-values relating splicing to various alcohol-related traits.
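The slope plotted in panels G to J relates per-variant effects on the trait to per-variant effects on splicing. The sketch below shows one simple way to obtain such a slope, assuming an unweighted least-squares line through the origin; the caption does not say whether an intercept or weights were used, and the full GSMR method additionally accounts for standard errors and linkage disequilibrium, which this example ignores. The effect sizes are made-up toy values and the function name is hypothetical.

```python
def slope_through_origin(beta_x_psi, beta_x_y):
    """Least-squares slope of beta_x_y on beta_x_psi with no intercept."""
    numerator = sum(bx * by for bx, by in zip(beta_x_psi, beta_x_y))
    denominator = sum(bx * bx for bx in beta_x_psi)
    if denominator == 0:
        raise ValueError("all SNV effects on splicing are zero")
    return numerator / denominator


# Toy summary statistics (assumed): one entry per instrument SNV.
beta_x_psi = [0.10, 0.20, -0.15, 0.30]    # effect of each SNV on splicing
beta_x_y = [0.021, 0.039, -0.031, 0.060]  # effect of each SNV on the trait

beta_psi_y = slope_through_origin(beta_x_psi, beta_x_y)
print(f"estimated effect of splicing on trait: {beta_psi_y:.3f}")
```

Forcing the line through the origin encodes the MR assumption that a variant with no effect on splicing should have no effect on the trait through this pathway.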
Impact of the ELOVL7 exon skipping event in the brain. Regions showing significant changes of gray matter volume in UK Biobank (UKB) subjects with high cis-regulated PSI ($\hat{\Psi}$) compared with individuals having low $\hat{\Psi}$. FDR of the changes were mapped to the Desikan–Killiany atlas regions.
LLM interpretation
This figure consists of brain anatomical maps showing regions of significant gray matter volume changes in UK Biobank subjects based on ELOVL7 exon skipping levels ($\hat{\Psi}$). The visualization uses a color scale to map False Discovery Rate (FDR) values, ranging from 0.00 (dark red) to 0.03 (light yellow). Labeled regions of significance include the lateral occipital (left), bankssts (left and right), transverse temporal (left and right), fusiform (right), calcarine, and entorhinal (left and right) cortices.
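Per-region FDR values like those mapped here are typically obtained by adjusting one p value per atlas region for multiple testing. The sketch below uses the Benjamini-Hochberg procedure, which the source names for the GO enrichment analysis; whether the same procedure was used for the brain map is an assumption. The region labels and p values are placeholders, not results from the study.

```python
def benjamini_hochberg(pvalues):
    """Return Benjamini-Hochberg adjusted p values in the input order."""
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p value down, enforcing monotonicity.
    for rank in range(m, 0, -1):
        idx = order[rank - 1]
        running_min = min(running_min, pvalues[idx] * m / rank)
        adjusted[idx] = running_min
    return adjusted


# Placeholder per-region p values (assumed, for illustration only).
region_p = {
    "region_1": 0.001,
    "region_2": 0.008,
    "region_3": 0.039,
    "region_4": 0.041,
    "region_5": 0.60,
}

names = list(region_p)
fdr = dict(zip(names, benjamini_hochberg([region_p[n] for n in names])))
significant = [n for n in names if fdr[n] < 0.05]
print(significant)
```

Only regions whose adjusted value falls below the chosen cutoff would be colored on a map of this kind; the rest would be left blank.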