We stratified the CMC samples into groups with low and high levels of the genetically determined $\hat{\Psi}$ for each skipped exon identified in COGA and replicated in OZ-ALC. Read counts for the respective groups of samples (G1: low $\hat{\Psi}$; G2: high $\hat{\Psi}$) were retrieved from the RNA-seq data, and a gene-by-sample read count matrix was constructed. We considered only autosomal genes and removed low-expression genes, defined as those with ≤ 1 CPM in more than $N$ samples, where $N = \frac{1}{2}\min(n_1, n_2)$ and $n_1$ and $n_2$ are the numbers of samples in G1 and G2, respectively. We used the TMM method in the R package edgeR (version 3.34.1) [34] to normalize the read counts. Differentially expressed genes were identified in edgeR using a negative binomial model adjusted for two covariates: sex and sequencing cohort. The significance cutoff was FDR < 0.05.
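The low-expression filter described above can be illustrated with a minimal Python sketch. This is not the code used in the analysis, which was carried out in R with edgeR; the function names `cpm` and `filter_low_expression` and the toy count matrix are hypothetical. The sketch also assumes that CPM is computed from raw library sizes (the total read count per sample), which the text does not specify.

```python
def cpm(counts):
    """Convert raw counts to counts per million (CPM).

    counts: dict mapping gene name -> list of raw read counts, one per sample.
    Assumption: library size is the plain column sum of the count matrix.
    """
    n_samples = len(next(iter(counts.values())))
    lib_sizes = [sum(row[j] for row in counts.values()) for j in range(n_samples)]
    return {
        gene: [1e6 * c / lib_sizes[j] if lib_sizes[j] else 0.0
               for j, c in enumerate(row)]
        for gene, row in counts.items()
    }


def filter_low_expression(counts, n1, n2, cpm_cutoff=1.0):
    """Drop genes with CPM <= cpm_cutoff in more than N samples,
    where N = 0.5 * min(n1, n2) and n1, n2 are the sizes of G1 and G2."""
    threshold = 0.5 * min(n1, n2)
    kept = {}
    for gene, values in cpm(counts).items():
        n_low = sum(1 for v in values if v <= cpm_cutoff)
        if n_low <= threshold:  # retained unless low in MORE than N samples
            kept[gene] = counts[gene]
    return kept


# Toy matrix: 4 samples (2 in G1, 2 in G2), each with library size 1,000,000.
toy_counts = {
    "A": [600000, 600000, 600000, 600000],
    "B": [399998, 399995, 399995, 399995],
    "C": [1, 1, 1, 1],   # CPM = 1 in all 4 samples -> removed
    "D": [1, 4, 4, 4],   # CPM <= 1 in only 1 sample (N = 1) -> kept
}
kept = filter_low_expression(toy_counts, n1=2, n2=2)
print(sorted(kept))  # ['A', 'B', 'D']
```

With two samples per group, $N = 1$, so a gene is removed only when its CPM is ≤ 1 in at least two samples. Gene D, which is low in a single sample, is therefore retained.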