We next asked whether these 27 events could be replicated in an independent dataset. To address this question, we conducted the same analysis using the Australian Twin-family Study of Alcohol Use Disorder (OZ-ALC) dataset, which included 2856 individuals [24]. We found that six of the 27 skipped exon (SE) events were replicated with FDR < 0.05; these events were among the top candidates ranked by p-value in the COGA discovery cohort (Fig. 2B, C). Moreover, the effect sizes of all six SE events were consistent between the COGA and OZ-ALC cohorts (Fig. 2D, E). Detailed information for the six events is summarized in Table 1.

Table 1. Causal exons for alcohol use disorder.

| SE | Host gene | Exon | Chromosome | Exon function: affects host gene expression | Exon function: protein coding | Exon function: functional domain (Pfam) | Predictive modeling: num. of samples (CMC) | Predictive modeling: num. of variants | Predictive modeling: P-value | Predictive modeling: R² | Discovery: phenotype | Discovery: FDR | Replication: FDR |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 1 | LINC00665 | ENSE00002438745 | chr19 | No | No | N.A. | 380 | 3 | 3.42E-87 | 64.4% | SXCT | 0.021 | 0.005 |
| 2 | NSUN4 | ENSE00001875548 | chr1 | Yes | No | N.A. | 737 | 2 | 3.75E-42 | 22.1% | SXCT | 0.031 | 0.005 |
| 3 | SRRM2 | ENSE00002674786 | chr16 | No | No (a) | No | 380 | 6 | 4.86E-06 | 3.1% | DSM-IV | 0.032 | 0.029 |
| 4 | ELOVL7 | ENSE00002079807 | chr5 | Yes | No | N.A. | 380 | 11 | 9.13E-42 | 39.5% | SXCT | 0.031 | 0.033 |
| 5 | DRC1 | ENSE00003572542 | chr2 | Yes | Yes | PF14772 | 338 | 15 | 5.99E-39 | 38.9% | SXCT | 0.031 | 0.033 |
| 6 | TBC1D5 | ENSE00001693995 | chr3 | N.A. | No | N.A. | 376 | 14 | 1.66E-27 | 26.4% | SXCT | 0.031 | 0.033 |

Genome assembly: GRCh38/hg38. PF14772: Dynein regulatory complex protein 1/2, N-terminal domain. DSM-IV indicates alcohol dependence diagnosis. SXCT: symptom count. Sample numbers for SXCT and DSM-IV in the discovery cohort are 7421 and 4760, respectively. The replication sample number is 2856.
(a) Contains a stop codon triggering nonsense-mediated decay.
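
The replication criterion (FDR < 0.05) can be illustrated with a minimal sketch. The text does not state which FDR procedure was used, so the Benjamini-Hochberg adjustment below is an assumption, and the function names and p-values are hypothetical placeholders rather than values from either cohort.

```python
def benjamini_hochberg(p_values):
    """Return Benjamini-Hochberg adjusted p-values in the input order.

    Assumption: the FDR values in the text could come from a different
    procedure; BH is used here only as a common default.
    """
    m = len(p_values)
    # Indices of the p-values sorted from smallest to largest.
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down, enforcing monotonicity.
    for rank in range(m, 0, -1):
        idx = order[rank - 1]
        running_min = min(running_min, p_values[idx] * m / rank)
        adjusted[idx] = running_min
    return adjusted


def replicated(p_values, alpha=0.05):
    """Flag events whose adjusted p-value falls below the FDR threshold."""
    return [q < alpha for q in benjamini_hochberg(p_values)]


# Hypothetical replication p-values for five candidate events.
example_p = [0.001, 0.008, 0.039, 0.041, 0.6]
example_flags = replicated(example_p)
```

In this toy input only the first two events pass the threshold, because the adjustment scales each p-value by the number of tests divided by its rank.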