Chunk #80 — ONLINE METHODS — Normalization of Gene Expression and Adjustment for Covariates — Isoform-level normalization and analyses

Source: Gene expression elucidates functional impact of polygenic risk for schizophrenia.
Embedded: yes

Text

Relative isoform abundances were estimated using the MISO software package. PSI estimates (percent spliced in; i.e., the fraction of a gene's expression attributed to each isoform) and the standard deviations of those estimates were calculated for a total of 160,305 isoforms. The isoforms were initially filtered to include only those deriving from genes expressed at CPM > 1 in at least 50% of the samples (the same 16,423 genes used in gene-level analyses). To obtain absolute abundance estimates of isoform expression (“isoform-assigned” CPM), the isoform PSI values were multiplied by their respective effective isoform lengths (ref. 67) to control for variable isoform length and re-normalized to sum to 1 within each gene; these fractions were then multiplied by the HTSeq gene-level read counts, and the resulting isoform-level counts were converted to CPM and log(CPM) using voom. Next, we retained only isoforms that had sufficient expression for analysis (CPM > 0.5 and PSI > 0.01 in more than 50% of the samples) and sufficiently well-estimated PSI (standard deviation of the PSI estimate across MISO iterations < 0.1, and a coefficient of variation on the estimate < 0.5 in more than
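The isoform-assigned count calculation and the expression filter described in this section can be sketched in Python with NumPy as follows. This is a minimal illustration, not the authors' pipeline: the function names, array layouts, and toy numbers are hypothetical. `log2_cpm` reproduces only voom's log2-CPM transform, log2((count + 0.5) / (library size + 1) × 10^6), not its precision weights, and `expressed_mask` assumes that the CPM and PSI thresholds must both hold in the same sample, which the text does not state explicitly.

```python
import numpy as np

def isoform_assigned_counts(psi, eff_len, gene_counts, gene_index):
    """Distribute gene-level read counts among isoforms for one sample.

    psi         : (n_isoforms,) PSI per isoform
    eff_len     : (n_isoforms,) effective isoform lengths
    gene_counts : (n_genes,) gene-level read counts
    gene_index  : (n_isoforms,) integer index of each isoform's parent gene
    """
    weights = psi * eff_len  # length-weighted PSI
    # Per-gene sum of weights, used to re-normalize to 1 within each gene.
    gene_totals = np.bincount(gene_index, weights=weights,
                              minlength=len(gene_counts))
    denom = gene_totals[gene_index]
    frac = np.divide(weights, denom,
                     out=np.zeros_like(weights), where=denom > 0)
    return frac * gene_counts[gene_index]

def log2_cpm(counts):
    """voom-style log2 CPM for one sample (transform only, no weights)."""
    lib_size = counts.sum()
    return np.log2((counts + 0.5) / (lib_size + 1.0) * 1e6)

def expressed_mask(cpm, psi, min_cpm=0.5, min_psi=0.01, min_frac=0.5):
    """Keep isoforms passing both thresholds in more than min_frac of samples.

    cpm, psi : (n_isoforms, n_samples) arrays.
    Assumption: both thresholds are evaluated jointly per sample.
    """
    ok = (cpm > min_cpm) & (psi > min_psi)
    return ok.mean(axis=1) > min_frac

# Toy example: gene 0 has two isoforms, gene 1 has one.
psi = np.array([0.75, 0.25, 1.0])
eff_len = np.array([1000.0, 3000.0, 500.0])
gene_counts = np.array([120.0, 40.0])
gene_index = np.array([0, 0, 1])

counts = isoform_assigned_counts(psi, eff_len, gene_counts, gene_index)
log_cpm = log2_cpm(counts)
```

In the toy example, the two isoforms of gene 0 have equal length-weighted PSI (0.75 × 1000 and 0.25 × 3000), so each is assigned half of that gene's 120 reads.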