Chunk #71 — ONLINE METHODS — Normalization of Gene Expression and Adjustment for Covariates — Initial normalization of read counts

Source: Gene expression elucidates functional impact of polygenic risk for schizophrenia.
Embedded: yes

Text

To define the set of covariates for adjustment, we first performed an initial normalization of the HTSeq read count matrix for all 56,632 Ensembl genes, using voom without covariates. Next, we filtered out genes with low expression in a substantial fraction of the cohort, retaining the 16,423 genes with at least 1 CPM in at least 50% of the individuals; only these genes were carried forward into all subsequent analyses. This initially normalized gene expression matrix was then used to select known covariates (described above). Next, hidden covariates were derived (for use in eQTL analyses only, as is common practice¹³). These covariates were then included in the subsequent normalization and adjustment steps.
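
The expression filter described above (at least 1 CPM in at least 50% of the individuals) can be illustrated with a minimal sketch. This is not the authors' pipeline: the study used voom, an R/limma function that produces log2-CPM values with precision weights, whereas the sketch below computes plain library-size CPM with NumPy. The function names `cpm` and `expressed_gene_mask` and the toy count matrix are hypothetical and introduced only for illustration.

```python
import numpy as np

def cpm(counts):
    """Counts per million: scale each individual (column) by its library size."""
    counts = np.asarray(counts, dtype=float)
    lib_sizes = counts.sum(axis=0)  # total reads per individual
    return counts * 1e6 / lib_sizes

def expressed_gene_mask(counts, min_cpm=1.0, min_fraction=0.5):
    """True for genes with CPM >= min_cpm in at least min_fraction of individuals."""
    passing = cpm(counts) >= min_cpm
    return passing.mean(axis=1) >= min_fraction

# Toy matrix (made-up values): 4 genes (rows) x 4 individuals (columns).
toy_counts = np.array([
    [600_000, 500_000, 700_000, 400_000],  # highly expressed everywhere
    [400_000, 500_000, 300_000, 600_000],  # highly expressed everywhere
    [5, 5, 0, 0],                          # above 1 CPM in 2 of 4 individuals
    [0, 0, 0, 5],                          # above 1 CPM in 1 of 4 individuals
])

keep = expressed_gene_mask(toy_counts)
filtered = toy_counts[keep]  # genes carried forward to later steps
```

In this toy example the first three genes pass the filter and the fourth is dropped, since it reaches 1 CPM in only 25% of the individuals. The thresholds are exposed as parameters so the same mask can be recomputed under a different cutoff.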