Chunk #141 — ONLINE METHODS — Generation of gene sets for enrichment analyses of differential expression

Source
Gene expression elucidates functional impact of polygenic risk for schizophrenia.

Text

We sought to retain sets that were relevant to the DLPFC brain expression we observed here, and to address overlap between the three databases, using the following strategy. We retained a gene set only if at least 10% of its genes were expressed in DLPFC (that is, were among the 16,423 genes passing the expression-level threshold). For each set, we filtered out any genes not expressed in DLPFC. We then retained only sets with a final number of genes between 10 and 1,000. When adding the latter two databases, we did not include any set with a Jaccard overlap index > 0.5 with a GO set already included (since in such cases, a substantial portion of the genes were already included in the GO set and the added test would likely be redundant). This procedure yielded 2,902 gene sets in total: 1,938 sets from GO, 824 from Reactome, and 140 gene families.
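
The filtering strategy described above can be sketched in a few lines of Python. This is an illustrative sketch only, not the authors' code: the function names (`jaccard`, `filter_sets`, `add_non_redundant`) and the dictionary-of-sets input layout are hypothetical. It also assumes that the size bounds of 10 and 1,000 are inclusive, that the Jaccard index is computed on the sets after restriction to expressed genes, and that set names are unique across databases; the text does not specify any of these.

```python
def jaccard(a, b):
    """Jaccard index |a & b| / |a | b| of two sets (0.0 if both are empty)."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def filter_sets(gene_sets, expressed, min_frac=0.10, min_size=10, max_size=1000):
    """Apply the expression and size filters to one database.

    gene_sets: dict mapping set name -> iterable of gene IDs (hypothetical layout).
    expressed: set of gene IDs passing the expression-level threshold.
    """
    kept = {}
    for name, genes in gene_sets.items():
        genes = set(genes)
        if not genes:
            continue
        in_dlpfc = genes & expressed
        # Require at least 10% of the set's genes to be expressed.
        if len(in_dlpfc) / len(genes) < min_frac:
            continue
        # Keep only the expressed genes, then apply the size bounds
        # (assumed inclusive here).
        if min_size <= len(in_dlpfc) <= max_size:
            kept[name] = in_dlpfc
    return kept


def add_non_redundant(go_sets, other_sets, max_jaccard=0.5):
    """Add sets from another database unless the Jaccard index with any
    already-included GO set exceeds max_jaccard."""
    merged = dict(go_sets)
    for name, genes in other_sets.items():
        if all(jaccard(genes, go) <= max_jaccard for go in go_sets.values()):
            merged[name] = genes
    return merged
```

Under these assumptions, one would call `filter_sets` once per database with the set of expressed genes, then call `add_non_redundant` with the filtered GO sets and each of the other two filtered databases in turn. Redundancy is checked only against GO sets, as the text describes, not between the two added databases.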