
Chunk #46 — ONLINE METHODS — Secondary analyses in core PGC dataset — Gene Set Enrichments — Tissue and cell types

Source
Mapping genomic loci implicates genes and synaptic biology in schizophrenia.

Text

Datasets were processed uniformly (ref. 25). First, we calculated the mean expression of each gene for each type of data if these statistics were not provided by the authors. For the GTEx tissues (v8), we used the pre-computed median expression (transcripts per million (TPM)) across individuals. For the GTEx dataset, we excluded tissues with fewer than 100 samples, merged tissues by organ (with the exception of brain tissues), and excluded non-natural tissues (e.g. EBV-transformed lymphocytes) and testis (an outlier in hierarchical clustering), resulting in 37 tissues. Genes without unique names and genes not expressed in any cell type were excluded. We scaled the expression data to 1M Unique Molecular Identifiers (UMIs) or TPM for each cell type/tissue. After scaling, we excluded non-protein-coding genes and, for mouse datasets, genes that had no expert-curated 1:1 orthologs between mouse and human (Mouse Genome Informatics, The Jackson Laboratory, version 11/22/2016). We then calculated a metric of gene expression specificity by dividing the expression of each gene in each cell type/tissue by the total expression of that gene in all cell types/tissues, leading to values ranging from
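
The scaling and specificity steps described above can be illustrated with a minimal Python sketch. It is not the authors' pipeline. The function name `expression_specificity`, the dictionary input layout, and the toy gene names and values are all hypothetical, and the protein-coding and ortholog filters are omitted. The sketch assumes non-negative expression values and at least one expressed gene per cell type/tissue.

```python
def expression_specificity(expr):
    """Compute per-gene expression specificity across cell types/tissues.

    `expr` is a hypothetical input layout: a dict mapping a gene name to a
    list of mean expression values, one per cell type/tissue.
    """
    # Drop genes that are not expressed in any cell type/tissue.
    expr = {gene: values for gene, values in expr.items() if sum(values) > 0}
    if not expr:
        return {}

    n_types = len(next(iter(expr.values())))

    # Scale each cell type/tissue so that its expression sums to 1M
    # (assumes every cell type/tissue has a non-zero total).
    totals = [sum(values[j] for values in expr.values()) for j in range(n_types)]
    scaled = {
        gene: [values[j] / totals[j] * 1e6 for j in range(n_types)]
        for gene, values in expr.items()
    }

    # Specificity: expression in one cell type/tissue divided by the gene's
    # total expression over all cell types/tissues.
    specificity = {}
    for gene, values in scaled.items():
        gene_total = sum(values)
        specificity[gene] = [value / gene_total for value in values]
    return specificity


# Toy data (invented for illustration): three genes, three cell types.
toy = {
    "GENE_A": [10.0, 0.0, 30.0],
    "GENE_B": [90.0, 50.0, 70.0],
    "GENE_C": [0.0, 0.0, 0.0],  # never expressed, so it is dropped
}
spec = expression_specificity(toy)
for gene, values in spec.items():
    print(gene, [round(v, 3) for v in values])
```

In this toy example each gene's specificity values sum to 1 across cell types, and `GENE_A` receives a specificity of 0.75 in the third cell type (up to floating-point rounding), because three quarters of its scaled expression falls there.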