Chunk #32 — Methods — Gene expression analyses

Source: Expression quantitative trait loci in the developing human brain and their enrichment in neuropsychiatric disorders.
Embedded: yes

Text

Sequencing adapters were removed from reads with Cutadapt using the Trim Galore! wrapper script. QC analyses were carried out using FastQC before and after adapter trimming, and additionally using RSeQC [59] after mapping to the GRCh38 human genome reference sequence with HISAT2 [60]. We used bcftools v1.6 to call SNPs in the mapping files and compared them to the imputed genotypes using bcftools gtcheck. Transcript abundance was quantified by pseudoalignment of sequencing reads to transcript sequences derived from the GRCh38 human genome reference sequence and Ensembl (version 81) reference annotation using Kallisto [61]. Reads were aggregated at the gene level using tximport [62] and biomaRt [63]. Between-sample normalization and variance-stabilizing transformation were carried out using DESeq2 [64], and genes/transcripts that had VST-normalized count values > 5 in 10 or more samples were included in all subsequent analyses. The expression matrix was quantile-normalized before and after correcting for covariates (see below), and principal component analysis was used to assess between-sample heterogeneity (Additional file 2: Figure S1). The influence of covariates on gene expression was also quantified using the variancePartition package in R [65] (Additional file 2: Figure S2).
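The expression filter and quantile normalization described above can be illustrated with a minimal Python sketch. This is not the authors' code, since the analysis in the text was run with R packages. It assumes a genes-by-samples matrix of VST values and assumes that quantile normalization is applied across samples, so that every sample shares one reference distribution; the text does not state which variant was used. The function names `filter_expressed` and `quantile_normalize` and the toy matrix are hypothetical.

```python
import numpy as np


def filter_expressed(vst, min_value=5.0, min_samples=10):
    """Keep genes (rows) whose VST value exceeds min_value in at least min_samples samples.

    The thresholds default to the values stated in the text (> 5 in 10 or more samples).
    Returns the filtered matrix and the boolean row mask.
    """
    keep = (vst > min_value).sum(axis=1) >= min_samples
    return vst[keep], keep


def quantile_normalize(mat):
    """Give every sample (column) the same empirical distribution.

    Assumption: between-sample quantile normalization. Tied values within a
    column are ranked by their order of appearance rather than averaged.
    """
    # Rank of each value within its own column (0 = smallest).
    order = np.argsort(mat, axis=0, kind="stable")
    ranks = np.argsort(order, axis=0, kind="stable")
    # Reference distribution: mean of the k-th smallest value across columns.
    reference = np.sort(mat, axis=0).mean(axis=1)
    # Replace each value by the reference value at its rank.
    return reference[ranks]


# Toy data standing in for a VST matrix: 200 genes x 12 samples (hypothetical).
rng = np.random.default_rng(0)
toy_vst = rng.normal(loc=7.0, scale=2.0, size=(200, 12))

expressed, mask = filter_expressed(toy_vst)
normalized = quantile_normalize(expressed)
```

In this sketch the filter is applied first, so the reference distribution is built only from genes that pass the expression threshold. After normalization, sorting any column of `normalized` gives the same vector of values, while the within-sample ordering of genes is unchanged.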