Chunk #37 — ONLINE METHODS — Data acquisition, quality control, and normalization — RNA-Seq data.

Source: Integrative transcriptome analyses of the aging brain implicate altered splicing in Alzheimer's disease susceptibility.
Embedded: yes

Text

The RNA-Seq data were processed by a parallelized pipeline. This pipeline includes trimming the beginning and ending bases from each read, identifying and trimming adapter sequences from reads, detecting and removing rRNA reads, and aligning reads to reference genome. Specifically, RNA-Seq reads in FASTQ format were inspected using FASTQC program. Barcode and adapter contamination, low-quality regions (8bp at beginning and 7bp at ending of each FASTQ reads) were trimmed using FASTX-toolkit. To remove rRNA contamination, we aligned trimmed reads to rRNA reference (rRNA genes were downloaded from UCSC genome browser selecting the RepeatMask table) by BWA then extracted only paired unmapped reads for transcriptome alignment. STAR (v2.5)56 (was used to align reads to the transcriptome reference, and RSEM (v1.3.0)57 was used to estimate expression levels for all transcripts. To quantify the contribution of experimental and other confounding factors to the overall expression profiles, we used the COMBAT algorithm58 to account for the effect of batch and linear regression to remove the effects of RIN, post-mortem interval (PMI), sequencing depth, study index (ROS sample or MAP sample), genotyping PCs, age at