Chunk #12 — Methods — Sequence base-calling, mapping to genome, data normalization and statistical analysis

Source: GABAergic gene expression in postmortem hippocampus from alcoholics and cocaine addicts; corresponding findings in alcohol-naïve P and NP rats.
Embedded: yes

Text

Sequences were called from image files with the Illumina Genome Analyzer Pipeline (GApipeline) and aligned to the reference genome (UCSC hg18 for human, UCSC rn4 for rat) using Extended Eland within the GApipeline. For each human and rat sample, a total of 3 million uniquely mapped RNA-Seq reads were retrieved from the export.txt files output by Extended Eland. Based on their mapping locations, the selected reads were parsed with in-house Perl scripts to generate per-base coverage in WIG file format. After moving-average smoothing, the chromosomal locations of enrichment peaks were identified from the pooled WIG files using in-house Perl scripts. For each sample, the average read coverage over the most abundantly covered 50 bp within a single exon of each annotated RefSeq gene was recorded as the read count for that gene. The read counts were then log2-transformed and quantile-normalized (BioConductor limma package).
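
The coverage-to-expression steps described above can be illustrated with a minimal Python sketch. It is not the authors' in-house Perl code or the limma implementation; it only shows the general technique. The function names, the smoothing window of 5 bp, and the pseudocount of 1 added before the log2 transform are assumptions made for illustration, since the text does not specify them. The quantile normalization shown here is the basic rank-based form and does not average tied values.

```python
import numpy as np


def moving_average(coverage, window=5):
    """Smooth a per-base coverage vector with a centered moving average.

    The window size (5 bp) is an assumed value; the source does not state one.
    Positions near the ends are averaged against implicit zeros.
    """
    coverage = np.asarray(coverage, dtype=float)
    kernel = np.ones(window) / window
    return np.convolve(coverage, kernel, mode="same")


def peak_window_mean(coverage, width=50):
    """Mean coverage of the most abundantly covered `width`-bp stretch of an exon."""
    coverage = np.asarray(coverage, dtype=float)
    if coverage.size < width:
        # Exon shorter than the window: fall back to the whole-exon mean.
        return float(coverage.mean())
    # Cumulative sums give every sliding-window sum in one pass.
    csum = np.concatenate(([0.0], np.cumsum(coverage)))
    window_sums = csum[width:] - csum[:-width]
    return float(window_sums.max() / width)


def quantile_normalize(matrix):
    """Basic quantile normalization of a genes x samples matrix.

    Each column is forced onto the same distribution: the mean of the
    sorted columns. Ties are not averaged in this sketch.
    """
    matrix = np.asarray(matrix, dtype=float)
    order = np.argsort(matrix, axis=0)      # per-sample ordering of genes
    sorted_vals = np.sort(matrix, axis=0)
    reference = sorted_vals.mean(axis=1)    # shared target distribution
    normalized = np.empty_like(matrix)
    for j in range(matrix.shape[1]):
        normalized[order[:, j], j] = reference
    return normalized


def normalize_counts(counts, pseudocount=1.0):
    """log2-transform, then quantile-normalize a genes x samples count matrix.

    The pseudocount is an assumption that keeps log2 defined for zero counts.
    """
    logged = np.log2(np.asarray(counts, dtype=float) + pseudocount)
    return quantile_normalize(logged)
```

In this sketch, each gene's value per sample would come from `peak_window_mean` applied to the coverage of one exon, and the resulting genes x samples matrix would be passed to `normalize_counts`.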