
Chunk #55 — ONLINE METHODS — RNA Sample QC

Source
Gene expression elucidates functional impact of polygenic risk for schizophrenia.

Text

RNA-seq outliers were detected using two methods in parallel. To evaluate the data for outliers, one group of analysts used four approaches to normalization: FPKM (fragments per kilobase per million reads) from Cufflinks; quantile normalization across samples; quantile normalization across genes; and trimmed mean of M values (TMM) from the edgeR package (refs. 62, 63). We applied three different methods of analysis to these normalized data sets: hierarchical clustering with average linkage (HC); the number of extreme transcripts (NT: the number of transcripts with expression value outside the 95% confidence interval for the transcript, across individuals); and principal component analysis (PCA). For HC, a sample (or small group of samples) was declared an outlier if it did not cluster with other samples. For NT, a sample was declared an outlier if its NT exceeded 7.6% of total transcripts. Finally, for PCA, a sample or small group of samples was declared an outlier if it was represented by a leading PC (one of the five largest). When combining these results, a sample was labeled an outlier if it was declared an outlier by all three methods. Separately, another group of analysts applied
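
The NT criterion and the three-method consensus rule described for the first group of analysts can be illustrated with a minimal sketch. This is not the authors' code. It assumes that the "95% confidence interval for the transcript, across individuals" can be approximated by the mean plus or minus 1.96 standard deviations of that transcript across samples, and it takes the HC and PCA outlier calls as precomputed inputs. The function names (`nt_fraction`, `nt_outliers`, `consensus_outliers`) are hypothetical.

```python
from statistics import mean, stdev

# Threshold from the text: a sample is an NT outlier if more than
# 7.6% of its transcripts are "extreme".
NT_THRESHOLD = 0.076


def nt_fraction(expr):
    """Return, per sample, the fraction of extreme transcripts.

    expr[s][t] is the normalized expression of transcript t in sample s.
    ASSUMPTION: the per-transcript 95% interval is approximated as
    mean +/- 1.96 * SD across samples; the source does not give the formula.
    """
    n_samples = len(expr)
    n_transcripts = len(expr[0])
    counts = [0] * n_samples
    for t in range(n_transcripts):
        values = [expr[s][t] for s in range(n_samples)]
        mu = mean(values)
        sd = stdev(values)
        lower, upper = mu - 1.96 * sd, mu + 1.96 * sd
        for s, value in enumerate(values):
            if value < lower or value > upper:
                counts[s] += 1
    return [count / n_transcripts for count in counts]


def nt_outliers(expr, threshold=NT_THRESHOLD):
    """Indices of samples whose extreme-transcript fraction exceeds the threshold."""
    return {s for s, frac in enumerate(nt_fraction(expr)) if frac > threshold}


def consensus_outliers(hc_flags, nt_flags, pca_flags):
    """A sample is labeled an outlier if all three methods flag it."""
    return set(hc_flags) & set(nt_flags) & set(pca_flags)
```

In this sketch, a sample flagged by only one or two of the methods is retained, which mirrors the combination rule stated above. The HC and PCA calls would come from separate steps (cluster membership and inspection of the leading principal components) that are not modeled here.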