
Chunk #56 — ONLINE METHODS — RNA Sample QC

Source
Gene expression elucidates functional impact of polygenic risk for schizophrenia.
Embedded
yes

Text

group of samples represented by a leading PC (largest 5), it was declared an outlier. When combining these results, a sample was labeled an outlier only if all three methods declared it one.

Separately, another group of analysts applied two procedures to detect outliers in the TMM-normalized data: Inter Array Correlation (IAC; ref. 49) and “Iterative” PCA (iPCA). IAC computes the correlation over genes for every pair of samples, plots the distribution of the resulting correlations, and identifies outliers empirically. Here we used 3 standard deviations as the threshold for declaring a sample an outlier. For iPCA, the following algorithm was implemented: the first two PCs were computed from the data; samples beyond the 95% confidence envelope were identified and removed; the first two PCs were then recomputed and outliers again identified and removed; and so on, until no outliers were detected. All samples removed in this way were declared outliers. The full set of samples labeled outliers was then the union of the IAC and iPCA sets.
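
The IAC and iPCA procedures described above can be sketched as follows. This is a minimal illustration and not the authors' code. It rests on several assumptions that the text does not specify: the input is a genes-by-samples matrix of normalized, log-scale expression values; IAC summarizes each sample by its mean correlation with all other samples and flags samples more than 3 standard deviations below the average; and the 95% confidence envelope for iPCA is a bivariate normal ellipse on the first two PCs. The function names and the `max_iter` safeguard are hypothetical.

```python
import numpy as np


def iac_outliers(expr, n_sd=3.0):
    """Flag samples by inter-array correlation (assumed variant).

    expr: genes x samples array. Returns a boolean mask over samples.
    """
    corr = np.corrcoef(expr, rowvar=False)  # samples x samples correlations
    n = corr.shape[0]
    # Mean correlation of each sample with every other sample (diagonal excluded).
    mean_iac = (corr.sum(axis=1) - 1.0) / (n - 1)
    z = (mean_iac - mean_iac.mean()) / mean_iac.std(ddof=1)
    # Assumption: only unusually low mean correlation marks an outlier.
    return z < -n_sd


def ipca_outliers(expr, level=0.95, max_iter=100):
    """Iteratively remove samples outside an assumed elliptical envelope.

    expr: genes x samples array. Returns a boolean mask over samples.
    """
    keep = np.ones(expr.shape[1], dtype=bool)
    # Chi-square quantile with 2 degrees of freedom has the closed form -2*ln(1-p).
    cutoff = -2.0 * np.log(1.0 - level)
    for _ in range(max_iter):
        idx = np.flatnonzero(keep)
        if idx.size < 4:  # too few samples left to estimate two PCs sensibly
            break
        x = expr[:, idx].T                # samples x genes
        x = x - x.mean(axis=0)            # center each gene
        u, s, _ = np.linalg.svd(x, full_matrices=False)
        scores = u[:, :2] * s[:2]         # sample scores on the first two PCs
        # PC scores are uncorrelated, so the squared Mahalanobis distance
        # is the sum of squared scores scaled by each PC's variance.
        d2 = np.sum(scores ** 2 / scores.var(axis=0, ddof=1), axis=1)
        flagged = d2 > cutoff
        if not flagged.any():
            break
        keep[idx[flagged]] = False        # remove flagged samples, then repeat
    return ~keep


def combined_outliers(expr):
    """Union of the IAC and iPCA outlier sets, as in the text."""
    return iac_outliers(expr) | ipca_outliers(expr)
```

Under these assumptions, the number of samples iPCA removes is sensitive to how the envelope is defined, since an ellipse estimated from the remaining samples is recomputed at every pass. The sketch should be read as an illustration of the control flow rather than a reproduction of the published results.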