Chunk #40 — Methods — Statistical analysis

Source: Variability of gene expression profiles in human blood and lymphoblastoid cell lines.
Embedded: yes

Text

For the clustering analysis, we used hierarchical clustering and PCA (using the NIPALS algorithm for estimating latent variables) on the normalised gene expression data of 7,305 probes that were detected across all 56 samples. In the PCA and PLS-DA analysis, the measurements of each expression probe were mean centered prior to the analysis. Using a PLS-DA model, we identified a set of transcripts that discriminates the method of interest from the other four methods. We computed a separate PLS-DA model for each method for which we set two classes as a response variable: one class for the method of interest and one class for the other four methods. We then extracted the w1 variable weights of the expression probes for each of the five PLS-DA models, ranked these variable weights and selected the 5% highest and 5% lowest ranked expression probes for each method. For a single vector, y, Trygg et al. suggested, that w1 should contain more useful interpretational information than the more commonly used regression coefficients [34].