
Chunk #27 — RESULTS AND DISCUSSION — Prostate cancer gene expression data

Source
Classification and selection of biomarkers in genomic data using LASSO.

Text

The example we consider is from a prostate cancer study; a subset of the samples was considered by Dhanasekaran et al. [3]. We focus here on noncancer versus cancer tissues. The samples are profiled using spotted cDNA (i.e., red/green) microarrays; there are initially 101 samples profiled using 10K chips (9984 genes). We have taken the following preprocessing steps: (1) remove genes that are reported as missing in more than 10% of the samples; (2) remove genes that have a variance less than 0.05 across all samples; (3) impute the remaining missing measurements using the median. This leaves a total of 4880 genes for analysis.
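The three preprocessing steps can be sketched in a few lines of NumPy. This is only an illustration, not the authors' code: the function name `preprocess_expression` is hypothetical, the matrix is assumed to have genes in rows and samples in columns with missing values stored as NaN, the variance is assumed to be the sample variance computed over the observed values only, and the median used for imputation is assumed to be the per-gene median (the text does not say which median is meant).

```python
import numpy as np

def preprocess_expression(expr, max_missing_frac=0.10, min_variance=0.05):
    """Filter and impute a genes-by-samples matrix (NaN marks a missing value).

    Returns the cleaned matrix and the original row indices of the kept genes.
    """
    expr = np.asarray(expr, dtype=float)

    # Step 1: drop genes reported as missing in more than 10% of the samples.
    missing_frac = np.isnan(expr).mean(axis=1)
    keep_missing = missing_frac <= max_missing_frac
    kept = np.flatnonzero(keep_missing)
    expr = expr[keep_missing]

    # Step 2: drop genes whose variance is below the threshold.
    # Assumption: sample variance over the non-missing values of each gene.
    variances = np.nanvar(expr, axis=1, ddof=1)
    keep_var = variances >= min_variance
    kept = kept[keep_var]
    expr = expr[keep_var]

    # Step 3: fill the remaining missing values.
    # Assumption: each gap is replaced by the median of that gene's observed values.
    medians = np.nanmedian(expr, axis=1, keepdims=True)
    expr = np.where(np.isnan(expr), medians, expr)

    return expr, kept
```

Applied to the study's data, `expr` would be the 9984 x 101 matrix of expression measurements; whether this sketch reproduces the reported 4880 genes depends on the assumptions listed above.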