Alongside genome-wide association studies and upcoming sequencing studies, there is increasing interest in obtaining large-scale "omics" data from large biobanks and sample collections, including gene expression, proteomic and metabonomic profiles. These biobanks will rely on easy sample collection and handling using robust methodologies, and on sample storage over prolonged periods. While downstream gene expression profiling with microarrays is very reliable for large-scale investigations, challenges remain prior to microarray analysis, including the choice of a relevant sample type and of the RNA and cell isolation method. Blood-based samples will continue to be one of the most readily available sources for gene expression studies in large-scale investigations. Several strategies have been developed, ranging from PAXgene, which captures RNA profiles of all cell types in whole blood and requires no complex cell isolation procedure prior to RNA isolation, to the creation of lymphoblastoid cell lines (LCLs), which comprise a single transformed cell type. Other isolation methods aim to obtain a subset of cell types, such as peripheral blood mononuclear cells (PBMCs) isolated using Ficoll, or lymphocyte subsets isolated using magnetic beads.