Chunk #43 — ONLINE METHODS — Expression analysis

Source
Large-scale genotyping identifies 41 new loci associated with breast cancer risk.

Text

Gene expression, copy number and genotype data were retrieved from the TCGA breast cancer study. Gene expression profiles were measured by TCGA using a custom Agilent 244K expression array. We downloaded the raw expression data and performed preprocessing using the limma R package. Copy number and germline genotype were both measured using the Affymetrix Genome-Wide Human SNP 6.0 array. We used the segmented copy number and called genotype data as provided by TCGA. Intersecting the different genomic data types, we collected 458 primary tumor samples with germline genotypes from blood and both gene expression and somatic copy number data from the tumor. In addition, for 61 samples, we had germline genotype and gene expression data from normal breast tissue from individuals in the TCGA breast cancer study. Expression quantitative trait locus (eQTL) analysis was performed on both sets separately. For cis-eQTL analysis, we considered all genes within 50 kb upstream or downstream of the lead SNP. Fourteen of the risk-associated SNPs are represented directly on the Affymetrix SNP array. For an additional 23, we were able to select proxies on the
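
The sample intersection and the 50 kb cis window described in the text above can be illustrated with a minimal Python sketch. It is not the authors' pipeline (the text names only the limma R package, for preprocessing). The `Gene` record, the function names `cis_genes` and `shared_samples`, the sample IDs, and the gene coordinates are all hypothetical. The sketch also assumes that a gene counts as "in the window" when any part of it overlaps the interval from 50 kb upstream to 50 kb downstream of the SNP; the text does not say whether overlap or transcription start site was used.

```python
from dataclasses import dataclass

# 50 kb on either side of the lead SNP, as stated in the text.
CIS_WINDOW_BP = 50_000


@dataclass(frozen=True)
class Gene:
    """Hypothetical gene record; coordinates are 1-based and inclusive."""
    name: str
    chrom: str
    start: int
    end: int


def cis_genes(snp_chrom, snp_pos, genes, window=CIS_WINDOW_BP):
    """Return genes on the SNP's chromosome that overlap the cis window.

    Assumption: overlap-based membership. A TSS-based rule would need
    strand information that this sketch does not model.
    """
    lo, hi = snp_pos - window, snp_pos + window
    return [
        g for g in genes
        if g.chrom == snp_chrom and g.start <= hi and g.end >= lo
    ]


def shared_samples(*sample_sets):
    """Return the samples present in every data type (set intersection)."""
    return set.intersection(*(set(s) for s in sample_sets))


# Made-up toy data, for illustration only.
genes = [
    Gene("GENE_A", "chr1", 100_000, 120_000),  # contains the SNP position
    Gene("GENE_B", "chr1", 140_000, 150_000),  # 30 kb downstream of the SNP
    Gene("GENE_C", "chr1", 300_000, 310_000),  # far outside the window
    Gene("GENE_D", "chr2", 100_000, 120_000),  # different chromosome
]

genotype_ids = {"S1", "S2", "S3"}
expression_ids = {"S2", "S3", "S4"}
copy_number_ids = {"S2", "S3", "S5"}

# Genes to test against a hypothetical lead SNP at chr1:110,000.
candidates = cis_genes("chr1", 110_000, genes)

# Samples usable for eQTL analysis: those with all three data types.
usable = shared_samples(genotype_ids, expression_ids, copy_number_ids)
```

Restricting to samples with every data type before testing keeps each SNP-gene association based on the same individuals. The window size is a parameter, so a different cis definition can be tried without changing the selection logic.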