Chunk #46 — Online Methods — 1. Data matrix, primary analysis and processing, quality control

Source
Integrative analysis of 111 reference human epigenomes.

Text

Release 9 of the compendium contains uniformly pre-processed and mapped data from multiple profiling experiments (technical and biological replicates from multiple individuals and/or datasets from multiple centers). To reduce redundancy, improve data quality and achieve the uniformity required for our integrative analyses, experiments were subjected to additional processing to obtain comprehensive data for 111 consolidated epigenomes (see the Methods sections below for additional details). Numeric epigenome identifiers (EIDs, e.g. E001) and mnemonics for epigenome names were assigned to each of the consolidated epigenomes. Table S1 (QCSummary sheet) summarizes the mapping of the individual Release 9 samples to the consolidated epigenome IDs. Key metadata such as age, sex, anatomy, epigenome class (see below), ethnicity and solid/liquid status were summarized for the consolidated epigenomes. Datasets corresponding to 16 cell lines from the ENCODE project (with epigenome IDs ranging from E114 to E129) were also used in the integrative analyses [23]. All datasets from the resulting 127 consolidated epigenomes (111 plus the 16 ENCODE cell lines) were subjected to processing filters to ensure uniformity in terms of read-length-based mappability and sequencing depth, as described below.
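The identifier scheme described above can be illustrated with a minimal Python sketch. It assumes that an EID is the letter "E" followed by three zero-padded digits, which is inferred only from the examples in the text (E001, E114, E129). The helper names `format_eid`, `parse_eid` and `is_encode_eid` are hypothetical and are not part of any published tool. The sketch does not enumerate the 111 non-ENCODE identifiers, since the text does not list them.

```python
import re

# Assumption: an EID is "E" plus three digits, inferred from E001 and E114-E129.
EID_PATTERN = re.compile(r"E(\d{3})")

# Inclusive EID range of the ENCODE cell lines, as stated in the text.
ENCODE_FIRST, ENCODE_LAST = 114, 129


def format_eid(number: int) -> str:
    """Render a numeric identifier as a zero-padded EID string."""
    return f"E{number:03d}"


def parse_eid(eid: str) -> int:
    """Return the numeric part of an EID, or raise ValueError if malformed."""
    match = EID_PATTERN.fullmatch(eid)
    if match is None:
        raise ValueError(f"not a valid EID: {eid!r}")
    return int(match.group(1))


def is_encode_eid(eid: str) -> bool:
    """True if the EID falls in the ENCODE cell-line range E114-E129."""
    return ENCODE_FIRST <= parse_eid(eid) <= ENCODE_LAST


# The ENCODE range covers 16 identifiers; with the 111 consolidated
# epigenomes this gives the 127 epigenomes used in the analyses.
encode_eids = [format_eid(n) for n in range(ENCODE_FIRST, ENCODE_LAST + 1)]
total_epigenomes = 111 + len(encode_eids)
```

A check such as `is_encode_eid("E114")` could be used to separate the ENCODE cell lines from the other consolidated epigenomes when iterating over sample metadata, for example rows exported from the QCSummary sheet of Table S1.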