Chunk #49 — Online Methods — 1. Data matrix, primary analysis and processing, quality control — 1.2 ChIP-seq and DNase-seq uniform reprocessing for consolidated epigenomes — a. Read mapping
For the sequenced datasets in Release 9 of the Epigenome Atlas, a total of 150.21 billion sequencing reads were mapped onto the hg19 assembly of the human genome using the PASH read mapper (ref. 34). Except for the RNA-seq datasets, which were mapped as described above, these read mappings were used to construct the 111 consolidated epigenomes. Only uniquely mapping reads were retained; multiply mapping reads were filtered out. BED files containing the mapped reads were obtained from http://genboree.org/EdaccData/Release-9/. Alignment parameters for each assay type and experiment are specified in the associated publicly accessible Release 9 metadata archived at GEO. For the ENCODE datasets, BAM files containing mapped reads were downloaded from http://hgdownload.cse.ucsc.edu/goldenPath/hg19/encodeDCC/. As with the Release 9 data, only uniquely mapping reads were retained and multiply mapping reads were discarded. Replicates were pooled.
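The filtering and pooling steps described above can be illustrated with a minimal Python sketch. It is not the pipeline used for Release 9 or ENCODE. It assumes, purely for illustration, that each mapped read is a six-column BED record whose fourth column is a read name, and that a multiply mapping read appears as several records sharing that name within one replicate. The names `parse_bed_line`, `keep_unique` and `pool_replicates` are hypothetical helpers introduced here.

```python
from collections import Counter
from typing import Iterable, List, Tuple

# Hypothetical minimal record: (chrom, start, end, read_name, strand).
Read = Tuple[str, int, int, str, str]


def parse_bed_line(line: str) -> Read:
    """Parse one tab-separated BED line.

    Assumes the six-column layout chrom, start, end, name, score, strand.
    """
    fields = line.rstrip("\n").split("\t")
    return fields[0], int(fields[1]), int(fields[2]), fields[3], fields[5]


def keep_unique(reads: Iterable[Read]) -> List[Read]:
    """Keep reads whose name occurs exactly once, i.e. one reported location.

    Assumption: a multiply mapping read is reported as several records
    with the same read name within a single replicate.
    """
    reads = list(reads)
    hits = Counter(read[3] for read in reads)
    return [read for read in reads if hits[read[3]] == 1]


def pool_replicates(replicates: Iterable[Iterable[Read]]) -> List[Read]:
    """Filter each replicate separately, then merge and sort by position."""
    pooled: List[Read] = []
    for replicate in replicates:
        # Filter before pooling: read names are only unique within a replicate.
        pooled.extend(keep_unique(replicate))
    pooled.sort(key=lambda read: (read[0], read[1], read[2]))
    return pooled


# Toy example: read "r2" maps to two locations in replicate 1 and is dropped.
rep1 = [
    ("chr1", 100, 136, "r1", "+"),
    ("chr1", 500, 536, "r2", "-"),
    ("chr2", 40, 76, "r2", "-"),
]
rep2 = [("chr1", 50, 86, "r1", "+")]
pooled = pool_replicates([rep1, rep2])
```

Filtering is applied per replicate before pooling because read names cannot be assumed unique across replicates. For BAM input, an analogous filter would typically rely on the aligner's own uniqueness indicator (for example, mapping quality or an alignment tag), whose exact form depends on the aligner and is not specified here.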