
Chunk #55 — Online Methods — 1. Data matrix, primary analysis and processing, quality control — 1.2 ChIP-seq and DNase-seq uniform reprocessing for consolidated epigenomes — d. Genome-wide signal coverage tracks

Source
Integrative analysis of 111 reference human epigenomes.

Text

We used the signal processing engine of the MACS v2.0.10 peak caller to generate genome-wide signal coverage tracks. Whole-cell extract was used as the control for signal normalization of the histone ChIP-seq coverage. Each DNase-seq dataset was normalized using simulated background datasets generated by uniformly distributing an equivalent number of reads across the mappable genome. We generated two types of tracks that use different statistics based on a Poisson background model to represent per-base signal scores. Briefly, reads are extended in the 5’ to 3’ direction by the estimated fragment length. At each base, the observed count of extended ChIP-seq/DNase-seq reads overlapping the base is compared to the corresponding dynamic expected background count (λlocal) estimated from the control dataset. λlocal is defined as max(λBG, λ1K, λ5K, λ10K), where λBG is the expected count per base assuming a uniform distribution of control reads across all mappable bases in the genome, and λ1K, λ5K and λ10K are the expected counts estimated from the 1 kb, 5 kb and 10 kb windows centered at the base. λlocal is adjusted for the ratio of the sequencing depth of the ChIP-seq/DNase-seq dataset relative to the control dataset. The two types of signal score statistics computed per base are as follows.
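As an illustration of the dynamic background described above, the following minimal Python sketch computes λlocal per base from a control coverage vector. It is not the MACS implementation: the function name `local_lambda`, its parameters, the treatment of every position as mappable, the truncation of windows at the ends of the sequence, and the application of the depth adjustment as a single multiplicative factor (treatment depth divided by control depth) are all assumptions made for the example.

```python
from itertools import accumulate


def local_lambda(control_cov, windows=(1000, 5000, 10000), depth_ratio=1.0):
    """Per-base expected background counts (lambda_local), illustrative only.

    control_cov : per-base counts of extended control reads
    windows     : window widths in bases, centered on each base
    depth_ratio : assumed scaling, treatment depth / control depth
    """
    n = len(control_cov)
    if n == 0:
        return []

    # Prefix sums let each window sum be read off in constant time.
    prefix = [0] + list(accumulate(control_cov))

    # Genome-wide rate; assumes every position in control_cov is mappable.
    lam_bg = prefix[n] / n

    result = []
    for i in range(n):
        best = lam_bg
        for w in windows:
            half = w // 2
            lo = max(0, i - half)          # window start, clipped at 0
            hi = min(n, i - half + w)      # window end (exclusive), clipped at n
            # Mean control coverage over the (possibly truncated) window.
            best = max(best, (prefix[hi] - prefix[lo]) / (hi - lo))
        # Scale the control-derived expectation to the treatment depth.
        result.append(best * depth_ratio)
    return result


if __name__ == "__main__":
    # Toy control track with one local spike; small windows for readability.
    toy = [1, 1, 1, 1, 9, 1, 1, 1, 1, 1]
    print(local_lambda(toy, windows=(2, 4), depth_ratio=2.0))
```

Taking the maximum over the genome-wide rate and the windowed rates means that a local enrichment in the control raises the expected background at nearby bases, while sparse control regions fall back to the genome-wide rate rather than to an expectation near zero.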