1.3 Methylation data cross-assay equalization and uniform processing for consolidated epigenomes
WGBS and RRBS reads were aligned with PASH (ref. 38). Total coverage and fractional methylation at each individual CpG were derived from the numbers of converted and unconverted reads. The data were uniformly post-processed and formatted into two matrices per chromosome. One matrix contained the read coverage for each base (C and G) of every CpG (rows) in each reference epigenome (columns). The other matrix, laid out the same way, contained fractional methylation values ranging from 0 to 1. Locations with coverage <=3 were treated as missing data. For MeDIP/MRE methylation data we used the output of the mCRF tool (ref. 31), which uses an internal BWA mapping and reports fractional methylation in the range from 0 to 1. Where available, the mCRF results for all reference epigenomes were combined into a single matrix per chromosome.
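As an illustration of the coverage and fractional-methylation matrices described above, the following is a minimal sketch and not the pipeline's actual code. It assumes that fractional methylation is the share of unconverted reads among all reads at a position (unconverted cytosines being read as methylated in bisulfite data). The function name `methylation_matrices`, the array layout (rows as CpG positions, columns as reference epigenomes) and the use of NaN to encode missing values are hypothetical choices made for the example; per-strand (C and G) handling is left out.

```python
import numpy as np

MIN_COVERAGE = 3  # positions with coverage <= 3 are treated as missing


def methylation_matrices(converted, unconverted, min_coverage=MIN_COVERAGE):
    """Return (coverage, fractional_methylation) for one chromosome.

    Both inputs are read-count arrays of shape (positions, epigenomes).
    Assumption: unconverted reads are counted as methylated.
    """
    converted = np.asarray(converted, dtype=float)
    unconverted = np.asarray(unconverted, dtype=float)

    coverage = converted + unconverted
    missing = coverage <= min_coverage

    # Substitute a dummy denominator at masked positions to avoid dividing by 0.
    safe_coverage = np.where(missing, 1.0, coverage)
    fraction = np.where(missing, np.nan, unconverted / safe_coverage)
    return coverage, fraction


# Toy input: 3 positions (rows) x 2 reference epigenomes (columns).
converted = np.array([[8, 1], [0, 2], [5, 0]])
unconverted = np.array([[2, 1], [10, 2], [5, 0]])
coverage, fraction = methylation_matrices(converted, unconverted)
```

In this toy input, the two entries with coverage 2 and 0 fall at or below the threshold and come out as NaN in `fraction`, while `coverage` keeps the raw totals so that downstream steps can still distinguish low coverage from no coverage.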