
Chunk #52 — Rare variants, individual genomes and somatic variants

Source
An integrated encyclopedia of DNA elements in the human genome.

Text

To further study the potential effects of NA12878 genome variants on TF-binding regions, we performed peak calling using a constructed personal diploid genome sequence for NA12878 (ref. 73). We aligned ChIP-seq reads from GM12878 separately against the maternal and paternal haplotypes. As expected, a greater fraction of reads aligned to the personal haplotypes than to the reference genome (see Supplementary Information, Supplementary Figure K1). On average, approximately 1% of TF-binding sites in GM12878 are detected in a haplotype-specific fashion. For instance, Figure 9B shows a CTCF-binding site that is not detected using the reference sequence and is present only on the paternal haplotype because of a 1-bp deletion (see also Supplementary Figure K2). As the cost of DNA sequencing decreases further, optimized analysis of ENCODE-type data should, when possible, use the genome sequence of the individual or cell being analyzed.
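
To make the idea of a haplotype-specific binding site concrete, the following is a minimal Python sketch, not the pipeline used in the paper. It assumes that peaks have already been called separately on the maternal and paternal haplotypes and that their coordinates have been mapped into one shared coordinate system (an assumption; personal haplotypes differ in coordinates wherever indels occur). The names `Peak`, `classify_peaks` and `haplotype_specific_fraction` are hypothetical, and simple interval overlap stands in for whatever statistical criterion a real analysis would apply.

```python
from typing import List, NamedTuple, Tuple


class Peak(NamedTuple):
    """A called binding site in a shared coordinate system (assumed)."""
    chrom: str
    start: int  # 0-based, inclusive
    end: int    # 0-based, exclusive


def overlaps(a: Peak, b: Peak) -> bool:
    """True if the two half-open intervals share at least one base."""
    return a.chrom == b.chrom and a.start < b.end and b.start < a.end


def classify_peaks(maternal: List[Peak],
                   paternal: List[Peak]) -> List[Tuple[Peak, str]]:
    """Label each peak 'both', 'maternal_only' or 'paternal_only'.

    A site found on both haplotypes is reported once, using the
    maternal peak as its representative.
    """
    labels = []
    for m in maternal:
        shared = any(overlaps(m, p) for p in paternal)
        labels.append((m, "both" if shared else "maternal_only"))
    for p in paternal:
        if not any(overlaps(p, m) for m in maternal):
            labels.append((p, "paternal_only"))
    return labels


def haplotype_specific_fraction(labels: List[Tuple[Peak, str]]) -> float:
    """Fraction of sites seen on only one haplotype."""
    if not labels:
        return 0.0
    specific = sum(1 for _, label in labels if label != "both")
    return specific / len(labels)


# Toy input for illustration only; these are not data from the paper.
maternal = [Peak("chr1", 100, 200), Peak("chr1", 500, 600)]
paternal = [Peak("chr1", 150, 250), Peak("chr2", 10, 50)]
labels = classify_peaks(maternal, paternal)
fraction = haplotype_specific_fraction(labels)
```

The quadratic overlap scan keeps the sketch short; for genome-scale peak sets, sorted intervals or an interval tree would be the usual choice. Presence or absence of a peak is also a coarse criterion, and a fuller analysis would typically compare read counts at heterozygous positions.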