Chunk #0 — Results — A compendium of in silico gene regulatory annotations.

Source: Improving the trans-ancestry portability of polygenic risk scores by prioritizing variants in predicted cell-type-specific regulatory elements.
Embedded: yes

Text

To capture the genetic variation underlying diverse polygenic diseases and quantitative traits, we constructed a compendium of 707 cell-type-specific regulatory annotation tracks. We applied the IMPACT framework (ref. 31) to 707 unique transcription factor (TF)-cell-type pairs obtained from a total of 3,181 TF ChIP–seq datasets from NCBI, representing 245 cell types and 142 TFs with known sequence motifs (Methods, Supplementary Table 1 and Extended Data Fig. 1; ref. 32). We provide publicly available open-source software corresponding to our analyses. IMPACT learns an epigenetic signature of active TF binding, as evidenced by ChIP–seq, by using logistic regression to distinguish bound from unbound TF sequence motifs. We derive this signature from 5,345 epigenetic and sequence features, which were predominantly generated by ENCODE (ref. 33) and Roadmap (ref. 34) (Methods, Supplementary Table 2 and Extended Data Fig. 1) and which represent the biological diversity of the 707 candidate models (Fig. 2a). IMPACT then probabilistically annotates each nucleotide genome-wide on a scale from 0 to 1, without using the TF motif itself, to indicate regulatory regions similar to those that the TF binds. We extensively tested the quality and cell-type specificity of these 707 IMPACT annotations (Supplementary Note).
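The two-step logic described above (fit a logistic regression that separates bound from unbound motif sites, then score arbitrary positions on a 0-to-1 scale from their features alone) can be illustrated with a minimal sketch. This is not the published IMPACT implementation: the data are synthetic, the feature count is reduced from 5,345 to 20, and the function names (`fit_logistic`, `annotate`), the small L2 penalty and the gradient-descent settings are all assumptions made for illustration.

```python
import numpy as np

def sigmoid(z):
    """Map real-valued scores to probabilities in (0, 1)."""
    return 1.0 / (1.0 + np.exp(-z))

def fit_logistic(X, y, lr=0.1, n_iter=500, l2=1e-3):
    """Fit logistic regression by batch gradient descent.

    X: (n_sites, n_features) feature matrix at motif sites.
    y: 1 for ChIP-seq-bound motif sites, 0 for unbound ones.
    The L2 penalty is an illustrative assumption, not a detail from the text.
    """
    n, d = X.shape
    w = np.zeros(d)
    b = 0.0
    for _ in range(n_iter):
        p = sigmoid(X @ w + b)
        w -= lr * (X.T @ (p - y) / n + l2 * w)
        b -= lr * np.mean(p - y)
    return w, b

def annotate(X, w, b):
    """Score positions from their features only (no motif information)."""
    return sigmoid(X @ w + b)

# Synthetic stand-in data: 20 features instead of 5,345 (assumption).
rng = np.random.default_rng(0)
n_features = 20
shift = np.zeros(n_features)
shift[:5] = 1.0  # bound sites are enriched for the first five features

X_bound = rng.normal(size=(200, n_features)) + shift
X_unbound = rng.normal(size=(200, n_features))
X_train = np.vstack([X_bound, X_unbound])
y_train = np.concatenate([np.ones(200), np.zeros(200)])

w, b = fit_logistic(X_train, y_train)

# "Genome-wide" annotation: any position with features gets a score in [0, 1].
X_genome = rng.normal(size=(1000, n_features))
scores = annotate(X_genome, w, b)
```

Because the fitted weights act only on the epigenetic and sequence features, the same model can score positions that contain no motif at all, which is what allows a per-nucleotide annotation track to be produced.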