Chunk #97 — Online Methods — 10. Clustering of DNaseI-accessible regulatory regions to identify modules of coordinated activity

Source: Integrative analysis of 111 reference human epigenomes.
Embedded: yes

Text

We selected the number of clusters k so that the expected number of regions per cluster was approximately 1,000 for promoter and dyadic regions and approximately 10,000 for enhancer regions, given the much larger number of enhancer regions (81k promoter, 129k dyadic, and 2.3M enhancer regions). For enhancers, this yields k=233 (~10k elements per cluster); the algorithm converged on 226 non-empty clusters, which are used for subsequent analyses.
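
The choice of k described above amounts to dividing the number of regions by the target cluster size. The following is a minimal sketch of that arithmetic, not the authors' code: the function name `clusters_for_target_size` and the dictionaries are hypothetical, and the region counts are the rounded figures quoted in the text. With the rounded enhancer count of 2.3M the sketch gives k=230, whereas the text reports k=233; the difference presumably reflects the exact, unrounded enhancer count, which is not given here.

```python
def clusters_for_target_size(n_regions: int, target_size: int) -> int:
    """Return the number of clusters k that gives about target_size regions per cluster."""
    if n_regions <= 0 or target_size <= 0:
        raise ValueError("n_regions and target_size must be positive")
    # Round to the nearest integer and never return fewer than one cluster.
    return max(1, round(n_regions / target_size))


# Rounded region counts quoted in the text (the exact counts are not given there).
region_counts = {"promoter": 81_000, "dyadic": 129_000, "enhancer": 2_300_000}

# Target expected cluster sizes stated in the text.
target_sizes = {"promoter": 1_000, "dyadic": 1_000, "enhancer": 10_000}

k_by_type = {
    name: clusters_for_target_size(region_counts[name], target_sizes[name])
    for name in region_counts
}

for name, k in k_by_type.items():
    print(f"{name}: k = {k}")
```

Because a clustering run can leave some clusters empty, the number of clusters actually used may be smaller than the requested k, as with the 226 non-empty enhancer clusters reported above.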