Chunk #6 — Results — Overview of methods

Source: Partitioning heritability by functional annotation using genome-wide association summary statistics.
Embedded: yes

Text

To apply stratified LD score regression (or REML), we must first specify which categories to include in the model. We created a “full baseline model” from 24 publicly available main annotations that are not specific to any cell type (Supplementary Table 1; see URLs and Online Methods). Below, we show that including many categories in the model leads to more accurate estimates of enrichment. The 24 main annotations include: coding, UTR, promoter, and intron regions [14, 17]; the histone marks H3K4me1, H3K4me3, and H3K9ac [3–5], and two versions of H3K27ac [18, 19]; open chromatin, as reflected by DNase I hypersensitivity site (DHS) regions [5, 14]; combined chromHMM/Segway predictions [20], which use many ENCODE annotations to produce a single partition of the genome into seven underlying “chromatin states”; regions that are conserved in mammals [21, 22]; super-enhancers, which are large clusters of highly active enhancers [19]; and enhancers with balanced bidirectional capped transcripts identified using cap analysis of gene expression in the FANTOM5 panel of samples, which we call FANTOM5 enhancers [23]. For the histone marks and other annotations that differ among cell