We also compared GPA with the LMM-based approach [21], [40] for this dataset. Specifically, we considered the following genome-partitioning LMM:

$$ y = X\beta + W_1 u_1 + W_2 u_2 + e, \tag{1} $$

where $X$ are covariates (the first five principal components from genotype data), and $W_1$ and $W_2$ are sets of SNPs overlapping DHSs in each cell line and the remaining SNPs, respectively. We denote the numbers of SNPs in $W_1$ and $W_2$ as $m_1$ and $m_2$, respectively. The median number of SNPs that overlap DHSs in each cell line is about 60K, and 90% of cell lines have the number of DHSs ranging between 40K and 80K. To take such variation in DHS number among cell lines into account, we define a scaled version of the proportion of phenotype variance explained by SNPs overlapping DHSs in each cell line as

$$ \tilde{h}_1^2 = c_1 \, h_1^2, \tag{2} $$

where $h_1^2$ is the proportion of the explained variance and $c_1$ is the scaling factor. The right panel of Figure 8 shows that the ($-\log_{10}$)-transformed p-value of the GPA annotation enrichment test is linearly related to $\tilde{h}_1^2$. This indicates that the GPA model captures annotation enrichment almost as accurately as the LMM even without the original genotype data, implying that it is more broadly applicable than methods that require individual-level genotype and phenotype data.
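To make the structure of the genome-partitioning LMM above concrete, the following is a minimal simulation sketch, not the analysis pipeline used in this study. It assumes the usual random-effects formulation, which the text does not spell out: each SNP set contributes a variance component through its genetic relationship matrix, so the marginal covariance of the phenotype is a weighted sum of the two relationship matrices plus residual noise. All function names, sample sizes, and variance values are hypothetical, covariates are omitted for brevity, and the scaling factor of Equation (2) is deliberately not reproduced because its exact form is not given here.

```python
import numpy as np

def grm(W):
    """Genetic relationship matrix K = Z Z^T / m from an n x m genotype matrix.

    Columns are standardized to mean 0 and variance 1; monomorphic SNPs
    (zero variance) are dropped before standardization.
    """
    W = np.asarray(W, dtype=float)
    sd = W.std(axis=0)
    keep = sd > 0
    Z = (W[:, keep] - W[:, keep].mean(axis=0)) / sd[keep]
    return Z @ Z.T / Z.shape[1]

def marginal_cov(K_dhs, K_rest, s_dhs, s_rest, s_e):
    """Assumed phenotype covariance: s_dhs*K_dhs + s_rest*K_rest + s_e*I."""
    n = K_dhs.shape[0]
    return s_dhs * K_dhs + s_rest * K_rest + s_e * np.eye(n)

def variance_proportion(s_dhs, s_rest, s_e):
    """Unscaled proportion of phenotype variance explained by the DHS set."""
    return s_dhs / (s_dhs + s_rest + s_e)

# Toy data (hypothetical sizes): 200 individuals, 60 SNPs overlapping DHSs
# and 940 remaining SNPs, all with minor allele frequency 0.3.
rng = np.random.default_rng(0)
n, m_dhs, m_rest = 200, 60, 940
W_dhs = rng.binomial(2, 0.3, size=(n, m_dhs))
W_rest = rng.binomial(2, 0.3, size=(n, m_rest))

K_dhs, K_rest = grm(W_dhs), grm(W_rest)

# Hypothetical variance components for the two SNP sets and the residual.
s_dhs, s_rest, s_e = 0.2, 0.3, 0.5
V = marginal_cov(K_dhs, K_rest, s_dhs, s_rest, s_e)

# One simulated phenotype vector (zero mean because covariates are omitted).
y = rng.multivariate_normal(np.zeros(n), V)
h_dhs = variance_proportion(s_dhs, s_rest, s_e)
```

In a real analysis the variance components would be estimated from data (for example by REML) rather than fixed in advance, and the resulting proportion would then be rescaled per cell line before being compared with the GPA enrichment p-values.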