When target scores derived from associations in the COGA data set are used to predict case/control status in the matched SAGE population (i.e. EA or AA), the R² estimates for both EAs and AAs are modest but statistically significant (Fig. 1). Maximum values are observed at association P-value thresholds of less than 0.05 (n = 6,790 risk alleles) for the EA target sample and less than 0.30 (n = 76,218 risk alleles) for the AA target sample, accounting for 0.73% (P = 1.64 × 10⁻³) and 2.14% (P = 2.08 × 10⁻⁴) of the variation in AD status, respectively (Table S1), although both sets of R² values begin to plateau at around the 0.05 or 0.10 thresholds. Given heritability estimates of 50–80% for AD liability (Heath et al. 1997; Knopik et al. 2004), these results fall well short of the total additive genetic variation believed to underlie the illness. This discrepancy can be attributed in part to statistical noise arising from the inclusion of non-associated markers, and in part to the large number of small individual estimates of effect on AD, whose standard errors, though individually small, reduce the accuracy of the aggregate scores in predicting disease outcome.
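The thresholded scoring described above can be illustrated with a minimal sketch. It is not the pipeline used in this study: the function names, the log-odds weighting of risk-allele counts, and the use of a squared Pearson correlation as a stand-in for the reported R² are all assumptions made for illustration, and the toy arrays are invented.

```python
import numpy as np


def polygenic_score(genotypes, log_odds, p_values, threshold):
    """Sum weighted risk-allele counts over markers with discovery P < threshold.

    genotypes: (n_individuals, n_markers) risk-allele counts (0, 1 or 2)
    log_odds:  (n_markers,) discovery-sample effect estimates (assumed weights)
    p_values:  (n_markers,) discovery-sample association P-values
    """
    keep = p_values < threshold
    scores = genotypes[:, keep] @ log_odds[keep]
    return scores, int(keep.sum())


def variance_explained(scores, status):
    """Squared Pearson correlation between the score and 0/1 case status.

    This is a simple proxy; a pseudo-R^2 from logistic regression could be
    substituted without changing the rest of the sketch.
    """
    if np.std(scores) == 0 or np.std(status) == 0:
        return 0.0  # no variation, so nothing can be explained
    r = np.corrcoef(scores, status)[0, 1]
    return float(r ** 2)


# Toy data, invented for illustration only (4 individuals, 3 markers).
genotypes = np.array([[2, 0, 1],
                      [1, 1, 0],
                      [0, 2, 2],
                      [1, 0, 1]], dtype=float)
log_odds = np.array([0.5, -0.2, 0.1])
p_values = np.array([0.01, 0.20, 0.60])
status = np.array([1, 1, 0, 0])

# Scan a set of P-value thresholds, as in a threshold-by-threshold comparison.
for threshold in (0.05, 0.10, 0.30, 1.00):
    scores, n_markers = polygenic_score(genotypes, log_odds, p_values, threshold)
    r2 = variance_explained(scores, status)
    print(f"P < {threshold:.2f}: {n_markers} markers, R^2 = {r2:.3f}")
```

Relaxing the threshold admits more markers into the score, which adds true signal and noise from non-associated markers at the same time; this trade-off is one way to read a plateau in R² across thresholds.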