Chunk #73 — Methods — Simulations

Source: Power and predictive accuracy of polygenic risk scores.
Embedded: yes

Text

Then, a case/control design was simulated with the disease prevalence set to 0.001. The same total sample sizes were used, now split equally between cases and controls. A computationally efficient approach to this simulation is described in Text S2, and the results are given in Table S3. Again the simulations agree with the analytic values, but when the number of markers with effects is low, there is a downward bias in the parameter estimates and the confidence intervals of the parameter estimates are anti-conservative. Again the logistic regression results agree well with those for linear regression. Taking Tables S1, S2, and S3 together, the analytic methods are accurate for the strongest effects likely to be seen in current studies, but when the number of SNPs with effects is about 1000, there is downward bias in the effect estimates and under-coverage of the confidence intervals, the degree of which appears to vary with the strength of the association.
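
To illustrate why a rare-disease case/control simulation benefits from an efficient scheme: at a prevalence of 0.001, sampling individuals from the population and keeping the cases needs about 1,000 draws per case on average. The sketch below avoids this by drawing liabilities directly conditional on case or control status, using inverse-CDF sampling from a truncated standard normal. It assumes a liability-threshold model with unit-variance liability and is not necessarily the approach of Text S2; the function names, the seed, and the sample size in the usage lines are hypothetical.

```python
import random
from statistics import NormalDist

STD_NORMAL = NormalDist()  # assumed: liability is standard normal in the population


def liability_threshold(prevalence):
    """Threshold T such that P(liability > T) equals the prevalence."""
    return STD_NORMAL.inv_cdf(1.0 - prevalence)


def sample_liability(is_case, prevalence, rng):
    """Draw one liability conditional on status via inverse-CDF sampling."""
    cut = 1.0 - prevalence  # value of the normal CDF at the threshold
    # Cases occupy the upper tail (cut, 1); controls occupy (0, cut).
    u = rng.uniform(cut, 1.0) if is_case else rng.uniform(0.0, cut)
    # Keep u strictly inside (0, 1), where inv_cdf is defined.
    u = min(max(u, 1e-12), 1.0 - 1e-12)
    return STD_NORMAL.inv_cdf(u)


def simulate_case_control(n_total, prevalence, seed=0):
    """Return (status, liability) pairs with equal numbers of cases and controls."""
    rng = random.Random(seed)
    n_cases = n_total // 2
    statuses = [1] * n_cases + [0] * (n_total - n_cases)
    return [(s, sample_liability(bool(s), prevalence, rng)) for s in statuses]


# Hypothetical usage: 1,000 subjects at prevalence 0.001.
sample = simulate_case_control(1000, 0.001, seed=1)
threshold = liability_threshold(0.001)  # roughly 3.09 standard deviations
```

Every draw is used, so the cost scales with the study size rather than with the size of the population that would have to be screened to find the cases. Genotypes or polygenic scores would still need to be generated conditional on each liability, which this sketch leaves out.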