Chunk #16 — DISCUSSION

Source: Conditional and joint multiple-SNP analysis of GWAS summary statistics identifies additional variants influencing complex traits.
Embedded: yes

Text

As with any fixed-effect model selection strategy, such as stepwise multiple linear regression, there is a risk of over-fitting. This can be a particular problem for the analysis of GWAS SNP data because the number of SNPs is typically much larger than the experimental sample size. The effects of selected SNPs tend to be overestimated (sometimes called the winner's curse) and, if the threshold for inclusion is not stringent enough, false positives could be included in the model. In both cases, the estimated residual variance will be too low. This can, in theory, become a runaway process: the more SNPs that are selected into the model, the lower the apparent residual variance, and the greater the number of remaining SNPs that will appear significant and be added to the model. In the general population, the expected value of the LD correlation between SNPs on different chromosomes or more than d Mb apart is zero, even though, in a particular sample, the observed value is nonzero because of the finite sample size. In our method, we set the LD