Chunk #14 — Statistical approach — SNP selection

Source: Pathway based analysis of genotypes in relation to alcohol dependence.
Embedded: yes

Text

We expect that most typed variants will not be even weakly associated with AD risk, and that including all variants would weaken statistical testing. We therefore selected subsets of SNPs across gene sets in two ways. First, we followed standard practice for building a multivariate linear model based on the increase in log-likelihood, setting a relatively modest threshold of a log-likelihood increase of 2.5 for inclusion in the model. Second, we implemented a faster procedure that selects the subset of variants potentially individually associated with AD, using a liberal inclusion threshold of p < 0.20. Both inclusion criteria are fairly weak, and statistical significance cannot be assessed by conventional means. In both procedures, variants in LD (r² > 0.25) with variants already in the model were prevented from entering it.
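The two selection procedures can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: it assumes a genotype matrix G (individuals by SNPs, coded as allele counts), a numeric outcome y, a Gaussian linear model fitted by least squares (a logistic model would be the natural alternative for a binary AD outcome), greedy entry of the best remaining SNP at each step, and a 1-df likelihood-ratio test for the single-SNP p-values. The function names and the choice to process screened SNPs in order of increasing p-value are hypothetical.

```python
import math
import numpy as np

LOGLIK_GAIN = 2.5   # inclusion threshold from the text (procedure 1)
P_THRESHOLD = 0.20  # liberal screening threshold from the text (procedure 2)
LD_R2_MAX = 0.25    # variants with r^2 above this are treated as in LD


def gaussian_loglik(y, X):
    """Maximised Gaussian log-likelihood of a least-squares fit of y on X."""
    n = len(y)
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    rss = float(np.sum((y - X @ beta) ** 2))
    return -0.5 * n * (math.log(2 * math.pi * rss / n) + 1)


def in_ld(G, j, selected):
    """True if SNP j has r^2 > LD_R2_MAX with any already-selected SNP."""
    return any(
        np.corrcoef(G[:, j], G[:, k])[0, 1] ** 2 > LD_R2_MAX for k in selected
    )


def forward_select(G, y):
    """Procedure 1 (assumed greedy): add the SNP with the largest
    log-likelihood increase until no SNP gains at least LOGLIK_GAIN."""
    n, m = G.shape
    selected = []
    X = np.ones((n, 1))  # intercept-only starting model
    current = gaussian_loglik(y, X)
    while True:
        best_gain, best_j = 0.0, None
        for j in range(m):
            if j in selected or in_ld(G, j, selected):
                continue  # skip SNPs already in the model or in LD with it
            gain = gaussian_loglik(y, np.column_stack([X, G[:, j]])) - current
            if gain > best_gain:
                best_gain, best_j = gain, j
        if best_j is None or best_gain < LOGLIK_GAIN:
            return selected
        selected.append(best_j)
        X = np.column_stack([X, G[:, best_j]])
        current += best_gain


def screen_select(G, y):
    """Procedure 2: keep SNPs with single-SNP p < P_THRESHOLD, skipping
    any SNP in LD with one already kept (ordering by p is an assumption)."""
    n, m = G.shape
    null = gaussian_loglik(y, np.ones((n, 1)))
    pvals = []
    for j in range(m):
        X = np.column_stack([np.ones(n), G[:, j]])
        gain = gaussian_loglik(y, X) - null
        # 1-df likelihood-ratio test: P(chi2_1 > 2 * gain) = erfc(sqrt(gain))
        pvals.append(math.erfc(math.sqrt(max(gain, 0.0))))
    selected = []
    for j in np.argsort(pvals):
        if pvals[j] < P_THRESHOLD and not in_ld(G, j, selected):
            selected.append(int(j))
    return selected
```

The screening procedure fits only one single-SNP model per variant, whereas forward selection refits every remaining candidate at each step, which is why the second procedure is faster.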