Chunk #22 — Statistical methods for the analysis of rare variants

Source
Exome sequencing and the genetic basis of complex traits.
Embedded
yes

Text

An important consideration for exome sequencing studies is selecting a significance threshold that accounts for multiple testing. A simple approach is to adopt a Bonferroni correction for 20,000 independent tests (one test per gene), which, for an experiment-wide significance level of 0.05, gives a p-value threshold of 2.5 × 10⁻⁶ per gene. However, such a threshold may be overly conservative because it assumes that each tested gene has sufficient variation for the test statistic to attain its asymptotic properties. For example, if only 2 individuals carry non-synonymous variants in a given gene, the difference between cases and controls never exceeds 2 total observations, and so the most significant p-value that can be achieved is around 0.25, assuming that these 2 variants are independent. Therefore, unless the study is large, association p-values will generally be less significant than expected under the null hypothesis. Figure 2a demonstrates this effect on the 438 whole exomes. The PLINK/SEQ suite computes from the data the so-called i-stat, which is an estimate of the minimal achievable p-value for a gene. The i-stat can be used by
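
The arithmetic in the passage above can be sketched in a few lines of Python. This is only an illustration of the reasoning, not the i-stat as PLINK/SEQ computes it. The function names are hypothetical, the one-sided "all carriers are cases" scenario is a simplifying assumption, and the 219/219 case-control split of the 438 exomes is assumed purely for the example (the text does not state the split).

```python
from math import comb


def bonferroni_threshold(alpha: float, n_tests: int) -> float:
    """Per-test p-value threshold under a Bonferroni correction."""
    return alpha / n_tests


def min_p_independent(n_carriers: int, case_fraction: float = 0.5) -> float:
    """Smallest one-sided p-value if every carrier is a case, treating
    carriers as independent draws (assumption: each carrier is a case
    with probability `case_fraction` under the null)."""
    return case_fraction ** n_carriers


def min_p_hypergeometric(n_carriers: int, n_cases: int, n_controls: int) -> float:
    """Same extreme outcome, but sampling carriers without replacement
    from a finite sample (one-sided Fisher-style tail probability)."""
    return comb(n_cases, n_carriers) / comb(n_cases + n_controls, n_carriers)


def carriers_needed(threshold: float, case_fraction: float = 0.5) -> int:
    """Fewest carriers for which the independent-draw minimum p-value
    can fall at or below `threshold`."""
    k = 1
    while min_p_independent(k, case_fraction) > threshold:
        k += 1
    return k


if __name__ == "__main__":
    # 20,000 genes and alpha = 0.05 are the values quoted in the text.
    threshold = bonferroni_threshold(0.05, 20_000)
    print(f"Bonferroni threshold per gene: {threshold:.2e}")

    # Two carriers, balanced design (the example from the text).
    print(f"Min p, 2 independent carriers: {min_p_independent(2):.4f}")

    # Hypothetical 219 cases / 219 controls; the split is an assumption.
    print(f"Min p, 2 carriers in 219/219:  {min_p_hypergeometric(2, 219, 219):.4f}")

    print(f"Carriers needed to reach threshold: {carriers_needed(threshold)}")
```

Under the independence approximation with a balanced design, two carriers give a minimum p-value of 0.25, matching the figure quoted above, and roughly 19 carriers would be needed before a gene could in principle reach the 2.5 × 10⁻⁶ threshold. Genes with fewer carriers cannot reach it regardless of how the carriers are distributed, which is why a uniform Bonferroni threshold can be overly conservative.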