
Chunk #56 — Methods — Defining the study-wide significant cut-offs for collapsing analyses

Source
Rare variant contribution to human disease in 281,104 UK Biobank exomes.

Text

We used a synonymous collapsing analysis model as an empirical negative control. Synonymous variants are generally not expected to contribute significantly to disease risk, so this model provides an empirical basis for study-wide P value thresholding. Across the 17,361 studied binary phenotypes and 18,762 studied genes, we observed a distribution of 325,727,082 Fisher’s exact test statistics corresponding to the synonymous collapsing model. At the tail of this distribution for binary traits, we identified two genuine relationships: IGLL5 synonymous variants enriched among ‘Union#C911#C91.1 chronic lymphocytic leukaemia’ (P = 2.5 × 10⁻¹¹) and its parent node ‘Union#C91#C91 lymphoid leukaemia’ (P = 1.2 × 10⁻¹⁰). Beyond these two associations, we observed a tail of P values beginning from P = 2.2 × 10⁻⁸ (Supplementary Table 20). Similarly, for the 1,419 quantitative phenotypes, we observed a distribution of 26,623,278 Fisher’s exact test statistics corresponding to the synonymous collapsing model. At the tail of this distribution, we identified two genuine relationships: MACROD1 synonymous variants correlating with decreased levels of ‘Urate’ (P = 2.8 × 10⁻³⁰; ref. 56) and ALPL synonymous variants correlating with