Chunk #70 — STAR★METHODS — METHOD DETAILS — Burden Analyses — Poisson Regression

Source: De Novo Coding Variants Are Strongly Associated with Tourette Disorder.
Embedded: yes

Text

shown). We determined that paternal age, sequencing coverage (percent of exome at 2× coverage), sequencing coverage uniformity (fold 80 base penalty), and heterozygous SNP quality provided the best model. We further reasoned that the number of de novo synonymous mutations per individual could control for additional batch effects affecting the rate of de novo variant detection. Indeed, when we included the number of de novo synonymous variants, along with the aforementioned covariates, in a Poisson regression to predict the number of de novo nonsynonymous mutations (we chose nonsynonymous because coding mutations include the synonymous mutations), we observed a stronger model (lower AIC) than when de novo synonymous variants were excluded. We used the size of the callable coding exome as an offset because each base pair represents an opportunity for a de novo variant. Therefore, the final model to estimate the rate ratios, confidence intervals, and p values for association was:

number of de novo variants ∼ phenotype + paternal age + percent of bases ≥2× + fold 80 base penalty + heterozygous SNP quality + number of de novo synonymous variants + offset(log(callable coding bp))
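To illustrate how an offset Poisson regression of this form yields a rate ratio, confidence interval, and p value, the following is a minimal sketch and not the authors' code. The source does not name the software used, so the hand-written Newton-Raphson fit, all function names, and the toy data are assumptions made for illustration. Only phenotype and paternal age are included as covariates to keep the example short; the remaining covariates of the final model would enter as further columns of the design matrix. In practice a standard GLM routine would normally be used instead of a hand-rolled solver.

```python
import math

def solve(a, b):
    """Solve a x = b by Gauss-Jordan elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(n):
            if r != col:
                f = m[r][col] / m[col][col]
                m[r] = [v - f * w for v, w in zip(m[r], m[col])]
    return [m[i][n] / m[i][i] for i in range(n)]

def fitted_means(X, beta, exposure):
    """mu_i = exposure_i * exp(x_i . beta); log(exposure_i) is the offset."""
    return [e * math.exp(sum(b * x for b, x in zip(beta, row)))
            for row, e in zip(X, exposure)]

def information(X, mu):
    """Fisher information X^T diag(mu) X for the log-link Poisson model."""
    p = len(X[0])
    return [[sum(r[j] * r[k] * m for r, m in zip(X, mu)) for k in range(p)]
            for j in range(p)]

def fit_poisson(X, y, exposure, max_iter=50, tol=1e-10):
    """Newton-Raphson maximum likelihood; column 0 of X must be all ones."""
    p = len(X[0])
    # Start at the intercept-only fit: the overall log rate per base pair.
    beta = [math.log(sum(y) / sum(exposure))] + [0.0] * (p - 1)
    for _ in range(max_iter):
        mu = fitted_means(X, beta, exposure)
        score = [sum(r[j] * (yi - mi) for r, yi, mi in zip(X, y, mu))
                 for j in range(p)]
        step = solve(information(X, mu), score)
        beta = [b + s for b, s in zip(beta, step)]
        if max(abs(s) for s in step) < tol:
            break
    return beta

def wald_summary(X, beta, exposure, j):
    """Rate ratio, 95% CI, and two-sided Wald p value for coefficient j."""
    info = information(X, fitted_means(X, beta, exposure))
    unit = [1.0 if k == j else 0.0 for k in range(len(beta))]
    se = math.sqrt(solve(info, unit)[j])  # sqrt of the j-th diagonal of info^-1
    z = beta[j] / se
    return (math.exp(beta[j]),
            (math.exp(beta[j] - 1.96 * se), math.exp(beta[j] + 1.96 * se)),
            math.erfc(abs(z) / math.sqrt(2)))

# Invented toy data (NOT from the paper): 4 cases followed by 4 controls.
phenotype = [1, 1, 1, 1, 0, 0, 0, 0]
paternal_age = [35, 28, 41, 30, 33, 27, 38, 29]
n_de_novo = [2, 1, 3, 1, 1, 0, 1, 1]
callable_bp = [33e6, 31e6, 34e6, 32e6, 33e6, 30e6, 34e6, 32e6]

# Design matrix: intercept, phenotype, paternal age (centered, in decades).
X = [[1.0, ph, (age - 30) / 10] for ph, age in zip(phenotype, paternal_age)]
beta = fit_poisson(X, n_de_novo, callable_bp)
rate_ratio, ci95, p_value = wald_summary(X, beta, callable_bp, j=1)
print(f"rate ratio {rate_ratio:.2f}, "
      f"95% CI {ci95[0]:.2f}-{ci95[1]:.2f}, p = {p_value:.3f}")
```

Because the offset enters the linear predictor with a fixed coefficient of 1, the exponentiated phenotype coefficient is a ratio of per-base-pair rates rather than of raw counts. In the special case of an intercept plus phenotype only, it reduces to (case variants / case callable bp) divided by (control variants / control callable bp).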