Chunk #83 — STAR★METHODS — METHOD DETAILS — Estimation of Number of TS Genes — Maximum Likelihood Estimation

Source
De Novo Coding Variants Are Strongly Associated with Tourette Disorder.

Text

For each possible number of risk genes, from 1 to 2,500, we simulated 192 variants and repeated the simulation 50,000 times. In each permutation, we randomly selected the risk genes, then randomly assigned a percentage of the variants to the risk genes and the remainder to the non-risk genes. We based these percentages on the fraction of damaging variants estimated to carry risk (27.3%; see below). We used the per-gene probabilities of mutation from TADA, which are weighted by gene size and GC content (He et al., 2013; Sanders et al., 2012). We then counted the risk and non-risk genes that harbored multiple variants and recorded whether the number of genes with two variants and the number of genes with three or more variants in the simulated data matched the numbers observed in our study (4 and 1, respectively). From these matches, we calculated the frequency of concordance between the permuted data and the observed data for each number of risk genes. Finally, we determined the maximum likelihood estimate (MLE) by plotting the smoothed trend line of frequency versus number of risk genes using local polynomial regression.
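
The permutation procedure described above can be illustrated with a minimal Python sketch. It is not the authors' code and rests on several assumptions: the function names `simulate_once` and `concordance_frequency` are hypothetical, the per-gene mutation probabilities are synthetic placeholders rather than the TADA values, risk genes are drawn uniformly at random, and the number of risk variants is fixed at the rounded product of 27.3% and 192 rather than drawn from a distribution. The local polynomial smoothing step is omitted.

```python
import random
from collections import Counter


def simulate_once(weights, n_risk_genes, n_variants=192,
                  risk_fraction=0.273, rng=random):
    """Run one permutation and return (genes with two variants,
    genes with three or more variants).

    `weights` stands in for per-gene mutation probabilities (assumption:
    synthetic values, not the TADA estimates). Requires
    1 <= n_risk_genes < len(weights).
    """
    genes = list(range(len(weights)))
    # Assumption: risk genes are chosen uniformly at random.
    risk = rng.sample(genes, n_risk_genes)
    risk_set = set(risk)
    non_risk = [g for g in genes if g not in risk_set]

    # Assumption: a fixed share of variants goes to the risk genes.
    n_risk_variants = round(risk_fraction * n_variants)

    hits = Counter()
    # Place variants within each gene set in proportion to the weights.
    hits.update(rng.choices(risk, weights=[weights[g] for g in risk],
                            k=n_risk_variants))
    hits.update(rng.choices(non_risk,
                            weights=[weights[g] for g in non_risk],
                            k=n_variants - n_risk_variants))

    two = sum(1 for c in hits.values() if c == 2)
    three_plus = sum(1 for c in hits.values() if c >= 3)
    return two, three_plus


def concordance_frequency(weights, n_risk_genes, n_iter,
                          observed=(4, 1), seed=0):
    """Fraction of permutations whose recurrence counts equal `observed`."""
    rng = random.Random(seed)
    matches = sum(
        simulate_once(weights, n_risk_genes, rng=rng) == observed
        for _ in range(n_iter)
    )
    return matches / n_iter


if __name__ == "__main__":
    # Hypothetical gene set: 2,000 genes with arbitrary relative weights.
    demo_rng = random.Random(42)
    demo_weights = [demo_rng.uniform(0.5, 2.0) for _ in range(2000)]
    # Coarse grid and few iterations, only to keep the demo fast.
    for k in (50, 200, 400, 800):
        print(k, concordance_frequency(demo_weights, k, n_iter=200))
```

With the real mutation probabilities, the full grid of 1 to 2,500 risk genes, and 50,000 iterations per grid point, the resulting frequencies would then be smoothed with a local polynomial fit, and the maximum of the smoothed curve would presumably be read off as the estimate.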