Chunk #15 — Materials and methods — Statistics

Source: Gene-wide analyses of genome-wide association data sets: evidence for multiple common risk alleles for schizophrenia and bipolar disorder and for overlap in genetic risk.
Embedded: yes

Text

The bipolar disorder and schizophrenia datasets share the same controls, so their test statistics are correlated. We therefore cannot determine whether genes are associated with both disorders at a rate greater than chance using methods that assume independence, such as the hypergeometric distribution. Instead, we used permutations. We pooled the three groups of individuals (bipolar disorder cases, schizophrenia cases, and controls) and randomly reassigned diagnostic category, keeping the number of individuals in each group equal to that in the observed data. We assessed genes by comparing one randomly selected permutation (of 1000) with 999 permutations drawn at random (with replacement) from the pool of remaining permutations. From this we generated lists of the top associated genes, each equal in length to the corresponding list for the observed datasets, and recorded the number of genes common to both lists. This process was repeated 1000 times. We then calculated empirical p-values by counting how many times the simulated number of overlapping genes was greater than or equal to the number of overlapping genes observed in the true datasets.
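The permutation procedure described above can be illustrated with a minimal Python sketch. It is not the authors' implementation: the function names (`permute_labels`, `top_gene_overlap`, `empirical_p_value`), the representation of gene-wide results as a dictionary mapping gene to p-value, and the choice to report the empirical p-value as the count divided by the number of replicates are all assumptions made for illustration. The gene-wide association test itself is not shown.

```python
import random


def permute_labels(labels, rng):
    """Randomly reassign diagnostic labels across the pooled individuals.

    Shuffling the existing label list keeps the number of individuals
    in each group equal to that in the observed data.
    """
    shuffled = list(labels)
    rng.shuffle(shuffled)
    return shuffled


def top_gene_overlap(scores_a, scores_b, n_top_a, n_top_b):
    """Count genes that appear in both top-associated lists.

    scores_a and scores_b are hypothetical dicts mapping gene name to a
    gene-wide p-value (smaller means more strongly associated). The list
    lengths n_top_a and n_top_b are assumed to match the observed lists.
    """
    top_a = set(sorted(scores_a, key=scores_a.get)[:n_top_a])
    top_b = set(sorted(scores_b, key=scores_b.get)[:n_top_b])
    return len(top_a & top_b)


def empirical_p_value(observed_overlap, simulated_overlaps):
    """Proportion of simulated overlaps >= the observed overlap.

    Assumption: the count is divided by the number of replicates; the
    text only states that the exceedances were counted.
    """
    hits = sum(1 for s in simulated_overlaps if s >= observed_overlap)
    return hits / len(simulated_overlaps)
```

In use, each of the 1000 replicates would permute the labels, rerun the gene-wide tests for both disorders on the permuted data, and pass the two resulting score dictionaries to `top_gene_overlap`; the 1000 overlap counts are then compared with the observed overlap via `empirical_p_value`. Because labels are permuted across the pooled sample, the correlation induced by the shared controls is preserved in the null distribution.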