Chunk #12 — Materials and methods — Simulation of Genome-Wide Scores for Different Disease Models

Source: Common biological networks underlie genetic risk for alcoholism in African- and European-American populations.
Embedded: yes

Text

Using the program GCTA, case-control phenotypes for six disease architectures were simulated from the real genotype data of the COGA and SAGE data sets, pruned of SNPs in strong LD as described above. The phenotypes were generated from a simple additive genetic model, $y_j = \sum_i x_{ij} b_i + e_j$, where $x_{ij}$ is the number of reference alleles for the $i$th causal variant of the $j$th individual, $b_i$ is the allelic effect of the $i$th causal variant, and $e_j$ is the residual effect generated from a normal distribution with mean 0 and variance $\mathrm{var}\left(\sum_i x_{ij} b_i\right)\left(1/h^2 - 1\right)$, so that the genetic component accounts for a fraction $h^2$ (the heritability) of the total phenotypic variance. The six selected disease models differ with regard to the number of causal loci (100, 1,000 or 5,000) and the allele frequency profile of those loci (MAF < 0.05 or MAF ≥ 0.05). For each of the population samples, a new AD status was assigned via a disease liability threshold, with the number of cases matching that in the original phenotype data. Causal loci were randomly selected from LD-pruned SNPs excluded from the initial two-stage, genome-wide scoring analysis, which have not been filtered for MAF and thus include
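
The additive liability-threshold model described in this section can be illustrated with a minimal Python sketch. It is not the GCTA implementation. The function name, the standard-normal effect sizes, and the rule of labelling the n_cases individuals with the highest liability as cases are assumptions made for illustration; the genotypes in the usage lines are random toy data, not COGA or SAGE genotypes.

```python
import numpy as np


def simulate_case_control(genotypes, causal_idx, h2, n_cases, rng):
    """Simulate case-control status under an additive liability model.

    genotypes  : (n_individuals, n_snps) array of reference-allele counts (0, 1, 2)
    causal_idx : indices of the SNPs treated as causal
    h2         : target heritability of liability, 0 < h2 <= 1
    n_cases    : number of individuals to label as cases
    """
    x = genotypes[:, causal_idx].astype(float)

    # Assumption: allelic effects b_i are drawn from a standard normal.
    b = rng.standard_normal(len(causal_idx))

    # Genetic value g_j = sum_i x_ij * b_i
    g = x @ b

    # Residual variance chosen so that var(g) / (var(g) + var(e)) = h2
    var_e = g.var() * (1.0 / h2 - 1.0)
    e = rng.normal(0.0, np.sqrt(var_e), size=g.shape[0])

    liability = g + e

    # Assumption: the threshold is placed so that exactly n_cases
    # individuals (those with the highest liability) become cases.
    status = np.zeros(g.shape[0], dtype=int)
    status[np.argsort(liability)[::-1][:n_cases]] = 1
    return status, liability, g


# Toy usage: 2,000 individuals, 1,000 SNPs, 100 causal loci.
rng = np.random.default_rng(0)
maf = rng.uniform(0.05, 0.5, size=1000)
toy_genotypes = rng.binomial(2, maf, size=(2000, 1000))
toy_causal = rng.choice(1000, size=100, replace=False)
status, liability, g = simulate_case_control(
    toy_genotypes, toy_causal, h2=0.5, n_cases=600, rng=rng
)
```

Fixing the number of cases, rather than fixing a population prevalence, mirrors the statement that the simulated case counts match those in the original phenotype data. Restricting the candidate causal SNPs by MAF before sampling toy_causal would give the low-frequency and common-variant architectures.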