Chunk #11 — 2 Methods — 2.3 Statistical analysis — 2.3.2 Estimation of variance explained by top ranking SNPs

Source: Exploring the genetic architecture of alcohol dependence in African-Americans via analysis of a genomewide set of common variants.
Embedded: yes

Text

To further explore the genetic architecture of AD, we estimated the variance explained by the top-ranking SNPs selected at different P-value thresholds. To minimize the effect of the “winner’s curse” [35], we used the following strategy:

Step 1: We randomly partitioned the entire data set into two halves, denoted D(1) and D(2).

Step 2: We used the first half of the data, D(1), to perform the association test and calculate the P-values. To account simultaneously for the confounding effects of population structure, family structure, and cryptic relatedness, we used LMM [11,34], which can correct for all three [20,23]. Specifically, we used the GEMMA program [34] to calculate the P-values based on D(1).

Step 3: We selected the top-ranking SNPs by P-value at nine different thresholds [0.5, 0.4, 0.3, 0.2, 0.1, 0.05, 0.01, 0.005, 0.001] and, for each resulting SNP set, used the samples in D(2) to fit LMM for chip-heritability estimation, as described in the previous section.

Step 4: We repeated Steps 1-3 B times, with B = 50.
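
The repeated split-sample procedure described above can be sketched as follows. This is only an illustrative outline, not the authors' code: the function names, the `compute_pvalues` and `estimate_variance` callables (stand-ins for the GEMMA association run on D(1) and the LMM chip-heritability fit on D(2)), the strict `<` comparison against each threshold, and the fixed random seed are all assumptions introduced here.

```python
import random

# The nine P-value thresholds listed in the text.
THRESHOLDS = [0.5, 0.4, 0.3, 0.2, 0.1, 0.05, 0.01, 0.005, 0.001]


def split_half(sample_ids, rng):
    """Randomly partition the samples into two halves, D(1) and D(2)."""
    shuffled = list(sample_ids)
    rng.shuffle(shuffled)
    mid = len(shuffled) // 2
    return shuffled[:mid], shuffled[mid:]


def select_snps(pvalues, threshold):
    """Keep SNPs whose P-value falls below the threshold.

    Assumption: strict inequality; the text does not say whether ties count.
    """
    return [snp for snp, p in pvalues.items() if p < threshold]


def repeated_split_estimates(sample_ids, compute_pvalues, estimate_variance,
                             thresholds=THRESHOLDS, n_repeats=50, seed=0):
    """Run the split / test / select / estimate cycle n_repeats times.

    compute_pvalues(d1) -> {snp_id: p_value}      (hypothetical; e.g. wraps GEMMA)
    estimate_variance(d2, snps) -> float          (hypothetical; e.g. wraps an LMM fit)

    Returns {threshold: [one estimate per repeat]}.
    """
    rng = random.Random(seed)
    estimates = {t: [] for t in thresholds}
    for _ in range(n_repeats):
        d1, d2 = split_half(sample_ids, rng)
        # P-values come from D(1) only, so selection never sees D(2).
        pvalues = compute_pvalues(d1)
        for t in thresholds:
            selected = select_snps(pvalues, t)
            # Variance explained is estimated on the held-out half D(2).
            estimates[t].append(estimate_variance(d2, selected))
    return estimates
```

Because SNP selection and variance estimation use disjoint samples, the estimate for a given threshold is not inflated by the same sampling noise that made a SNP rank highly in the first place. How the B per-threshold estimates are then summarized (for example, by their mean) is left to the caller, since the text here does not specify it.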