Chunk #13 — Materials and Methods — Subjects, genotyping and assignment of nominal significance of dependent vs control allele frequencies in each sample — 4) Monte Carlo methods for assignment of levels of significance to: a) the extent of clustering in each sample and b) the degree to which clustered nominally-positive SNPs from multiple independent samples identify the same chromosomal regions
We first tested the null hypothesis that chromosomal clustering of these nominally positive SNPs occurred at the level expected by chance in these datasets. For each Monte Carlo trial that tested this null hypothesis, we randomly selected from each dataset a number of “pseudopositive” SNPs equal to the number that achieved nominal significance in the bona fide dataset. To do so, we constructed a list of the autosomal SNPs assayed in each sample and assigned each SNP a number that corresponded to its position on the list. To select the pseudopositive SNPs for each trial of the European-American datasets, we drew 75,413 random numbers for the NIDA (see below) and 49,843 random numbers for the dbGAP datasets. For the African-American datasets, we used 83,330 and 45,325 random numbers, respectively. For each trial, the SNPs at the list positions that corresponded to these random numbers were then assessed for the extent to which their results equaled or exceeded the results obtained for the actual dataset. In 10,000 such trials for each sample, we compared results concerning the extent
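The random-selection step of this Monte Carlo procedure can be illustrated with a minimal Python sketch. It is not the code used in the study. The clustering statistic (`count_clustered`, with its `window` and `min_size` parameters), the function names, and the choice to sample without replacement are all assumptions made for illustration; the text above does not specify them.

```python
import random
from bisect import bisect_left, bisect_right
from collections import defaultdict


def count_clustered(snps, selected, window=25_000, min_size=3):
    """Hypothetical clustering statistic (an assumption, not the study's definition).

    Counts selected SNPs that have at least `min_size` selected SNPs
    (including themselves) within `window` base pairs on the same chromosome.
    `snps` is a list of (chromosome, position) tuples; `selected` holds
    indices into that list.
    """
    by_chrom = defaultdict(list)
    for i in selected:
        chrom, pos = snps[i]
        by_chrom[chrom].append(pos)

    clustered = 0
    for positions in by_chrom.values():
        positions.sort()
        for pos in positions:
            lo = bisect_left(positions, pos - window)
            hi = bisect_right(positions, pos + window)
            if hi - lo >= min_size:  # the count includes the SNP itself
                clustered += 1
    return clustered


def monte_carlo_p(snps, n_positive, observed, n_trials=10_000, seed=0, **kwargs):
    """Empirical p value: fraction of trials whose statistic equals or
    exceeds the value observed in the actual dataset.

    Each trial draws `n_positive` list positions at random (assumed here to
    be without replacement) to serve as pseudopositive SNPs.
    """
    rng = random.Random(seed)
    hits = 0
    for _ in range(n_trials):
        pseudo = rng.sample(range(len(snps)), n_positive)
        if count_clustered(snps, pseudo, **kwargs) >= observed:
            hits += 1
    return hits / n_trials
```

In this sketch, `snps` would hold the list of autosomal SNPs assayed in one sample, `n_positive` would be the number of nominally positive SNPs for that sample (for example, 75,413), and `observed` would be the statistic computed from the actual nominally positive SNPs.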