To optimize the Hidden Markov Model, we started from the reference file hh550.hmm and ran PennCNV with the “-train” option, first on one batch of the thirty samples with the lowest s.d. of the LRR value and then on two batches of random representative samples. We also created a population B-allele frequency definition file specifically adapted to the Perlegen data. This allowed CNV calls to be made in 1,887 (642 cases and 1,245 parents) of the 2,789 available Perlegen 600K samples. Although the global s.d. of the LRR value was below 0.2 for the majority (84%) of the samples, the intensity data were noisy in regions of called CNVs and included subpopulations of SNPs that failed to differentiate the deletion signal, perhaps as a result of PCR saturation during lab processing. Nevertheless, deletion and duplication features were detected, with confirmation from homozygous genotypes and from AAB and ABB genotypes, respectively (Supplementary Figs. 6 and 7). Lastly, Perlegen CNV calls were screened for overlap with the 12 loci associated with ADHD based on the CHOP Illumina data. To ensure that each detected CNV was a true DNA feature, we validated each CNV using qRT-PCR (Supplementary Fig. 5).
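
Two of the steps above, choosing the lowest-noise samples for the first training batch and screening CNV calls for overlap with previously associated loci, can be illustrated with a minimal sketch. It is not the pipeline used in the study: the function names, the `Interval` type, the in-memory data layout, the use of the population standard deviation, and the inclusive-coordinate overlap rule are all assumptions made for illustration.

```python
import statistics
from typing import Dict, List, NamedTuple, Sequence


class Interval(NamedTuple):
    """Hypothetical genomic interval with inclusive coordinates."""
    chrom: str
    start: int
    end: int


def lowest_sd_samples(lrr_by_sample: Dict[str, Sequence[float]],
                      n: int = 30) -> List[str]:
    """Return the n sample IDs with the smallest s.d. of LRR values.

    Assumption: each sample's LRR values are already loaded into memory.
    Ties are broken by sample ID so the result is deterministic.
    """
    sd = {s: statistics.pstdev(values) for s, values in lrr_by_sample.items()}
    return sorted(sd, key=lambda s: (sd[s], s))[:n]


def overlaps(a: Interval, b: Interval) -> bool:
    """True if two intervals on the same chromosome share at least one base."""
    return a.chrom == b.chrom and a.start <= b.end and b.start <= a.end


def screen_calls(calls: Sequence[Interval],
                 loci: Sequence[Interval]) -> List[Interval]:
    """Keep only the CNV calls that overlap at least one locus of interest."""
    return [c for c in calls if any(overlaps(c, locus) for locus in loci)]
```

In this sketch, `lowest_sd_samples(lrr, n=30)` would yield the candidate samples for the first training batch, and `screen_calls(calls, loci)` would be applied to the Perlegen calls with the 12 associated loci as `loci`. A stricter overlap criterion, such as a minimum reciprocal overlap fraction, could replace `overlaps` if any shared base is too permissive.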