Chunk #2 — CNV discovery and genotyping

Source: Origins and functional impact of copy number variation in the human genome.
Embedded: yes

Text

As part of a CNV association study conducted by the Wellcome Trust Case Control Consortium (WTCCC), the WTCCC designed a CNV-typing array in collaboration with the other co-authors of this paper, for which a preliminary version of our discovery data was shared with the WTCCC at an early stage. The array used the Agilent CGH platform and comprised 105,000 long oligonucleotide probes. Its targets include 10,819 of the 11,700 candidate CNV loci (92%) and 375 other loci from published CNV surveys, including 292 new sequence insertions (Supplementary Methods)5,18. To perform large-scale validation of candidate CNVs, we ran each of the 41 DNA samples used in the discovery phase of this study on the CNV-typing array against a pooled reference sample, to minimize reference-specific artefacts. By measuring, at each locus, the correlation between the discovery data and the CNV-typing data across the same samples, we could distinguish probable false positives from true CNVs (Supplementary Methods). Using this approach, we estimated the false discovery rate to be 15%, in good agreement with the estimate obtained from the much smaller set of independent validation experiments using qPCR.
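The per-locus validation logic described above can be illustrated with a minimal sketch. It is not the procedure from the Supplementary Methods, which is not reproduced here. It assumes that each locus has one log2-ratio value per sample on both platforms, that a Pearson correlation is the comparison statistic, and that a hypothetical cut-off `min_r = 0.5` separates validated loci from probable false positives. The function names, the cut-off and the toy data are all illustrative.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation of two equal-length sequences (0.0 if either is constant)."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("need two equal-length sequences with at least 2 values")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        return 0.0
    return sxy / sqrt(sxx * syy)


def classify_loci(discovery, typing, min_r=0.5):
    """Map each locus to (r, validated).

    `discovery` and `typing` map a locus name to per-sample log2 ratios,
    listed in the same sample order. `min_r` is an assumed cut-off.
    """
    results = {}
    for locus, discovery_values in discovery.items():
        r = pearson(discovery_values, typing[locus])
        results[locus] = (r, r >= min_r)
    return results


def false_discovery_rate(results):
    """Fraction of candidate loci that failed validation."""
    n_failed = sum(1 for _, validated in results.values() if not validated)
    return n_failed / len(results)


# Toy data (invented for illustration): five samples, two candidate loci.
discovery = {
    "locus_A": [0.0, 0.5, -0.6, 0.0, 0.5],
    "locus_B": [0.0, 0.4, 0.0, -0.5, 0.0],
}
typing = {
    "locus_A": [0.1, 0.6, -0.5, 0.0, 0.4],      # tracks the discovery signal
    "locus_B": [0.05, -0.02, 0.03, 0.01, -0.04],  # noise only
}

results = classify_loci(discovery, typing)
fdr = false_discovery_rate(results)
```

In this toy example `locus_A` correlates strongly across platforms and is kept, while `locus_B` does not and is counted as a probable false positive. In a real analysis the cut-off would need calibration, for instance against loci with known status.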