The overlap analysis with these gold standard data sets was performed using the high-confidence CNV calls for each platform and algorithm combination. CNVs with a reciprocal overlap of at least 50% with gold standard variants were considered validated (Fig. 2c). For most platforms, at least 50% of the variants have been previously reported. There is better overlap with regions previously reported by array studies than regions originating from sequencing studies, which might be expected as all our CNV data stems from arrays. The overlap with CNVs identified by fosmid-end sequencing41 is low as most CNVs called in this work are below the detection limit of the fosmid-based approach . It is important to note that all samples included in the current study have also been interrogated in other studies represented in DGV using different platforms. This likely contributes to a higher overlap than what would be found with previously uncharacterized samples.