To determine the probability of finding multiple rare de novo CNVs at the same location in probands, we first estimated how many positions in the genome were likely to be contributing to the de novo CNVs observed in siblings. Because mutation rates for structural variation vary widely across the genome (Fu et al., 2010), some positions are more likely than others to give rise to the de novo CNVs observed in our sample. Consequently, the number of likely positions is much smaller than the total number of possible positions. We refer to the likely CNV regions as eCNVRs (effective copy number variable regions) and calculate their quantity “C” by treating the estimate as an instance of the “unseen species problem”, in which the frequency and number of observed CNV types (or species) are used to infer how many species are present in the population. Based on the observed de novo CNVs in the control sibling group, we apply the formula (Bunge and Fitzpatrick, 1993) C = c/u + g^2*d*(1-u)/u, in which c = the total number of distinct species observed; c1 = the number of singleton species; d = total number