Probes on the 20-array set were designed with a relaxed threshold for multiple matches to the reference genome, to maximize coverage and to allow screening of moderately repetitive sequences. The array data were generated at NimbleGen's Icelandic service facility. Experiments were repeated and quality-control filters were applied to improve data consistency. Data were normalized to minimize variation between experiments. Putative CNVs were then detected as chromosomal segments with unusually high or low log2 ratios of fluorescence intensity between the test and reference genomes, using the genome alteration detection analysis (GADA) algorithm (ref. 48). Further filtering was applied to reduce false positives.
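The idea of flagging segments by their log2 ratio can be illustrated with a minimal sketch. This is not the GADA algorithm and not the pipeline used here: it substitutes a simple fixed-threshold run detector, and the function names, the median-centring step, and the `threshold` and `min_probes` values are all assumptions chosen for illustration.

```python
import math
from statistics import median


def log2_ratios(test, ref):
    """Per-probe log2(test / reference) intensity ratios."""
    return [math.log2(t / r) for t, r in zip(test, ref)]


def median_center(ratios):
    """Shift ratios so that their median is zero (a simple stand-in
    for normalization; the real procedure is not specified here)."""
    m = median(ratios)
    return [x - m for x in ratios]


def call_segments(ratios, threshold=0.3, min_probes=3):
    """Return (start, end, state) for runs of at least min_probes
    consecutive probes above +threshold ('gain') or below -threshold
    ('loss'). The end index is exclusive. Both parameters are
    hypothetical illustration values."""
    segments = []
    start, state = 0, None
    # A trailing 0.0 sentinel closes any run that reaches the last probe.
    for i, x in enumerate(list(ratios) + [0.0]):
        if x > threshold:
            s = "gain"
        elif x < -threshold:
            s = "loss"
        else:
            s = None
        if s != state:
            if state is not None and i - start >= min_probes:
                segments.append((start, i, state))
            start, state = i, s
    return segments
```

Given probe intensities ordered by chromosomal position, `call_segments(median_center(log2_ratios(test, ref)))` returns candidate gain and loss intervals as probe index ranges. A segmentation method such as GADA estimates breakpoints and segment amplitudes jointly rather than thresholding each probe, so this sketch only conveys the shape of the input and output.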