To assess whether, and to what extent, the choice of reference sample affected CNV calling, we used two different reference samples, each comprising 106 parental pairs. Reference Sample 1 was a random sample of all parental pairs, whereas Reference Sample 2 comprised the 106 parental pairs with the lowest mean BMI standard deviation score values among all non-obese parental pairs (BMI percentile < 90th) in our sample (see Supplementary Material, Table S1 for details). In both GWAS discovery samples, we performed CNV calling against each of the two reference samples and carried forward only those variants that were consistently called with both approaches (see QC and association testing section).
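
The selection and filtering logic described above can be illustrated with a minimal Python sketch. It is not the pipeline used in the study. All names (`ParentalPair`, `random_reference`, `lean_reference`, `consistent_calls`) are hypothetical, and several details are assumptions made only for illustration: that the mean BMI standard deviation score of a pair is the average of the two parents' scores, that a pair counts as non-obese when both parents fall below the 90th percentile, and that a call is "consistent" when the same sample, position, and copy number are reported against both reference samples. The actual consistency criteria are those given in the QC and association testing section.

```python
import random
from dataclasses import dataclass


@dataclass(frozen=True)
class ParentalPair:
    """Hypothetical record for one parental pair."""
    pair_id: str
    mother_bmi_sds: float
    father_bmi_sds: float
    mother_bmi_pct: float
    father_bmi_pct: float

    @property
    def mean_bmi_sds(self) -> float:
        # Assumption: pair-level score is the mean of both parents' scores.
        return (self.mother_bmi_sds + self.father_bmi_sds) / 2


def random_reference(pairs, n=106, seed=0):
    """Reference Sample 1: n pairs drawn at random without replacement."""
    rng = random.Random(seed)
    return rng.sample(list(pairs), n)


def lean_reference(pairs, n=106, pct_cutoff=90.0):
    """Reference Sample 2: the n non-obese pairs with the lowest mean BMI SDS."""
    # Assumption: both parents must be below the percentile cutoff.
    non_obese = [
        p for p in pairs
        if p.mother_bmi_pct < pct_cutoff and p.father_bmi_pct < pct_cutoff
    ]
    return sorted(non_obese, key=lambda p: p.mean_bmi_sds)[:n]


def consistent_calls(calls_ref1, calls_ref2):
    """Keep only calls reported identically against both reference samples.

    Each call is a hashable tuple such as
    (sample_id, chromosome, start, end, copy_number).
    Assumption: exact equality stands in for the study's consistency criteria.
    """
    return set(calls_ref1) & set(calls_ref2)
```

In this sketch, CNV calling itself is left to whatever calling software is used. Only the choice of the two reference samples and the intersection of the resulting call sets are shown.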