CG1 and CG2 share 34 and 125 samples with 1000GP, respectively. We use the genotypes of these samples to measure discordance with the 1000GP call sets. Since the CG genotypes were derived from an average coverage of 80×, we assume that they are accurate and treat them as the truth in the validation process. We define discordance as the percentage of these CG genotypes that are miscalled by a given method (Beagle, Thunder or SHAPEIT). We measure both the overall (ALL) discordance and the discordance at genotypes with at least one non-reference allele (ALT). In all discordance measures, we systematically exclude all genotypes at SNPs included on the Omni2.5M chip.
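The two discordance measures can be illustrated with a minimal sketch. It is not the pipeline used in this study. The function name `discordance`, the encoding of genotypes as non-reference allele counts (0, 1 or 2), and the boolean `on_chip` mask marking Omni2.5M SNPs are assumptions made for illustration. The sketch also assumes that the ALT subset is selected on the CG (truth) genotype, which the text does not state explicitly.

```python
from typing import Sequence, Tuple


def discordance(
    truth: Sequence[int],
    called: Sequence[int],
    on_chip: Sequence[bool],
) -> Tuple[float, float]:
    """Return (ALL, ALT) discordance as percentages.

    truth   -- CG genotypes, encoded as non-reference allele counts (assumed encoding)
    called  -- genotypes called by the method under evaluation, same encoding
    on_chip -- True where the SNP is on the genotyping chip and must be excluded
    """
    n_all = n_all_err = 0
    n_alt = n_alt_err = 0
    for t, c, chip in zip(truth, called, on_chip):
        if chip:
            continue  # genotypes at chip SNPs are excluded from every measure
        miscalled = t != c
        n_all += 1
        n_all_err += miscalled
        if t > 0:  # assumption: ALT subset is defined on the truth genotype
            n_alt += 1
            n_alt_err += miscalled
    all_pct = 100.0 * n_all_err / n_all if n_all else float("nan")
    alt_pct = 100.0 * n_alt_err / n_alt if n_alt else float("nan")
    return all_pct, alt_pct


# Toy data: six genotypes, one of which lies at a chip SNP.
truth = [0, 0, 1, 2, 1, 0]
called = [0, 0, 1, 1, 0, 0]
on_chip = [False, False, False, False, True, False]
print(discordance(truth, called, on_chip))  # (20.0, 50.0)
```

In the toy data, five genotypes remain after the chip SNP is dropped. One of them is miscalled, giving an ALL discordance of 20%, and that error falls among the two non-reference truth genotypes, giving an ALT discordance of 50%.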