To identify poor-quality markers, we used statistical tests designed primarily to check for consistency across experimental factors, such as array or batch (see Methods; Extended Data Table 4). As a result of these tests, we set 0.97% of all genotype calls made by Affymetrix to missing. We identified poor-quality samples using two metrics, missing rate and heterozygosity adjusted for population structure (Extended Data Fig. 1), because extreme values in one or both of these metrics can indicate poor sample quality due to, for example, DNA contamination (ref. 15). We identified 968 such samples (0.2%) and provide this list to researchers.
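
As an illustration of the two sample-level metrics, the following minimal Python sketch computes a per-sample missing rate and heterozygosity and flags extreme samples. It is not the pipeline used in this study: the function names, the genotype encoding (0, 1 or 2 alternate alleles, `None` for a missing call), and the thresholds `max_missing` and `het_mads` are assumptions chosen for illustration, and the sketch does not adjust heterozygosity for population structure. It also assumes that at least one sample has a non-missing call.

```python
from statistics import median


def sample_metrics(calls):
    """Return (missing_rate, heterozygosity) for one sample.

    `calls` holds one genotype per marker: 0, 1 or 2 copies of the
    alternate allele, or None for a missing call (assumed encoding).
    """
    observed = [g for g in calls if g is not None]
    missing_rate = 1 - len(observed) / len(calls)
    # Heterozygosity: share of non-missing calls that are heterozygous.
    if observed:
        het = sum(g == 1 for g in observed) / len(observed)
    else:
        het = float("nan")
    return missing_rate, het


def flag_samples(genotypes, max_missing=0.05, het_mads=3.0):
    """Return the set of sample IDs with extreme metrics.

    `genotypes` maps sample ID -> list of calls. Both thresholds are
    hypothetical defaults, not values taken from the study.
    """
    metrics = {s: sample_metrics(c) for s, c in genotypes.items()}
    # NaN != NaN, so `h == h` drops samples with no observed calls.
    hets = [h for _, h in metrics.values() if h == h]
    centre = median(hets)
    # Median absolute deviation as a robust measure of spread.
    mad = median(abs(h - centre) for h in hets)
    flagged = set()
    for sample, (miss, het) in metrics.items():
        too_missing = miss > max_missing
        het_outlier = het != het or abs(het - centre) > het_mads * mad
        if too_missing or het_outlier:
            flagged.add(sample)
    return flagged
```

In practice, heterozygosity would first be adjusted for ancestry, for example by regressing it on principal components, before outliers are called. The median-based rule here is only a stand-in for that step.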