We identified poor-quality samples using two metrics, missing rate and heterozygosity, computed on a set of 605,876 high-quality autosomal markers that were typed on both arrays (see Supplementary Information for criteria). Extreme values in one or both of these metrics can indicate poor sample quality due to, for example, DNA contamination (ref. 15). The heterozygosity of a sample (the fraction of non-missing markers that are called heterozygous) can also be sensitive to natural phenomena, including population structure, recent admixture and parental consanguinity. We took extra measures to avoid misclassifying good-quality samples because of these effects. For example, we adjusted heterozygosity for population structure by fitting a linear regression model with the first six principal components from a PCA as predictors (Extended Data Fig. 1). Using this adjustment, we identified 968 samples with unusually high heterozygosity or >5% missing rate (Supplementary Information). A list of these samples is provided as part of the data release.
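To make the two metrics and the adjustment concrete, the following Python sketch computes per-sample missing rate and heterozygosity, regresses heterozygosity on principal components by ordinary least squares, and flags samples. It is an illustration under stated assumptions, not the pipeline used for the data release. The genotype coding (0/1/2 alternate-allele counts, -1 for missing), the function names and the `het_cutoff` parameter are all hypothetical. The cutoff for "unusually high" heterozygosity is not given in this section, so it is left to the caller.

```python
import numpy as np


def sample_qc_metrics(genotypes):
    """Per-sample missing rate and heterozygosity.

    Assumes `genotypes` is an (n_samples, n_markers) integer array coded as
    0/1/2 alternate-allele counts with -1 for a missing call (an assumed
    encoding, not one taken from the source).
    """
    missing = genotypes < 0
    missing_rate = missing.mean(axis=1)
    n_called = (~missing).sum(axis=1)
    n_het = (genotypes == 1).sum(axis=1)
    # Heterozygosity: fraction of non-missing markers called heterozygous.
    # Samples with no called markers get NaN and should be dropped before
    # the regression step.
    het = np.where(n_called > 0, n_het / np.maximum(n_called, 1), np.nan)
    return missing_rate, het


def heterozygosity_residuals(het, pcs):
    """Residual heterozygosity after a linear fit on the given PCs.

    `pcs` is an (n_samples, k) array of principal-component scores; the
    text uses the first six (k = 6). An intercept column is added here.
    """
    design = np.column_stack([np.ones(len(het)), pcs])
    beta, *_ = np.linalg.lstsq(design, het, rcond=None)
    return het - design @ beta


def flag_samples(missing_rate, het_residual, het_cutoff, max_missing=0.05):
    """Flag samples with high adjusted heterozygosity or >5% missing rate.

    `het_cutoff` is a hypothetical, caller-supplied threshold on the
    residual; the source does not state its value in this section.
    """
    return (het_residual > het_cutoff) | (missing_rate > max_missing)
```

In use, one would call `sample_qc_metrics` on the genotype matrix, pass the resulting heterozygosity and the first six PC scores to `heterozygosity_residuals`, and then apply `flag_samples`. Working with residuals rather than raw heterozygosity means a sample is compared with others of similar ancestry, which is the purpose of the adjustment described in the text.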