Samples are designated for exclusion if they are of questionable identity (e.g. unresolved gender mismatch) or of unacceptable quality (e.g. appear to be contaminated). All remaining samples are posted on dbGaP, but we recommend that filters are applied prior to association testing. In some cases, the filters apply only to certain chromosomes of a sample (e.g. chromosome aberrations). We recommend filtering out samples with an overall missing call rate greater than 2% and those that are PCA outliers from all major ethnic groups in the study. The percentages of data lost by application of the sample filters is 1.6% for Addiction, 0.5% for Lung Cancer, 1.6% for T2D NHS and 0.5% for T2D HPFS.