To quantify how well the exome sequence data represent each protein-coding gene across all individuals, we estimated informativeness statistics for each studied gene on the basis of its sequencing coverage across the available exomes (Supplementary Methods, Supplementary Table 24). In addition, we created a dummy phenotype for each of the four exome sequence delivery batches to identify, and exclude from analyses, genes and variants that reflected sequencing batch effects; we provide these genes and variants as a cautionary list for other UKB exome researchers (Supplementary Methods, Supplementary Tables 25–27).
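
As an illustration of the batch dummy phenotypes described above, the following is a minimal sketch rather than the study's actual pipeline. It assumes each sample carries an integer delivery-batch label from 1 to 4; the function name `batch_dummy_phenotypes` and the sample identifiers are hypothetical.

```python
from typing import Dict


def batch_dummy_phenotypes(
    sample_batch: Dict[str, int], n_batches: int = 4
) -> Dict[int, Dict[str, int]]:
    """Build one binary (0/1) phenotype per delivery batch.

    For batch k, a sample is coded 1 if it was delivered in batch k
    and 0 otherwise. Batch labels are assumed to run from 1 to n_batches.
    """
    phenotypes: Dict[int, Dict[str, int]] = {}
    for batch in range(1, n_batches + 1):
        phenotypes[batch] = {
            sample: int(label == batch)
            for sample, label in sample_batch.items()
        }
    return phenotypes


# Hypothetical sample-to-batch assignments (illustration only).
sample_batch = {"S1": 1, "S2": 2, "S3": 2, "S4": 4}
dummies = batch_dummy_phenotypes(sample_batch)
```

Under this encoding, each indicator could be analysed like any other binary trait; a gene or variant that associates with a batch indicator would then be a candidate for the cautionary list, since batch membership is not expected to have a biological basis.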