We processed 998 TB of raw exome sequence data from 302,355 UKB participants through a cloud-based bioinformatics pipeline (Methods). Through stringent quality control, we removed samples with low sequencing quality and from closely related individuals (Methods). To harmonize variable categorization modes, scaling, and follow-up responses inherent to the phenotype data, we developed PEACOK, a modification of the PHESANT package24 (Methods).