We defined case and control groups by selecting individuals from the middle and extremes of the FEV1 distribution among both heavy smokers (mean 35 pack-years) and never smokers. We developed a custom array to provide optimum genome-wide coverage of common and low-frequency (MAF 1–5%) coding variants and rare (MAF <1%) coding variants relevant to the UK population; this platform also provided dense coverage of genomic regions implicated in lung health and disease. Spirometry data in UK Biobank were obtained using a Vitalograph Pneumotrac 6800 (Buckingham, UK) on at least two occasions. Sampling was done so that equal numbers of males and females were selected overall, and so that the number of individuals selected from each age–sex band was proportional to the number of individuals in that band (appendix pp 3–5). One consequence of this approach is that we enriched our sample for non-smoking individuals with airflow obstruction.
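The allocation rule described above (equal totals for each sex, with selections from each age–sex band proportional to the size of that band) can be illustrated with a minimal sketch. This is not the study's sampling code: the function name `allocate`, the band labels, and the counts are hypothetical, and the largest-remainder rounding step is an assumption added here only so that the integer allocations sum to the target. It does not model the selection from the middle and extremes of the FEV1 distribution.

```python
from math import floor


def allocate(band_sizes, n_total):
    """Split n_total equally between the sexes, then proportionally across age bands.

    band_sizes: {(sex, age_band): number of eligible individuals} (hypothetical input)
    n_total:    total number to select; assumed even, with exactly two sexes present
    """
    per_sex = n_total // 2
    allocation = {}
    for sex in sorted({s for s, _ in band_sizes}):
        bands = {k: v for k, v in band_sizes.items() if k[0] == sex}
        sex_total = sum(bands.values())
        # Exact (possibly fractional) proportional share for each band.
        exact = {k: per_sex * v / sex_total for k, v in bands.items()}
        alloc = {k: floor(x) for k, x in exact.items()}
        # Largest-remainder rounding (an assumption, not taken from the source):
        # hand the leftover places to the bands with the largest fractional parts.
        shortfall = per_sex - sum(alloc.values())
        by_remainder = sorted(exact, key=lambda k: exact[k] - alloc[k], reverse=True)
        for k in by_remainder[:shortfall]:
            alloc[k] += 1
        allocation.update(alloc)
    return allocation


# Hypothetical band sizes, for illustration only.
band_sizes = {
    ("F", "40-49"): 300, ("F", "50-59"): 500, ("F", "60-69"): 200,
    ("M", "40-49"): 250, ("M", "50-59"): 450, ("M", "60-69"): 300,
}
selected = allocate(band_sizes, 200)
```

With these illustrative counts, each sex contributes 100 selections, and within each sex the larger bands contribute proportionally more.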