In the first decade of research on the 5-HTTLPR G×E interaction, scientists have frequently taken advantage of existing data sets, quickly adding genotype data to studies that had previously measured depression and life events for other purposes. Not all of these studies’ designs and measures are well-suited to testing the G×E hypothesis. Covariation between poor measurement quality and negative findings was observed early on (16) and has been confirmed as the number of published G×E studies has grown (17). Notably, many of the largest studies in Table 1 and Table 2 were obliged to collect brief retrospective self-reports of stress through telephone interviews or postal questionnaires in order to contain data collection costs. Thus, unfortunately, large sample size tends to coincide with poor measurement quality, and meta-analyses that give larger samples greater weight in estimating an effect across studies further compound this problem. There is hope that a new generation of cohort studies purpose-built for testing G×E interactions will improve replicability, but these must correct the problems of exposure measurement discussed in the previous paragraph, lest they merely repeat them on a far larger scale.