We first used item response theory to evaluate differential item functioning (Thissen, Steinberg, & Wainer, 1993) with the IRTLRDIF software (Thissen, 2001). These analyses tested whether items functioned similarly in relation to the underlying construct of externalizing symptoms across important subgroups defined by child age (ages 2–11 vs. 12–17 for parent reports and ages 10–13 vs. 14–17 for adolescent reports), gender, and study membership (i.e., MLS vs. AFDP). To do so, we relied on a calibration sample containing one randomly selected observation for each individual from among the repeated waves of assessment (N = 1026, N = 938, and N = 966 for mother, father, and adolescent reports, respectively).2 For mother-reported symptoms, 11 items showed differential item functioning across age groups and 6 items did so across gender or study membership (with some items showing differential item functioning for more than one grouping indicator). For father-reported symptoms, 14 items showed differential item functioning across age, 12 did so across gender, and none did so across study membership. For adolescent-reported symptoms, 14 items showed differential item functioning across age, 7 did so across gender, and none did so across study membership. On the whole, the pattern of differential item functioning varied considerably across reporters.
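The construction of a calibration sample of this kind (one randomly selected observation per individual from the repeated waves) can be illustrated with a minimal sketch. This is not the procedure or software used in the study. The function name `draw_calibration_sample`, the field names `person_id` and `wave`, and the fixed seed are hypothetical and serve only to make the example reproducible.

```python
import random
from collections import defaultdict


def draw_calibration_sample(records, id_key="person_id", seed=0):
    """Return one randomly chosen record per individual.

    `records` is a list of dicts, one per person-by-wave observation.
    `id_key` names the (hypothetical) field that identifies the individual.
    """
    rng = random.Random(seed)  # fixed seed so the draw can be reproduced

    # Group the repeated observations by individual.
    by_person = defaultdict(list)
    for rec in records:
        by_person[rec[id_key]].append(rec)

    # Draw exactly one observation for each individual.
    return [rng.choice(obs) for obs in by_person.values()]


# Illustrative long-format data: three individuals, unequal numbers of waves.
long_data = [
    {"person_id": 1, "wave": 1},
    {"person_id": 1, "wave": 2},
    {"person_id": 1, "wave": 3},
    {"person_id": 2, "wave": 1},
    {"person_id": 3, "wave": 2},
    {"person_id": 3, "wave": 4},
]

calibration = draw_calibration_sample(long_data, seed=42)
```

Because each individual contributes a single row, the observations in the resulting sample are independent, which is what the item calibration and the differential item functioning tests assume.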