A meta-analysis of behavioral measures has the most power when the same reliable and valid measurement instrument is administered in all cohorts. In practice, however, different instruments are often used, and even when the instrument is the same, translations into different languages may cause problems. To address the problem that different inventories may not assess the same phenotype, we demonstrate how Item Response Theory (IRT) test linking can be applied to map item data from different inventories onto a common metric. We conduct such an analysis for the personality traits Neuroticism and Extraversion, based on data from the Genetics of Personality Consortium (GPC). If different inventories indeed measure the same phenotype, the only requirement for this approach is that multiple inventories have been administered to at least a subset of individuals. That is, to harmonize across inventories, some participants must have completed more than one inventory so that they can serve as a “bridge” between inventories. This is possible if we assume that the true phenotype (personality) does not change between the multiple assessments. If this