We assessed whether the IRT Neuroticism and Extraversion scores in the 23 cohorts were truly independent of the specific inventory used. First, the appropriateness of linking tests within cohorts was investigated by testing two basic assumptions of IRT models: the idea that scoring is independent of the specific item set that was administered (local independence), and unidimensionality. For every cohort and every inventory separately, item parameters were estimated based on data from individuals without missing data. Such a set of parameter values for a particular sample of items assessed in a particular sample of individuals is termed a calibration. Calibrations were also obtained for combinations of item sets from various inventories if there was a subsample of individuals who had been assessed with those inventories. Based on these calibrations (i.e., sets of item parameter values), latent scores can be estimated for those individuals for whom one has either complete data or data with some missing values, assuming these are missing at random. To investigate local independence, latent scores for a particular item set (say, item scores for NEO-PI-R) were estimated and compared