The current study has several limitations: we did not include all items in cases of repeated measures, we assumed item data to be missing at random (Little and Rubin 1989), and we assigned items to Neuroticism or Extraversion in advance rather than letting the data drive this choice. Future extensions of the IRT linking approach may address these issues. Note also that our method addresses the harmonization of continuously distributed data. Harmonization of case–control status generally requires a different approach, but where diagnosis is based on cut-off scores on continuous measures (e.g. a symptom count), IRT models could prove helpful; such models are also used in educational measurement to compare pass/fail decisions where students are assessed differently.