After the factor structure of the set of DSM-IV items was investigated, Item Response Theory (IRT) models were used to study the latent alcohol use disorder construct separately in probability samples of ED patients from four countries. The aim was to replicate the finding of an underlying common structure[15] and, if it was replicated, to determine whether individual heavy drinking measures improve the alcohol use disorder criteria by adding a criterion in the low-to-medium range of the disorder spectrum that is applicable to all four countries. IRT analysis implemented in Mplus[21] was used to derive two main parameters: threshold and discrimination. The threshold parameter reflects the “severity” of a criterion, with more severe criteria being those less frequently endorsed by respondents. The discrimination parameter (slope) measures how well a criterion distinguishes between respondents at lower and higher levels of the disorder continuum. Plots of both parameters were used as graphical aids. Finally, differential item functioning (DIF) analysis was performed in PARSCALE[22] to test whether the probabilities of responding in different categories of consumption differed by population for the same