Dataset properties strongly affect PGS estimability and generalizability. The choice of dataset is especially crucial when the objective is to identify high-risk and low-risk individuals in general populations. Cohorts ascertained to study AUD aim to recruit more cases; thus, a large percentage of their participants have a high PGS. Consequently, the PGS exhibits high estimability within these cohorts but lower estimability in cohorts without such targeted recruitment. Additionally, because many of these cohorts aim to recruit controls that are otherwise similar to cases, many controls likely have alcohol use problems that are not severe enough for an AUD diagnosis. Furthermore, some controls do not have AUD but may have other substance use disorders. Including these individuals as controls reduces the estimability of the PGS. In our study, we used AUD cohorts for screening. AUD status was determined using DSM-IV or DSM-5 criteria, which enhances diagnostic precision. Moreover, we excluded from the control group individuals who had alcohol use problems but did not meet AUD criteria, as well as individuals with other substance use disorders, thereby increasing statistical power. All of these approaches maximize our