Three independent datasets from the database of Genotypes and Phenotypes (dbGaP) were used to replicate significant findings from the primary, secondary, and tertiary analyses: the Study of Addiction: Genetics and Environment (SAGE, non-overlapping individuals only, phs000092.v1.p1, https://www.ncbi.nlm.nih.gov/projects/gap/cgi-bin/study.cgi?study_id=phs000092.v1.p1), the Alcohol Dependence GWAS in European and African Americans (Yale-Penn, phs000425.v1.p1, https://www.ncbi.nlm.nih.gov/projects/gap/cgi-bin/study.cgi?study_id=phs000425.v1.p1), and the Australian Twin-family Study of Alcohol Use Disorder (OZALC, phs000181.v1.p1, https://www.ncbi.nlm.nih.gov/projects/gap/cgi-bin/study.cgi?study_id=phs000181.v1.p1). Genotypic data from these samples were combined with genotypic data from the COGA samples to identify individuals who appeared in more than one dataset; overlapping subjects were retained in the COGA discovery GWAS and excluded from the replication samples. Ancestry in the combined replication sample was determined in a manner similar to that used for COGA. AD was defined similarly, with unaffected individuals who had alcohol abuse or other substance dependence excluded. The secondary (DSM-IV AD criterion count) and tertiary (individual criteria) phenotypes were also coded identically. In each replication attempt, only the identical phenotype was tested in the replication cohort (e.g., for a variant that was GWS for one criterion but not others, only association with that criterion was tested