SNPs identified in the primary analyses of the family-based portion of the study may have failed to replicate in the independent COGA and SAGE GWAS individuals because of sampling differences between the GWAS samples and the family-based association sample. One possibility is that individuals in the high-density family-based sample are more severely affected than those in a case-control sample and therefore differ in underlying genetic etiology. Mean DSM-IV symptom counts for AD were similar across the COGA high-density family-based sample (mean = 5.26, SD = 1.48), the SAGE GWAS sample (mean = 4.87, SD = 1.51), and the COGA GWAS sample (mean = 5.56, SD = 1.43). However, severity of alcohol dependence may differ in ways that criterion counts do not capture, such as the severity of the symptoms themselves (including the extent of tolerance and withdrawal), the duration of symptoms, and the number of episodes. We combined the COGA and SAGE samples before subsampling so that the discovery and validation sets would have similar population structure.
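
The idea of pooling the samples before subsampling can be illustrated with a minimal sketch. The text does not specify how the subsampling was carried out, so everything below is an assumption made for illustration: the function name `split_discovery_validation`, the toy records, the `ancestry` field, the 50/50 split, and the choice to randomize within strata defined by study and ancestry label. It shows only one simple way a pooled sample could be divided so that both sets have a similar composition.

```python
import random
from collections import defaultdict


def split_discovery_validation(individuals, stratum_key, frac_discovery=0.5, seed=0):
    """Randomly split pooled individuals into two sets within each stratum.

    Hypothetical helper: splitting within strata keeps the proportion of each
    stratum roughly equal in the discovery and validation sets.
    """
    rng = random.Random(seed)

    # Group individuals by stratum (here: study of origin and ancestry label).
    strata = defaultdict(list)
    for person in individuals:
        strata[stratum_key(person)].append(person)

    discovery, validation = [], []
    for key in sorted(strata):
        members = list(strata[key])
        rng.shuffle(members)
        cut = round(len(members) * frac_discovery)
        discovery.extend(members[:cut])
        validation.extend(members[cut:])
    return discovery, validation


# Toy records standing in for the two GWAS samples (not real data).
coga = [{"id": f"COGA{i}", "study": "COGA", "ancestry": "EA" if i % 2 else "AA"}
        for i in range(100)]
sage = [{"id": f"SAGE{i}", "study": "SAGE", "ancestry": "EA" if i % 3 else "AA"}
        for i in range(200)]

# Pool first, then subsample, so both sets draw from the same mixture.
combined = coga + sage
discovery, validation = split_discovery_validation(
    combined, stratum_key=lambda p: (p["study"], p["ancestry"])
)
```

Because the split is made after pooling and within each stratum, the two sets contain nearly the same share of each study and ancestry group, which is the property the combined subsampling is meant to secure.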