Genomes Project. Although concordance between self-reports and the racial/ethnic groups defined by extended PCA was very high (~95%), the findings indicate that empirical clustering methods provide an incremental increase in racial/ethnic group homogeneity and reduce the marker loss and sample loss caused by ‘unknown’ race/ethnicity. Although this is a ‘methods’ paper, other such techniques exist, and more will likely be developed as newer analytic methods emerge. PCA-derived ethnicity phenotypes have already been applied to data sets from a variety of diseases in admixed populations, although rarely in psychiatric/addiction samples.