
Chunk #9 — MATERIALS AND METHODS — Genotyping and Quality Control

Source
Genome-wide association study of cocaine dependence and related traits: FAM53B identified as a risk gene.

Text

To verify and correct misclassification of self-reported race, we compared the GWAS data from all subjects with genotypes from the HapMap 3 reference CEU, YRI, and CHB populations. Principal component (PC) analysis was conducted in the discovery GWAS sample using Eigensoft [13, 14] and 145,472 SNPs that were common to the GWAS dataset and the HapMap panel (after pruning the GWAS SNPs for linkage disequilibrium, r² > 80%) in each sample, deriving 10 PCs for each individual to characterize the underlying genetic architecture. The PCs were used to distinguish EAs from AAs with a K-means (K=2) clustering algorithm [15], and the two groups were analyzed separately. Because many subjects self-identified as EA Hispanic or AA Hispanic, the PC analyses were repeated within the AA and EA groups, and the first three PCs from each within-group analysis were used in all subsequent analyses to correct for residual population stratification within that group [7].
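The clustering step described above can be illustrated with a minimal sketch. It is not the authors' implementation: the study derived its PCs with Eigensoft and used the K-means algorithm of its reference 15, whereas this example assumes the PC scores are already available as plain lists, simulates toy scores for two groups, and runs a basic Lloyd-style K-means with K=2. The function names `simulate_pc_scores` and `kmeans_two_clusters`, the simulated values, and the choice to seed the centroids from the samples with the lowest and highest PC1 score are all assumptions made for illustration.

```python
import random


def simulate_pc_scores(n_per_group=20, n_pcs=10, seed=0):
    """Toy stand-in for real PC scores: two groups separated along PC1 only."""
    rng = random.Random(seed)
    scores = []
    for pc1_center in (-0.1, 0.1):
        for _ in range(n_per_group):
            row = [rng.gauss(pc1_center, 0.01)]
            row += [rng.gauss(0.0, 0.01) for _ in range(n_pcs - 1)]
            scores.append(row)
    return scores


def kmeans_two_clusters(points, max_iter=100):
    """Lloyd's algorithm with K=2 on equal-length PC score vectors.

    Returns (labels, centroids), where labels[i] is 0 or 1.
    """
    # Assumption: deterministic seeding from the extreme PC1 samples,
    # so cluster 0 is always the one starting at the lowest PC1 score.
    centroids = [
        list(min(points, key=lambda p: p[0])),
        list(max(points, key=lambda p: p[0])),
    ]
    labels = None
    for _ in range(max_iter):
        # Assignment step: nearest centroid by squared Euclidean distance.
        new_labels = []
        for p in points:
            dists = [sum((a - b) ** 2 for a, b in zip(p, c)) for c in centroids]
            new_labels.append(0 if dists[0] <= dists[1] else 1)
        if new_labels == labels:
            break  # assignments are stable, so the algorithm has converged
        labels = new_labels
        # Update step: each centroid becomes the mean of its members.
        for k in (0, 1):
            members = [p for p, lab in zip(points, labels) if lab == k]
            if members:
                centroids[k] = [sum(col) / len(members) for col in zip(*members)]
    return labels, centroids


if __name__ == "__main__":
    pcs = simulate_pc_scores()
    labels, _ = kmeans_two_clusters(pcs)
    print("cluster sizes:", labels.count(0), labels.count(1))
```

In an analysis following the paragraph above, the two resulting clusters would then be analyzed separately, with PCs recomputed inside each cluster and the first three retained as covariates; that second stage is not shown here.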