Chunk #3 — Materials and methods — Subjects, genotyping, and imputation

Source: A regulatory variant of CHRM3 is associated with cannabis-induced hallucinations in European Americans.
Embedded: yes

Text

in each cohort was carried out using PLINK1.9 [7] based on the following criteria: (1) individual genotype missing rate < 2%, (2) SNP genotype missing rate < 2%, (3) Hardy–Weinberg P > 1 × 10⁻⁶, and (4) minor allele frequency (MAF) > 3%. After QC, samples from Yale-Penn 1 and 2 were subjected to ancestry analysis by comparison with the 1000 Genomes Project phase 1 reference panel [8]. Eigensoft [9] was used for principal component (PC) analysis, with the first ten PC scores serving to differentiate EAs and AAs through K-means clustering [10]. For each Yale-Penn cohort, SNPs from EAs and AAs were imputed together using Minimac3, implemented in the Michigan Imputation Server [11], with the 1000 Genomes phase 3 reference panel. We converted dosage data into best-estimate genotypes using PLINK1.9, retaining only high-quality imputed genotypes with genotype imputation probability (GP) ≥ 0.9; the resulting genotype data were converted to PLINK binary format, which can be used directly in association tests with the GWAS software Genome-wide Efficient Mixed Model Association (GEMMA) [12]. After retaining genotypes with GP ≥ 0.9, individual genotyping missing rate < 5%, MAF > 3%, and missing call frequency < 5%, there were 8,200,853 and 5,916,265 remaining variants for