Chunk #11 — Methods and procedure — Statistical analyses

Source: Genome-wide association study of recurrent early-onset major depressive disorder.
Embedded: yes

Text

Association between single genotyped SNPs and case-control status was tested with logistic regression (trend test) using PLINK. Genotypic dosages (the estimated number of test alleles) were imputed for all HapMap II SNPs with MACH 1.0 software [25] for autosomal SNPs and with IMPUTE [26] for X chromosome SNPs, using a hidden Markov model algorithm and a training dataset consisting of phased HapMap CEU haplotypes. After filtering for MAF > 1% and imputation r² > 0.3 (an estimate of expected agreement between imputed and actual genotypes), this provided 1,892,186 imputed SNPs (1,849,062 autosomal and 43,124 X chromosome SNPs) for testing in addition to the genotyped SNPs. This r² threshold was used in four previous GWAS meta-analyses because it removed most poorly imputed SNPs but few well-imputed SNPs [27–30]. Association tests for imputed SNPs were carried out with local software using the same logistic regression model. For all tests, ancestry-informative principal components were included as covariates [24]. Each SNP was tested in all subjects, and then separately in males and in females. For the primary analysis of all subjects, a reasonable threshold for 5% genome-wide significance is