Chunk #12 — Methods — Association analyses

Source: Genome-wide and candidate gene association study of cigarette smoking behaviors.
Embedded: yes

Text

Continuous phenotypes were log-transformed to achieve approximate normality, and SNP genotypes were coded as counts of minor alleles. For each study, we defined any phenotype value more than three standard deviations from the mean to be an outlier. Outliers above the mean were truncated to the 99th percentile of the raw distribution, and outliers below the mean to the 1st percentile. We tested for association between each SNP marker and each continuous phenotype using linear regression, adjusted for study center (PLCO) or geographic region (NHS); age at smoking assessment in five-year bins (baseline for PLCO or last available follow-up for NHS); marital status (married versus not; PLCO) or living arrangement (living alone or with others; NHS); education (four categories in PLCO, three in NHS); prostate (PLCO) or breast (NHS) cancer case-control status; and selected principal components of genetic variation. For binary traits, we used unconditional logistic regression, adjusted for the same covariates. These tests were conducted separately for PLCO and NHS. For SNPs that passed QC filters and had minor allele frequency above 1% in both studies, we combined evidence for association across