Chunk #14 — Methods — Statistical analysis

Source: Gene-based polygenic risk scores analysis of alcohol use disorder in African Americans.
Embedded: yes

Text

Because COGA, SAGE, and YalePenn include related individuals, generalized linear mixed models with a random effect were used to account for family relationships. For Indiana Biobank, a cohort of unrelated individuals, logistic regression models were used. We also stratified individuals by PRS decile and compared each decile to the bottom decile. Because COGA, SAGE, and YalePenn individually had too few individuals in each decile, we combined all three target datasets for the stratified analyses. For all models, sex and the first 10 PCs were included as covariates. For the combined analysis of COGA, SAGE, and YalePenn data, we also included the cohort indicator as an additional covariate. For each of PRSgene, PRSintergenic, and PRSall, associations with P-values < 0.05 across all three target datasets were considered statistically significant.
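
The decile stratification described above can be illustrated with a minimal sketch. It is not the authors' code: it ranks individuals by PRS, splits them into ten near-equal groups, and computes an unadjusted odds ratio for each decile against the bottom decile. The models in this study additionally adjusted for sex, the first 10 PCs, cohort, and family relatedness, none of which is modeled here. The function names, the simulated data, and the 0.5 continuity correction for empty cells are assumptions made for illustration.

```python
import math
import random


def assign_deciles(scores):
    """Return a rank-based decile label (1 = lowest, 10 = highest) per score."""
    n = len(scores)
    order = sorted(range(n), key=lambda i: scores[i])
    deciles = [0] * n
    for rank, i in enumerate(order):
        deciles[i] = rank * 10 // n + 1
    return deciles


def odds_ratios_vs_bottom(scores, is_case):
    """Unadjusted odds ratio of each decile (2..10) versus the bottom decile.

    Assumption: 0.5 is added to every cell so that deciles with no cases
    or no controls still yield a finite ratio.
    """
    counts = {d: [0, 0] for d in range(1, 11)}  # decile -> [controls, cases]
    for d, c in zip(assign_deciles(scores), is_case):
        counts[d][int(c)] += 1

    def odds(d):
        controls, cases = counts[d]
        return (cases + 0.5) / (controls + 0.5)

    reference = odds(1)
    return {d: odds(d) / reference for d in range(2, 11)}


# Simulated example (hypothetical data): risk rises with the score.
random.seed(0)
scores = [random.gauss(0, 1) for _ in range(1000)]
is_case = [random.random() < 1 / (1 + math.exp(-s)) for s in scores]
for decile, ratio in odds_ratios_vs_bottom(scores, is_case).items():
    print(decile, round(ratio, 2))
```

In a covariate-adjusted analysis, the decile labels would instead enter a regression model as indicator variables with the bottom decile as the reference level.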