Chunk #65 — Methods — Statistical analyses — Polygenic risk score analyses

Source: Genome-wide association study implicates CHRNA2 in cannabis use disorder.
Embedded: yes

Text

PRS analyses were done using GWAS summary statistics from 22 GWASs (Supplementary Table 11). The summary files were downloaded from public databases and processed using the munge script, which is part of the LD score regression software74. All variants with INFO < 0.9, MAF < 0.01, missing values, out-of-bounds P-values, strand-ambiguous alleles or duplicated rs-ids were removed using the munge script. In addition, multi-allelic variants and insertions and deletions (indels) were removed. The processed summary files were then LD-clumped using Plink, with the following parameter settings: --clump-p1 1 --clump-p2 1 --clump-r2 0.1 --clump-kb 500. The clumped files were used as the training datasets. Genetic risk scores were estimated for all individuals in the target sample (CUD cases and the control group) from the genotype dosages using Plink’s ‘--score’ method with default arguments, at the following P-value thresholds for SNP inclusion: 5 × 10^-8, 1 × 10^-6, 1 × 10^-4, 1 × 10^-3, 0.01, 0.05, 0.1, 0.2, 0.5 and 1.0. However, the PRS for ADHD were generated using the approach described by Demontis et al.33. For each P-value threshold the variance in the phenotype explained by PRS
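The thresholding and scoring step described above can be illustrated with a minimal Python sketch. It is not the authors' pipeline and does not reproduce Plink's '--score' implementation (in particular, Plink's handling of missing genotypes and score normalization is left out). It only shows the underlying idea: for each P-value threshold, sum dosage times effect size over the clumped SNPs that pass the threshold. The SNP ids, effect sizes, P-values and dosages are hypothetical, and the use of an inclusive comparison (P <= threshold) is an assumption.

```python
from typing import Dict, List, Tuple

# P-value thresholds for SNP inclusion, as listed in the text.
THRESHOLDS: List[float] = [5e-8, 1e-6, 1e-4, 1e-3, 0.01, 0.05, 0.1, 0.2, 0.5, 1.0]

# Hypothetical clumped training data: SNP id -> (effect size, P-value).
# Real effect sizes would come from the processed GWAS summary files.
WEIGHTS: Dict[str, Tuple[float, float]] = {
    "rs1": (0.5, 1e-9),
    "rs2": (-0.25, 5e-4),
    "rs3": (0.125, 0.3),
}


def score_individual(
    dosages: Dict[str, float],
    weights: Dict[str, Tuple[float, float]],
    p_threshold: float,
) -> float:
    """Sum dosage * effect size over SNPs with P <= p_threshold.

    Assumption: inclusive comparison, so a threshold of 1.0 keeps every SNP.
    SNPs absent from the individual's dosages are skipped.
    """
    total = 0.0
    for snp, (effect, p_value) in weights.items():
        if p_value <= p_threshold and snp in dosages:
            total += effect * dosages[snp]
    return total


def scores_by_threshold(
    dosages: Dict[str, float],
    weights: Dict[str, Tuple[float, float]],
    thresholds: List[float] = THRESHOLDS,
) -> Dict[float, float]:
    """Return one score per P-value threshold for a single individual."""
    return {t: score_individual(dosages, weights, t) for t in thresholds}


if __name__ == "__main__":
    # Hypothetical genotype dosages (expected effect-allele counts, 0 to 2).
    person = {"rs1": 2.0, "rs2": 1.0, "rs3": 1.0}
    for threshold, score in scores_by_threshold(person, WEIGHTS).items():
        print(f"P <= {threshold:g}: score = {score:.3f}")
```

With these toy values, the score changes only when a threshold is loose enough to admit another SNP, which is why scores are computed across a range of thresholds instead of at a single cutoff.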