Chunk #20 — Materials and Methods — Data Analysis — Polygenic score creation

Source: Polygenic Risk Score Prediction of Alcohol Dependence Symptoms Across Population-Based and Clinically Ascertained Samples.
Embedded: yes

Text

Using PRSice, the list of common SNPs was pruned on the basis of linkage disequilibrium (LD) to obtain, for each discovery-validation pair, a set of autosomal SNPs in approximate linkage equilibrium (R² < .10) within a sliding 250 kb window. PLINK’s clump procedure was used to prioritize SNPs with stronger association signals in the discovery GWAS as the index SNPs for these LD blocks, in order to enhance the predictive ability of the scores. PLINK’s score method then computed, for each individual in the validation sample, a weighted sum of minor-allele counts across the set of score SNPs, weighting each score SNP by the magnitude and sign of its GWAS association statistic (cf. The International Schizophrenia Consortium, 2009). For each discovery-validation sample pair, this list of score SNPs was further filtered by association p value thresholds in the discovery sample to create a series of SNP sets with decreasing stringency of nominal GWAS association (thresholds from p < .001 to p < .50). PRSice implements this thresholding at high resolution, in p value increments of 0.01, and selects the threshold yielding the scores with the strongest phenotypic prediction in the validation sample.
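The scoring and threshold-selection logic described above can be illustrated with a minimal Python sketch. This is not the PRSice or PLINK implementation: the function names, the use of squared Pearson correlation as the measure of phenotypic prediction, the exact threshold grid, and all numeric values are assumptions made for illustration only, and LD clumping is assumed to have been done beforehand.

```python
def polygenic_score(allele_counts, weights, p_values, p_threshold):
    """Weighted sum of minor-allele counts over SNPs with discovery p < p_threshold.

    allele_counts: minor-allele counts (0, 1, or 2) for one individual
    weights:       signed discovery-GWAS effect estimates, one per SNP
    p_values:      discovery-GWAS association p values, one per SNP
    """
    return sum(
        count * weight
        for count, weight, p in zip(allele_counts, weights, p_values)
        if p < p_threshold
    )


def squared_correlation(x, y):
    """Squared Pearson correlation; returns 0.0 if either input has no variance."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        return 0.0
    return sxy * sxy / (sxx * syy)


def best_threshold(genotypes, weights, p_values, phenotype, thresholds):
    """Return (threshold, r2) for the threshold whose scores best predict the phenotype."""
    best_t, best_r2 = None, -1.0
    for t in thresholds:
        scores = [polygenic_score(g, weights, p_values, t) for g in genotypes]
        r2 = squared_correlation(scores, phenotype)
        if r2 > best_r2:  # strict comparison keeps the most stringent of tied thresholds
            best_t, best_r2 = t, r2
    return best_t, best_r2


# Made-up toy data: 3 clumped SNPs, 4 validation individuals.
weights = [0.5, -0.2, 0.1]        # hypothetical discovery effect sizes
p_values = [0.0005, 0.02, 0.30]   # hypothetical discovery p values
genotypes = [[0, 1, 2], [1, 0, 1], [2, 2, 0], [1, 1, 1]]
phenotype = [-0.2, 0.5, 0.6, 0.3]

# Assumed grid: p < .001, then p < .01 to p < .50 in steps of 0.01.
thresholds = [0.001] + [round(0.01 * k, 2) for k in range(1, 51)]
chosen_t, chosen_r2 = best_threshold(genotypes, weights, p_values, phenotype, thresholds)
```

Because the threshold is chosen to maximize prediction in the same validation sample, the resulting fit is optimistic by construction; the sketch reproduces only the selection step and makes no correction for that.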