Chunk #64 — METHODS — Relatedness pruning and removal of ancestry outliers

Source: Depression pathophysiology, risk prediction of recurrence and comorbid psychiatric disorders using genome-wide analyses.
Embedded: yes

Text

Best-guess genotypes from iPSYCH2012 and iPSYCH2015i were merged, filtered and LD-pruned to a set of roughly 30K markers. Retained markers had imputation INFO score > 0.8, r² < 0.075, minor allele frequency > 0.05 and no deviation from Hardy-Weinberg proportions (P > 1 × 10⁻⁴), and were located outside the regions of long-range LD defined by Price et al. (ref. 110). Relatedness coefficients based on “identity-by-state” were estimated using plink v1.9 to identify related and duplicated samples, defined as those with π̂ > 0.2. Related individuals were removed at random, with cases preferentially retained over controls. PCA was carried out on the same set of filtered and LD-pruned SNPs, as implemented in the Ricopili pipeline (ref. 104). A subsample of European ancestry was selected as an ellipsoid in the space of PC1-3, centered and scaled using the mean and 8 standard deviations of the subsample whose grandparents were all known to be born in Denmark.
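The relatedness pruning and ancestry selection described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' code. The function names `prune_related` and `european_ancestry_mask`, their arguments, and the input layout are hypothetical. The sketch assumes that related pairs (π̂ > 0.2) have already been extracted from the plink output, that pruning proceeds greedily pair by pair, and that the ellipsoid is axis-aligned in PC1-3 with semi-axes equal to 8 standard deviations of the reference subsample; the source does not specify these details.

```python
import random

import numpy as np


def prune_related(pairs, is_case, seed=0):
    """Return the set of sample IDs to remove (greedy, pair by pair).

    pairs:   iterable of (id_a, id_b) tuples with pi-hat > 0.2 (assumed input).
    is_case: dict mapping sample ID -> True for cases, False for controls.
    """
    rng = random.Random(seed)
    removed = set()
    for a, b in pairs:
        # Skip pairs already broken by an earlier removal.
        if a in removed or b in removed:
            continue
        if is_case[a] != is_case[b]:
            # Keep the case, drop the control.
            drop = b if is_case[a] else a
        else:
            # Same status: drop one of the two at random.
            drop = rng.choice([a, b])
        removed.add(drop)
    return removed


def european_ancestry_mask(pcs, reference_mask, n_sd=8.0):
    """Return a boolean mask of samples inside the ellipsoid.

    pcs:            (n_samples, 3) array holding PC1-3.
    reference_mask: boolean array marking the reference subsample
                    (here, individuals with all grandparents born in Denmark).
    n_sd:           semi-axis length in reference standard deviations.
    """
    pcs = np.asarray(pcs, dtype=float)
    reference_mask = np.asarray(reference_mask, dtype=bool)
    ref = pcs[reference_mask]
    center = ref.mean(axis=0)
    radii = n_sd * ref.std(axis=0, ddof=1)
    # Axis-aligned ellipsoid: sum of squared scaled offsets must be <= 1.
    scaled = (pcs - center) / radii
    return (scaled ** 2).sum(axis=1) <= 1.0


if __name__ == "__main__":
    # Toy data, for illustration only.
    status = {"a": True, "b": False, "c": False, "d": True}
    print(prune_related([("a", "b"), ("c", "d")], status))

    toy_pcs = [[0, 0, 0], [1, 1, 1], [-1, -1, -1], [0.5, -0.5, 0], [100, 0, 0]]
    toy_ref = [True, True, True, True, False]
    print(european_ancestry_mask(toy_pcs, toy_ref))
```

A greedy pass is the simplest way to express "remove at random, but prefer cases"; for families with more than two related members, a strategy that removes the individual with the most relatives first would retain more samples, at the cost of a slightly longer sketch.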