Chunk #70 — Methods — Constructing Repository PGIs — Subject-level QC in Repository Cohorts

Source
Resource profile and user guide of the Polygenic Index Repository.

Text

We restricted the samples to individuals of European ancestries, applying exclusion criteria based on the first four principal components (PCs) of the genetic data. To obtain the PCs, for each cohort we first converted the imputed genotype dosages for HapMap3 SNPs into hard calls. We then merged these data with all samples from phase 3 of the 1000 Genomes Project, restricting to SNPs with a call rate greater than 99% and a minor allele frequency greater than 1% in the merged sample. We calculated the PCs in the 1000 Genomes subsample and projected the remaining individuals in the merged data onto them. To select European-ancestry samples, we plotted the first four PCs against each other and visually identified the individuals who clustered together with the 1000 Genomes EUR sample.
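
The filter-then-project procedure described above can be illustrated with a minimal NumPy sketch. This is not the authors' pipeline, and the source does not say which software was used. The function names (`filter_snps`, `standardize`, `fit_reference_pcs`, `project`), the simulated two-population reference panel, and the choice to standardize each SNP by the reference mean and standard deviation are all assumptions made for illustration. Genotypes are assumed to be hard calls coded 0/1/2 in a samples-by-SNPs array, with missing calls stored as NaN.

```python
import numpy as np


def filter_snps(genotypes, min_call_rate=0.99, min_maf=0.01):
    """Boolean mask of SNPs passing the call-rate and MAF thresholds."""
    call_rate = 1.0 - np.isnan(genotypes).mean(axis=0)
    alt_freq = np.nanmean(genotypes, axis=0) / 2.0
    maf = np.minimum(alt_freq, 1.0 - alt_freq)
    return (call_rate > min_call_rate) & (maf > min_maf)


def standardize(genotypes, mean, std):
    """Center and scale each SNP; missing calls become 0 (the SNP mean)."""
    return np.nan_to_num((genotypes - mean) / std)


def fit_reference_pcs(ref_genotypes, n_components=4):
    """Fit PCs on the reference subsample only; return scaling and loadings."""
    mean = np.nanmean(ref_genotypes, axis=0)
    std = np.nanstd(ref_genotypes, axis=0)
    std[std == 0] = 1.0  # guard against monomorphic SNPs
    z = standardize(ref_genotypes, mean, std)
    # Rows of vt are the principal axes (SNP loadings) of the reference data.
    _, _, vt = np.linalg.svd(z, full_matrices=False)
    return mean, std, vt[:n_components].T


def project(genotypes, mean, std, loadings):
    """Project samples onto PCs that were fitted in the reference subsample."""
    return standardize(genotypes, mean, std) @ loadings


# Simulated data (hypothetical): two reference populations and one cohort
# drawn from the first population's allele frequencies.
rng = np.random.default_rng(0)
n_snps = 500
freq_a = rng.uniform(0.1, 0.9, n_snps)
freq_b = np.clip(freq_a + rng.normal(0.0, 0.2, n_snps), 0.05, 0.95)
ref = np.vstack([
    rng.binomial(2, freq_a, (100, n_snps)),
    rng.binomial(2, freq_b, (100, n_snps)),
]).astype(float)
cohort = rng.binomial(2, freq_a, (50, n_snps)).astype(float)

# Apply the SNP filters in the merged sample, as in the text.
merged = np.vstack([ref, cohort])
keep = filter_snps(merged)
ref, cohort = ref[:, keep], cohort[:, keep]

# Fit the PCs in the reference subsample, then project everyone onto them.
mean, std, loadings = fit_reference_pcs(ref, n_components=4)
ref_pcs = project(ref, mean, std, loadings)
cohort_pcs = project(cohort, mean, std, loadings)
```

Plotting pairs of columns of `ref_pcs` and `cohort_pcs` against each other would give the kind of scatter plots from which clusters can be identified by eye. Fitting the loadings on the reference panel alone keeps the PC axes fixed across cohorts, so each cohort is placed in the same coordinate system.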