Chunk #19 — Materials and Methods — Statistical analysis

Source: A genome-wide association study of neuroticism in a population-based sample.
Embedded: yes

Text

To investigate possible population structure or cryptic relatedness among our study subjects, we selected a subset of genotyped SNPs to produce a kinship matrix and perform Principal Component Analysis (PCA). The SNPs were chosen using PLINK [36] to have minor allele fraction >5% and pairwise R² ≤ 0.5, in a window of 50 SNPs sliding 5 SNPs at a time. This process selected 75,046 SNPs for the kinship and PCA analyses. For these analyses, missing genotypes were replaced with the mean genotype score for that SNP. The kinship matrix is K = XXᵀ/(2n), where n is the number of SNPs and X is a matrix whose rows correspond to individuals and whose columns each contain the genotype score for one SNP (i.e. the number of copies of the minor allele: 0, 1 or 2), standardised to mean zero and unit variance. The (i, j) element of K is the excess allele sharing for alleles drawn from individuals i and j, beyond that expected for unrelated individuals given the allele fraction estimates [37]. The eigenvectors of K are the principal components, used
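
The kinship and PCA computation described in this section can be sketched as follows. This is a minimal NumPy illustration, not the authors' code: the function name `kinship_and_pcs`, the use of NaN to mark missing genotypes, and the choice of the population standard deviation (ddof = 0) when standardising are assumptions made for the example. SNP selection in PLINK is taken as already done.

```python
import numpy as np


def kinship_and_pcs(genotypes):
    """Return (K, eigenvalues, eigenvectors) for a genotype matrix.

    genotypes: array of shape (individuals, SNPs) holding minor-allele
    counts (0, 1 or 2), with NaN marking a missing genotype (assumed
    encoding; the source does not specify one).
    """
    G = np.asarray(genotypes, dtype=float)

    # Replace missing genotypes with the mean genotype score for that SNP.
    snp_means = np.nanmean(G, axis=0)
    G = np.where(np.isnan(G), snp_means, G)

    # Standardise each SNP column to mean zero and unit variance.
    # Assumption: population standard deviation (ddof=0) of the imputed column.
    sd = G.std(axis=0)
    if np.any(sd == 0):
        raise ValueError("monomorphic SNP: cannot standardise to unit variance")
    X = (G - G.mean(axis=0)) / sd

    # Kinship matrix K = X X^T / (2n), where n is the number of SNPs.
    n_snps = X.shape[1]
    K = X @ X.T / (2 * n_snps)

    # K is symmetric, so eigh applies; it returns eigenvalues in ascending
    # order, so reorder to put the leading principal components first.
    eigvals, eigvecs = np.linalg.eigh(K)
    order = np.argsort(eigvals)[::-1]
    return K, eigvals[order], eigvecs[:, order]
```

Under these assumptions, each column of the returned eigenvector matrix is one principal component, with one entry per individual. Because every SNP column is centred, each row of K sums to zero, which gives a quick sanity check on the standardisation step.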