Chunk #44 — Methods — Kinship coefficient estimation

Source: The UK Biobank resource with deep phenotyping and genomic data.
Embedded: yes

Text

We used an estimator implemented in the software KING (ref. 41), as it is robust to population structure (that is, does not rely on accurate estimates of population allele frequencies) and it is implemented in an algorithm efficient enough to consider all pairs (~1.2 × 10^11) in a practicable amount of time. As noted by the authors of KING, we found that recent admixture (for example, ‘mixed’ ancestral backgrounds) tended to inflate the estimate of the kinship coefficient, as the estimator assumes Hardy–Weinberg equilibrium among markers with the same underlying allele frequencies within an individual. We alleviated this effect by only using a subset of markers that are only weakly informative of ancestral background (Supplementary Information, Supplementary Fig. 12). We also excluded a small fraction of individuals (977) from the kinship estimation, as they had properties (for example, high missing rates) that would lead to unreliable kinship estimates (Supplementary Information). We called relationship classes for each related pair using the kinship coefficient and fraction of markers for which they share no alleles (IBS0). See Supplementary Information section S3.7 for details.
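
As an illustration of the last step, the sketch below shows one way relationship classes could be called from a kinship coefficient and IBS0. It is not the authors' implementation. The kinship cut-offs are the power-of-two boundaries documented for KING (about 0.354, 0.177, 0.0884 and 0.0442). The function names, the class labels, the IBS0 cut-off `ibs0_po_max` used to separate parent-offspring pairs from full siblings, and the sample count of 488,000 in the usage lines are assumptions made for illustration; the cut-offs actually applied are given in Supplementary Information section S3.7.

```python
def n_pairs(n_samples: int) -> int:
    """Number of unordered sample pairs, n(n-1)/2."""
    return n_samples * (n_samples - 1) // 2


def kinship_lower_bound(degree: int) -> float:
    """Lower kinship bound for a relatedness degree under the KING
    convention: 2^-(degree + 1.5), i.e. 0.354 (duplicate/MZ twin),
    0.177 (1st degree), 0.0884 (2nd degree), 0.0442 (3rd degree)."""
    return 2.0 ** -(degree + 1.5)


def classify_pair(kinship: float, ibs0: float, ibs0_po_max: float = 0.0012) -> str:
    """Assign a relationship class from kinship and IBS0.

    ibs0_po_max is an illustrative (assumed) cut-off: parent-offspring
    pairs always share at least one allele, so their IBS0 is close to 0,
    whereas full siblings show a clearly non-zero IBS0.
    """
    if kinship > kinship_lower_bound(0):
        return "duplicate/MZ twin"
    if kinship > kinship_lower_bound(1):
        return "parent-offspring" if ibs0 < ibs0_po_max else "full sibling"
    if kinship > kinship_lower_bound(2):
        return "2nd degree"
    if kinship > kinship_lower_bound(3):
        return "3rd degree"
    return "unrelated"


if __name__ == "__main__":
    # Assumed sample count, used only to show the scale of the all-pairs problem.
    print(f"{n_pairs(488_000):.2e} pairs")
    print(classify_pair(0.25, 0.0))    # first-degree kinship, no IBS0 sites
    print(classify_pair(0.25, 0.02))   # first-degree kinship, non-zero IBS0
```

With the assumed sample count, `n_pairs` gives a value on the order of 10^11, consistent with the figure quoted above.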