Chunk #3 — Online methods — Genotype quality control

Source
Large-scale association analysis identifies new lung cancer susceptibility loci and heterogeneity in genetic susceptibility across histological subtypes.

Text

Identity by descent (IBD) was calculated between each pair of samples in the data using PLINK to detect unexpected duplicates and relatedness. Details are described in the Supplementary Note. A total of 340 unexpected duplicate samples (proportion IBD > 0.95) were removed, and 940 individuals with proportion IBD between 0.45 and 0.95 were removed as related samples. Of these, 721 were expected first-degree relatives. In total, 0.56% of all samples were removed as unexpected duplicates or relatives in the QC analysis. We additionally considered whether more distant familial relationships could have affected the results. Extending the exclusion threshold to proportion IBD > 0.2 identified 139 second-degree relatives, and excluding these had minimal impact on the association results (Supplementary Note Table 1).
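
The thresholds above can be illustrated with a minimal Python sketch. It is not the authors' pipeline: it assumes that "proportion IBD" corresponds to a per-pair value such as the PI_HAT column of a PLINK `--genome` report, and that the interval boundaries are inclusive at 0.45 and exclusive at 0.95 and 0.2, which the text does not specify. The function names, category labels, and example pairs are hypothetical.

```python
from collections import Counter

# Thresholds taken from the text; boundary inclusivity is an assumption.
DUPLICATE_MIN = 0.95   # proportion IBD > 0.95: unexpected duplicate
RELATED_MIN = 0.45     # 0.45 to 0.95: related sample
DISTANT_MIN = 0.20     # > 0.2: more distant (e.g. second-degree) relative


def classify_pair(pi_hat: float) -> str:
    """Assign a sample pair to a relatedness category from its proportion IBD."""
    if pi_hat > DUPLICATE_MIN:
        return "duplicate"
    if pi_hat >= RELATED_MIN:
        return "related"
    if pi_hat > DISTANT_MIN:
        return "distant"
    return "unrelated"


def summarize(pairs):
    """Count pairs per category; each pair is (sample_a, sample_b, pi_hat)."""
    return Counter(classify_pair(pi_hat) for _, _, pi_hat in pairs)


# Hypothetical pairs for illustration only; these are not values from the study.
example_pairs = [
    ("S1", "S2", 0.99),
    ("S3", "S4", 0.50),
    ("S5", "S6", 0.26),
    ("S7", "S8", 0.01),
]

summary = summarize(example_pairs)
for category in ("duplicate", "related", "distant", "unrelated"):
    print(category, summary[category])
```

In practice the pairs would be read from the pairwise IBD report rather than typed in, and a further step would decide which member of each flagged pair to drop. That choice is not described in this section.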