Chunk #17 — Methods — Within group quality control

Source
Molecular Genetic Influences on Normative and Problematic Alcohol Use in a Population-Based Sample of College Students.

Text

Because the S4S sample is ancestrally diverse, filtering by Hardy-Weinberg equilibrium (HWE), minor allele frequency (MAF), and relatedness was performed within empirically assigned super-populations. The genome-wide proportion of alleles shared identical by descent (IBD; π̂) was calculated using PLINK 1.9. For each sample, the mean π̂ across all other samples was calculated to identify excessive relatedness, in which a sample appears to be a cryptic relative of many other samples even though those samples do not appear related to one another. One hundred and ninety-four samples whose mean π̂ was more than 2.5 standard deviations above the mean were excluded as outliers for average relatedness with all other samples. Clusters of probable relatives were defined using π̂ > 0.1, Z0 ≥ 0.825, and Z1 < 0.175. Including Z0 and Z1 is important because π̂ > 0.1 can arise from artifacts in which Z2 > 0, which is extremely unlikely for cryptic relatives. The best-performing sample in each relative cluster was then retained, which excluded an additional 180 samples from the GWAS sample.
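
The relatedness filtering described above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the authors' pipeline. The function names, the tuple layout of `pairs`, the use of the sample standard deviation, and the `quality` score used to pick the "best performing" sample are all hypothetical, since the text does not specify them. The thresholds and comparison directions are copied as written in the text.

```python
from collections import defaultdict
from statistics import mean, stdev


def excess_relatedness_outliers(pairs, n_sd=2.5):
    """Samples whose mean PI_HAT over all their pairs exceeds mean + n_sd * SD.

    pairs: iterable of (iid1, iid2, pi_hat, z0, z1), one row per sample pair
    (assumed layout; all pairs are assumed to be present).
    """
    totals, counts = defaultdict(float), defaultdict(int)
    for a, b, pi_hat, _z0, _z1 in pairs:
        for sample in (a, b):
            totals[sample] += pi_hat
            counts[sample] += 1
    means = {s: totals[s] / counts[s] for s in totals}
    # Assumption: sample standard deviation; the text does not say which SD was used.
    cutoff = mean(means.values()) + n_sd * stdev(means.values())
    return {s for s, m in means.items() if m > cutoff}


def is_probable_relative(pi_hat, z0, z1):
    """Pair-level rule, with thresholds exactly as stated in the text."""
    return pi_hat > 0.1 and z0 >= 0.825 and z1 < 0.175


def relative_clusters(pairs, is_related=is_probable_relative):
    """Group samples into clusters linked by flagged pairs (union-find)."""
    parent = {}

    def find(x):
        parent.setdefault(x, x)
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving
            x = parent[x]
        return x

    for a, b, pi_hat, z0, z1 in pairs:
        if is_related(pi_hat, z0, z1):
            parent[find(a)] = find(b)

    groups = defaultdict(set)
    for sample in list(parent):
        groups[find(sample)].add(sample)
    return list(groups.values())


def samples_to_drop(clusters, quality):
    """Keep the highest-scoring sample per cluster and drop the rest.

    quality: dict mapping sample ID to a score such as call rate
    (hypothetical; the text only says "best performing").
    """
    drop = set()
    for cluster in clusters:
        best = max(sorted(cluster), key=lambda s: quality[s])
        drop |= cluster - {best}
    return drop
```

In practice `pairs` would be read from the `.genome` file written by `plink --genome`, whose columns include `IID1`, `IID2`, `Z0`, `Z1`, `Z2`, and `PI_HAT`. Following the order in the text, the outlier step would run first and the cluster step would then run on the remaining samples.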