
Chunk #36 — ONLINE METHODS — Quality control

Source
Large-scale genotyping identifies 41 new loci associated with breast cancer risk.

Text

We excluded individuals for any of the following reasons: genotypically not female XX (XY, XXY or XO); overall call rate < 95%; low or high heterozygosity (P < 1 × 10⁻⁶, determined separately for individuals of European, East Asian and African-American ancestry); genotypes discordant with those determined in previous BCAC genotyping such that the individual appeared to be different; genotypes for the duplicate sample that seemed to be from a different individual; and cryptic duplicates where the phenotypic data indicated that the individuals were different. We searched for cryptic duplicates, both within each study and between studies from the same country. For known and cryptic concordant duplicates, the sample with the lower call rate was excluded. We attempted to identify first-degree relative pairs using identity-by-state estimates based on ~37,000 uncorrelated SNPs. For apparent first-degree relative pairs, we removed the control from a case-control pair; otherwise, we excluded the individual with the lower call rate. For the main analyses presented here, we also excluded 1,880 individuals who were included in any of the GWAS to allow the GWAS and iCOGS stages to be combined.
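
The per-sample thresholds and the pair-resolution rules described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the pipeline used in the study: the `Sample` record, its field names, and the three helper functions are hypothetical, the heterozygosity P value is assumed to have been computed beforehand within each ancestry group, and the handling of exactly equal call rates is an arbitrary choice because the text does not specify one.

```python
from dataclasses import dataclass

# Thresholds taken from the text: call rate < 95% and heterozygosity P < 1e-6 are excluded.
MIN_CALL_RATE = 0.95
HET_P_THRESHOLD = 1e-6


@dataclass(frozen=True)
class Sample:
    """Hypothetical per-sample QC record (field names are illustrative)."""
    sample_id: str
    karyotype: str    # inferred sex-chromosome genotype: "XX", "XY", "XXY" or "XO"
    call_rate: float  # fraction of SNPs successfully called, between 0 and 1
    het_p: float      # heterozygosity P value, assumed computed within ancestry group
    is_case: bool     # True for a case, False for a control


def passes_sample_qc(s: Sample) -> bool:
    """Apply the three per-sample filters: sex genotype, call rate, heterozygosity."""
    return (
        s.karyotype == "XX"
        and s.call_rate >= MIN_CALL_RATE
        and s.het_p >= HET_P_THRESHOLD
    )


def drop_from_duplicate_pair(a: Sample, b: Sample) -> Sample:
    """For concordant duplicates, return the sample to exclude (lower call rate).

    Assumption: on an exact tie the second sample is dropped.
    """
    return a if a.call_rate < b.call_rate else b


def drop_from_relative_pair(a: Sample, b: Sample) -> Sample:
    """For an apparent first-degree relative pair, return the sample to exclude.

    In a case-control pair the control is removed; otherwise the sample
    with the lower call rate is removed.
    """
    if a.is_case != b.is_case:
        return b if a.is_case else a
    return drop_from_duplicate_pair(a, b)
```

Note that in a case-control relative pair the control is dropped even when its call rate is higher, which keeps as many cases as possible in the analysis. The remaining criteria (discordance with earlier BCAC genotyping, phenotypically inconsistent duplicates, overlap with the GWAS samples) depend on external data and are not modelled here.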