Chunk #6 — METHODS — Statistical Analyses — Data Cleaning

Source
ANKK1, TTC12, and NCAM1 polymorphisms and heroin dependence: importance of considering drug exposure.

Text

Details of data cleaning have been reported previously [27]. In brief, SNPs were excluded owing to genotyping failure (23 SNPs), call rate less than 95% (9 SNPs), minor allele frequency less than 2% (47 SNPs), and deviation from Hardy-Weinberg equilibrium (27 SNPs); 1430 SNPs were retained for analyses (Supplemental Table 2 shows the complete list). The mean call rate for the 1294 non-opioid receptor SNPs remaining after data cleaning exceeded 99.9%.

Samples of DNA from 1506 cases, 538 neighborhood controls, and 1500 ATR controls were genotyped. Samples were excluded owing to genotyping failure (1 ATR control), phenotypic-genotypic gender mismatch (1 case; 2 neighborhood controls), duplication due to participation in the project multiple times (29 cases; 3 neighborhood controls; phenotypic data from the most recent non-pilot study interview were retained), and cryptic relatedness with identity by descent greater than 0.5 (17 cases; 4 neighborhood controls; 4 ATR controls; individuals with the higher project identifier were excluded). The sample used for analyses consisted of 1459 cases, 531 neighborhood controls, and 1495 ATR controls.
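
The SNP-level exclusion criteria described above can be sketched in Python. This is a minimal illustration under stated assumptions, not the authors' pipeline: the `SnpCounts` container, the function names, the order in which the filters are applied, and the Hardy-Weinberg significance threshold (`hwe_alpha`) are all hypothetical, since the text gives only the call-rate (95%) and minor-allele-frequency (2%) cutoffs. The Hardy-Weinberg check here is a one-degree-of-freedom chi-square goodness-of-fit test, which may differ from the test actually used.

```python
import math
from dataclasses import dataclass
from typing import Optional


@dataclass
class SnpCounts:
    """Genotype counts for one biallelic SNP (hypothetical container)."""
    n_aa: int       # samples homozygous for allele A
    n_ab: int       # heterozygous samples
    n_bb: int       # samples homozygous for allele B
    n_missing: int  # samples with no genotype call


def _called(s: SnpCounts) -> int:
    return s.n_aa + s.n_ab + s.n_bb


def call_rate(s: SnpCounts) -> float:
    """Fraction of samples with a genotype call."""
    total = _called(s) + s.n_missing
    return _called(s) / total if total else 0.0


def minor_allele_freq(s: SnpCounts) -> float:
    """Frequency of the less common allele among called genotypes."""
    n = _called(s)
    if n == 0:
        return 0.0
    p = (2 * s.n_aa + s.n_ab) / (2 * n)
    return min(p, 1.0 - p)


def hwe_p_value(s: SnpCounts) -> float:
    """Chi-square (1 df) goodness-of-fit p-value for Hardy-Weinberg proportions."""
    n = _called(s)
    if n == 0:
        return 1.0
    p = (2 * s.n_aa + s.n_ab) / (2 * n)
    q = 1.0 - p
    if p == 0.0 or q == 0.0:
        return 1.0  # monomorphic SNP: nothing to test
    expected = (n * p * p, 2 * n * p * q, n * q * q)
    observed = (s.n_aa, s.n_ab, s.n_bb)
    chi2 = sum((o - e) ** 2 / e for o, e in zip(observed, expected))
    # Survival function of a chi-square variable with 1 degree of freedom.
    return math.erfc(math.sqrt(chi2 / 2.0))


def exclusion_reason(
    s: SnpCounts,
    min_call_rate: float = 0.95,  # from the text: call rate less than 95%
    min_maf: float = 0.02,        # from the text: MAF less than 2%
    hwe_alpha: float = 1e-4,      # ASSUMPTION: threshold not stated in the text
) -> Optional[str]:
    """Return the first failed criterion, or None if the SNP is retained."""
    if _called(s) == 0:
        return "genotyping failure"
    if call_rate(s) < min_call_rate:
        return "call rate"
    if minor_allele_freq(s) < min_maf:
        return "minor allele frequency"
    if hwe_p_value(s) < hwe_alpha:
        return "Hardy-Weinberg equilibrium"
    return None
```

For instance, `exclusion_reason(SnpCounts(25, 50, 25, 0))` returns `None`, because the genotype counts match Hardy-Weinberg proportions exactly, whereas `SnpCounts(50, 0, 50, 0)` has no heterozygotes at an allele frequency of 0.5 and is flagged for Hardy-Weinberg deviation.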