Appropriate data cleaning methods for genome-wide association study.
- Authors: Miyagawa, Taku; Nishida, Nao; Ohashi, Jun; Kimura, Ryosuke; Fujimoto, Akihiro; Kawashima, Minae; Koike, Asako; Sasaki, Tsukasa; Tanii, Hisashi; Otowa, Takeshi; Momose, Yoshio; Nakahara, Yasuo; Gotoh, Jun; Okazaki, Yuji; Tsuji, Shoji; Tokunaga, Katsushi
- Year: 2008
- Journal: Journal of Human Genetics
- PMID: 18695938
- DOI: 10.1007/s10038-008-0322-y
Genome-wide association studies (GWAS) using a large number of single nucleotide polymorphisms (SNPs) have successfully been applied to identify genetic variants of common diseases. However, genotyping using the new array technologies is often associated with spurious results that could unfavorably affect analyses of GWAS. Consequently, data cleaning is of paramount importance in excluding spurious genotyping results. In this study, we investigated the criteria required for the appropriate cleaning of 389 unrelated healthy Japanese samples analyzed using the GeneChip Human Mapping 500K Array Set for GWAS. The samples were randomly subdivided into two groups, and the allele frequencies in the groups were compared for individual SNPs as a quasi-case-control study. Then, observed results were filtered by four parameters (SNP call rate, confidence score obtained using the Bayesian Robust Linear Model with Mahalanobis genotype-calling algorithm, Hardy-Weinberg equilibrium, and minor allele frequency) and assessed for deviation from the null hypothesis. We found that appropriate data cleaning could be achieved using these four parameters. Our findings offer an avenue for obtaining appropriate data from GWAS.
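As a rough illustration of per-SNP filtering on three of the four parameters named in the abstract (SNP call rate, Hardy-Weinberg equilibrium, and minor allele frequency), a minimal Python sketch follows. The `SnpCounts` container, the function names, and all thresholds are assumptions made for this example; the abstract does not state the cut-offs the authors adopted. The BRLMM confidence score is omitted because it is applied to individual genotype calls by the genotyping software before counts like these are tallied. The Hardy-Weinberg check here is a simple 1-df chi-square goodness-of-fit test, which may differ from the test used in the study.

```python
import math
from dataclasses import dataclass


@dataclass
class SnpCounts:
    """Genotype counts for one biallelic SNP (hypothetical container)."""
    n_aa: int       # samples homozygous for allele A
    n_ab: int       # heterozygous samples
    n_bb: int       # samples homozygous for allele B
    n_missing: int  # samples with no genotype call


def call_rate(s: SnpCounts) -> float:
    """Fraction of samples with a genotype call at this SNP."""
    called = s.n_aa + s.n_ab + s.n_bb
    total = called + s.n_missing
    return called / total if total else 0.0


def minor_allele_freq(s: SnpCounts) -> float:
    """Frequency of the rarer allele among called genotypes."""
    called = s.n_aa + s.n_ab + s.n_bb
    if called == 0:
        return 0.0
    p = (2 * s.n_aa + s.n_ab) / (2 * called)
    return min(p, 1.0 - p)


def hwe_p_value(s: SnpCounts) -> float:
    """P-value of a 1-df chi-square goodness-of-fit test for HWE."""
    n = s.n_aa + s.n_ab + s.n_bb
    if n == 0:
        return 1.0
    p = (2 * s.n_aa + s.n_ab) / (2 * n)
    q = 1.0 - p
    expected = (n * p * p, 2 * n * p * q, n * q * q)
    observed = (s.n_aa, s.n_ab, s.n_bb)
    if min(expected) == 0:
        return 1.0  # monomorphic SNP: the test is not informative
    chi2 = sum((o - e) ** 2 / e for o, e in zip(observed, expected))
    # Survival function of the chi-square distribution with 1 df.
    return math.erfc(math.sqrt(chi2 / 2.0))


def passes_qc(s: SnpCounts,
              min_call_rate: float = 0.95,  # assumed threshold
              min_maf: float = 0.05,        # assumed threshold
              min_hwe_p: float = 0.001      # assumed threshold
              ) -> bool:
    """Keep a SNP only if it clears all three filters."""
    return (call_rate(s) >= min_call_rate
            and minor_allele_freq(s) >= min_maf
            and hwe_p_value(s) >= min_hwe_p)
```

In a quasi-case-control design like the one described, such a filter would be applied to every SNP before checking whether the remaining test statistics still deviate from the null hypothesis; stricter or looser thresholds can be tried by changing the keyword arguments.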
External
| Title | Authors | Journal | Year | Link |
|---|---|---|---|---|
| Population Structure of Modern Winter Wheat Accessions from Central Asia. | Amalova A et al. | - | 2023 | - |
| Democratizing clinical-genomic data: How federated platforms can promote benefits sharing in genomics. | Alvarellos M et al. | - | 2022 | - |
| Genome-wide association study of idiopathic hypersomnia in a Japanese population. | Tanida K et al. | - | 2022 | - |
| Understanding Mendelian errors in SNP arrays data using a Gochu Asturcelta pig pedigree: genomic alterations, family size and calling errors. | Arias KD et al. | - | 2022 | - |
| Unraveling Genomic Regions Controlling Root Traits as a Function of Nitrogen Availability in the MAGIC Wheat Population WM-800. | Schmidt L et al. | - | 2022 | - |
| Evaluating Polygenic Risk Scores in "Lone" Atrial Fibrillation. | Lazarte J et al. | - | 2021 | - |
| Genome-wide association study of yield components in spring wheat collection harvested under two water regimes in Northern Kazakhstan. | Amalova A et al. | - | 2021 | - |
| Genome-wide association mapping for resistance to leaf, stem, and yellow rusts of common wheat under field conditions of South Kazakhstan. | Genievskaya Y et al. | - | 2020 | - |
| Genome-Wide Association Study of Metamizole-Induced Agranulocytosis in European Populations. | Cismaru AL et al. | - | 2020 | - |
| Identification of hybridization and introgression between Cinnamomum kanehirae Hayata and C. camphora (L.) Presl using genotyping-by-sequencing. | Wu CC et al. | - | 2020 | - |
| Evaluation of genotyping concordance for commercial bovine SNP arrays using quality-assurance samples. | Wu XL et al. | - | 2019 | - |
| Population Genetics Revealed a New Locus That Underwent Positive Selection in Barley. | Reinert S et al. | - | 2019 | - |
| Adaptive selection of founder segments and epistatic control of plant height in the MAGIC winter wheat population WM-800. | Sannemann W et al. | - | 2018 | - |
| Marker-trait associations in two-rowed spring barley accessions from Kazakhstan and the USA. | Genievskaya Y et al. | - | 2018 | - |
| Genome-wide association studies identify PRKCB as a novel genetic susceptibility locus for primary biliary cholangitis in the Japanese population. | Kawashima M et al. | - | 2017 | - |
| Application of Machine Learning Techniques to High-Dimensional Clinical Data to Forecast Postoperative Complications. | Thottakkara P et al. | - | 2016 | - |
| Genome-Wide Association Mapping in the Global Diversity Set Reveals New QTL Controlling Root System and Related Shoot Variation in Barley. | Reinert S et al. | - | 2016 | - |
| Understanding of HLA-conferred susceptibility to chronic hepatitis B infection requires HLA genotyping-based association analysis. | Nishida N et al. | - | 2016 | - |
| Genome-wide association study for crown rust (Puccinia coronata f. sp. avenae) and powdery mildew (Blumeria graminis f. sp. avenae) resistance in an oat (Avena sativa) collection of commercial varieties and landraces. | Montilla-Bascón G et al. | - | 2015 | - |
| Genome-wide association study confirming association of HLA-DP with protection against chronic hepatitis B and viral clearance in Japanese and Korean. | Nishida N et al. | - | 2012 | - |
| Genome-wide association study for oat (Avena sativa L.) beta-glucan concentration using germplasm of worldwide origin. | Newell MA et al. | - | 2012 | - |
| Genome-wide association study identifies TNFSF15 and POU2AF1 as susceptibility loci for primary biliary cirrhosis in the Japanese population. | Nakamura M et al. | - | 2012 | - |
| IPGWAS: an integrated pipeline for rational quality control and association analysis of genome-wide genetic studies. | Fan YH et al. | - | 2012 | - |
| A genome-wide CNV association study on panic disorder in a Japanese population. | Kawamura Y et al. | - | 2011 | - |
| Quality control procedures for genome-wide association studies. | Turner S et al. | - | 2011 | - |
| A quality control algorithm for filtering SNPs in genome-wide association studies. | Pongpanich M et al. | - | 2010 | - |
| Batch effects in the BRLMM genotype calling algorithm influence GWAS results for the Affymetrix 500K array. | Miclaus K et al. | - | 2010 | - |
| Quality control and quality assurance in genotypic data for genome-wide association studies. | Laurie CC et al. | - | 2010 | - |
| Replication of a genome-wide association study of panic disorder in a Japanese population. | Otowa T et al. | - | 2010 | - |
| Statistical genetic issues for genome-wide association studies. | Weir BS | - | 2010 | - |
| The Gene, Environment Association Studies consortium (GENEVA): maximizing the knowledge obtained from GWAS by collaboration across studies of multiple conditions. | Cornelis MC et al. | - | 2010 | - |
| Genetic variations as cancer prognostic markers: review and update. | Savas S et al. | - | 2009 | - |
| Genome-wide association of IL28B with response to pegylated interferon-alpha and ribavirin therapy for chronic hepatitis C. | Tanaka Y et al. | - | 2009 | - |
| Genome-wide association study of panic disorder in the Japanese population. | Otowa T et al. | - | 2009 | - |
| Evaluating the performance of Affymetrix SNP Array 6.0 platform with 400 Japanese individuals. | Nishida N et al. | - | 2008 | - |