Detection of structural DNA variation from next generation sequencing data: a review of informatic approaches.
- Authors: Abel, Haley J; Duncavage, Eric J
- Year: 2013
- Journal: Cancer genetics
- PMID: 24405614
- DOI: 10.1016/j.cancergen.2013.11.002
- PMCID: PMC4441822
Next generation sequencing (NGS), or massively parallel sequencing, refers to a group of methods in which numerous sequencing reactions take place simultaneously, resulting in enormous amounts of sequencing data for a small fraction of the cost of Sanger sequencing. Typically short (50-250 bp), NGS reads are first mapped to a reference genome, and then variants are called from the mapped data. While most NGS applications focus on the detection of single nucleotide variants (SNVs) or small insertions/deletions (indels), structural variation, including translocations, larger indels, and copy number variation (CNV), can be identified from the same data. Structural variation detection can be performed from whole genome NGS data or "targeted" data including exomes or gene panels. However, while targeted sequencing greatly increases sequencing coverage or depth of particular genes, it may introduce biases in the data that require specialized informatic analyses. In the past several years, there have been considerable advances in methods used to detect structural variation, and a full range of variants from SNVs to balanced translocations to CNV can now be detected with reasonable sensitivity from either whole genome or targeted NGS data. Such methods are being rapidly applied to clinical testing, where they can supplement or in some cases replace conventional fluorescence in situ hybridization or array-based testing. Here we review some of the informatics approaches used to detect structural variation from NGS data.
Identification of translocations from discordant paired-end reads. (A) In this example, a t(4;11) translocation is identified by discordant paired-end reads. Read pairs are first identified in which one end maps to the targeted region (in this case the MLL gene on 11q23) and the other end maps to a different chromosome. (B) Discordant paired-end read methods are subject to high false-positive rates due to sequence-mapping errors and repeat regions in the genome. Most translocation identification software employs filtering criteria to reduce the number of false-positive calls.
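The discordant-pair idea in this caption can be illustrated with a minimal Python sketch. It is not the method of any particular tool: the `ReadPair` record, the `candidate_translocations` function, the mapping-quality and support thresholds, the 1 kb binning of partner positions, and all coordinates are hypothetical choices made for illustration only.

```python
from collections import Counter
from typing import NamedTuple


class ReadPair(NamedTuple):
    """Hypothetical minimal record for one mapped read pair."""
    chrom1: str
    pos1: int
    mapq1: int
    chrom2: str
    pos2: int
    mapq2: int


def candidate_translocations(pairs, target, min_mapq=30, min_support=3, bin_size=1000):
    """Count discordant pairs with one end in `target` and the mate on another chromosome.

    `target` is (chrom, start, end). Partner positions are grouped into
    fixed-size bins; bins with fewer than `min_support` pairs are dropped.
    The thresholds are illustrative assumptions, not recommended values.
    """
    t_chrom, t_start, t_end = target
    support = Counter()
    for p in pairs:
        # Filter 1: discard pairs where either end is poorly mapped.
        if min(p.mapq1, p.mapq2) < min_mapq:
            continue
        ends = ((p.chrom1, p.pos1), (p.chrom2, p.pos2))
        # Check both orientations, since either end may be the one in the target.
        for (a_chrom, a_pos), (b_chrom, b_pos) in (ends, ends[::-1]):
            in_target = a_chrom == t_chrom and t_start <= a_pos <= t_end
            if in_target and b_chrom != t_chrom:
                support[(b_chrom, b_pos // bin_size)] += 1
    # Filter 2: require several independent pairs pointing at the same partner bin.
    return {key: n for key, n in support.items() if n >= min_support}


# Made-up coordinates standing in for a targeted region on chromosome 11.
target_region = ("chr11", 118_300_000, 118_400_000)
example_pairs = [
    ReadPair("chr11", 118_350_100, 60, "chr4", 88_000_100, 60),
    ReadPair("chr11", 118_350_220, 60, "chr4", 88_000_250, 55),
    ReadPair("chr4", 88_000_400, 60, "chr11", 118_350_310, 60),   # ends in reverse order
    ReadPair("chr11", 118_350_150, 60, "chr4", 88_000_300, 5),    # low mapping quality
    ReadPair("chr11", 118_350_400, 60, "chr9", 5_000_000, 60),    # single unsupported pair
    ReadPair("chr11", 118_350_000, 60, "chr11", 118_350_300, 60), # concordant pair
]
calls = candidate_translocations(example_pairs, target_region)
print(calls)
```

In this toy input only the chromosome 4 partner bin survives both filters; the poorly mapped pair and the lone chromosome 9 pair are the kinds of events that filtering is meant to suppress.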
CNV by DOC analysis. In this example, CNV is called by first obtaining the DOC for every position in the targeted sequencing region. Next, DOC data must be normalized, which can be accomplished by a number of approaches, including comparing to paired normal samples (in the case of cancer), pooled normal controls, or the mean sample coverage. Once coverage is normalized, regions of constant CNV are identified, and CNV calls are then made using a variety of probabilistic models.
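The depth-of-coverage workflow in this caption (obtain depth, normalize, find regions of constant copy number, call) can be sketched as follows. This is a simplified illustration under stated assumptions: normalization uses the mean sample coverage and a matched normal, and fixed log2-ratio cutoffs stand in for the probabilistic calling models the caption mentions. The function names, the pseudocount, and the ±0.3 thresholds are hypothetical.

```python
import math
from itertools import groupby


def log2_ratios(tumor_depth, normal_depth, pseudocount=0.01):
    """Normalize each sample by its own mean depth, then compare tumor to normal."""
    if len(tumor_depth) != len(normal_depth):
        raise ValueError("depth vectors must cover the same positions")
    t_mean = sum(tumor_depth) / len(tumor_depth)
    n_mean = sum(normal_depth) / len(normal_depth)
    # The pseudocount avoids log(0) and division by zero at uncovered positions.
    return [
        math.log2((t / t_mean + pseudocount) / (n / n_mean + pseudocount))
        for t, n in zip(tumor_depth, normal_depth)
    ]


def segment_calls(ratios, gain=0.3, loss=-0.3):
    """Label each position, then merge adjacent positions with the same label.

    Returns (first_index, last_index, state) tuples. Simple thresholds are an
    assumption here, used in place of a probabilistic model.
    """
    states = ["gain" if r > gain else "loss" if r < loss else "neutral" for r in ratios]
    segments, start = [], 0
    for state, run in groupby(states):
        length = len(list(run))
        segments.append((start, start + length - 1, state))
        start += length
    return segments


# Toy depths per position (or per window): a duplicated and a deleted stretch.
normal = [100] * 24
tumor = [100] * 10 + [200] * 2 + [100] * 10 + [50] * 2
segments = segment_calls(log2_ratios(tumor, normal))
print(segments)
```

Note that normalizing by mean sample coverage shifts the baseline when a large fraction of the target is altered, which is one reason the caption lists paired or pooled normal controls as alternatives.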
Methods for indel detection. (A) In this example, a small insertion is identified by alignment-based calling methods. Such insertions are generally identified during initial read mapping and alignment and are evaluated by indel detection programs using different models to exclude false-positive results due to sequencing or read mapping errors. (B) A medium-sized insertion is identified by split read mapping methods. In this example, an insertion (red) present in the sequenced DNA is detected by first identifying paired-end reads in which one end maps and the other (containing the inserted sequence) does not. The inserted sequence is reconstructed from the overlapping, unmapped single-end reads. (C) An insertion detected by paired-end methods. In this example, the sequenced DNA contains an insertion and read pairs mapping to the flanking normal reference sequence show a shorter than expected distance between ends, allowing for an insertion to be inferred.
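Panel (C) can be made concrete with a short sketch: if read pairs spanning a locus map closer together on the reference than the library's fragment size predicts, the sequenced DNA likely contains extra (inserted) sequence between them. The function name, the z-score test on the mean mapped span, and the cutoff are illustrative assumptions rather than the model used by any specific program.

```python
import math
from statistics import mean


def infer_indel_from_spans(mapped_spans, lib_mean, lib_sd, z_cutoff=3.0):
    """Infer an indel from the mapped distance between paired ends.

    `mapped_spans` are outer distances (bp) between the two ends of pairs
    spanning one locus; `lib_mean` and `lib_sd` describe the expected
    fragment size distribution of the library. Returns (call, estimated_size).
    """
    n = len(mapped_spans)
    if n == 0:
        return ("none", 0)
    observed = mean(mapped_spans)
    # z-score of the observed mean span against the library expectation.
    z = (observed - lib_mean) / (lib_sd / math.sqrt(n))
    if z <= -z_cutoff:
        # Ends map closer together than expected: sequence was inserted in the sample.
        return ("insertion", round(lib_mean - observed))
    if z >= z_cutoff:
        # Ends map farther apart than expected: sequence was deleted in the sample.
        return ("deletion", round(observed - lib_mean))
    return ("none", 0)


short_call = infer_indel_from_spans([240, 250, 260, 245, 255], lib_mean=300, lib_sd=20)
normal_call = infer_indel_from_spans([300, 310, 290, 305, 295], lib_mean=300, lib_sd=20)
long_call = infer_indel_from_spans([400, 410, 390, 405, 395], lib_mean=300, lib_sd=20)
print(short_call, normal_call, long_call)
```

A limitation of this approach is that an insertion longer than the fragment size cannot be spanned by a single pair, which is where the split read approach of panel (B) becomes necessary.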