ARDC Research Link Australia

Publication

Fast nanopore sequencing data analysis with SLOW5

Publisher: Springer Science and Business Media LLC

Date: 03-01-2022

Abstract: Nanopore sequencing depends on the FAST5 file format, which does not allow efficient parallel analysis. Here we introduce SLOW5, an alternative format engineered for efficient parallelization and acceleration of nanopore data analysis. Using the example of DNA methylation profiling of a human genome, analysis runtime is reduced from more than two weeks to approximately 10.5 h on a typical high-performance computer. SLOW5 is approximately 25% smaller than FAST5 and delivers consistent improvements on different computer architectures.

Publication

SquiggleKit: a toolkit for manipulating nanopore signal data

Publisher: Oxford University Press (OUP)

Date: 23-07-2019

DOI: 10.1093/BIOINFORMATICS/BTZ586

Abstract: The management of raw nanopore sequencing data poses a challenge that must be overcome to facilitate the creation of new bioinformatics algorithms predicated on signal analysis. SquiggleKit is a toolkit for manipulating and interrogating nanopore data that simplifies file handling, data extraction, visualization and signal processing. SquiggleKit is cross platform and freely available from GitHub at (github.com/Psy-Fer/SquiggleKit). Detailed documentation can be found at (psy-fer.github.io/SquiggleKitDocs/). All tools have been designed to operate in python 2.7+, with minimal additional libraries. Supplementary data are available at Bioinformatics online.

Publication

InterARTIC: an interactive web application for whole-genome nanopore sequencing analysis of SARS-CoV-2 and other viruses

Publisher: Oxford University Press (OUP)

Date: 15-12-2021

DOI: 10.1093/BIOINFORMATICS/BTAB846

Abstract: InterARTIC is an interactive web application for the analysis of viral whole-genome sequencing (WGS) data generated on Oxford Nanopore Technologies (ONT) devices. A graphical interface enables users with no bioinformatics expertise to analyze WGS experiments and reconstruct consensus genome sequences from individual isolates of viruses, such as SARS-CoV-2. InterARTIC is intended to facilitate widespread adoption and standardization of ONT sequencing for viral surveillance and molecular epidemiology. We demonstrate the use of InterARTIC for the analysis of ONT viral WGS data from SARS-CoV-2 and Ebola virus, using a laptop computer or the internal computer on an ONT GridION sequencing device. We showcase the intuitive graphical interface, workflow customization capabilities and job-scheduling system that facilitate execution of small- and large-scale WGS projects on any common virus. InterARTIC is a free, open-source web application implemented in Python that executes best-practice command line workflows from the ARTIC network. The application can be downloaded as a set of pre-compiled binaries that are compatible with all common Linux distributions, Windows with Linux subsystems, MacOSX and ARM systems. All code can be found on GitHub at github.com/Psy-Fer/interARTIC/ and documentation can be found at github.com/Psy-Fer/interARTIC/. Supplementary data are available at Bioinformatics online.

Publication

Barcoding and demultiplexing Oxford Nanopore native RNA sequencing reads with deep residual learning

Publisher: Cold Spring Harbor Laboratory

Date: 04-12-2019

DOI: 10.1101/864322

Abstract: Nanopore sequencing has enabled sequencing of native RNA molecules without conversion to cDNA, thus opening the gates to a new era for the unbiased study of RNA biology. However, a formal barcoding protocol for direct sequencing of native RNA molecules is currently lacking, limiting the efficient processing of multiple samples in the same flowcell. A major limitation for the development of barcoding protocols for direct RNA sequencing is the error rate introduced during the base-calling process, especially towards the 5’ and 3’ ends of reads, which complicates sequence-based barcode demultiplexing. Here, we propose a novel strategy to barcode and demultiplex direct RNA sequencing nanopore data, which does not rely on base-calling or additional library preparation steps. Specifically, custom DNA oligonucleotides are ligated to RNA transcripts during library preparation. Then, raw current signal corresponding to the DNA barcode is extracted and transformed into an array of pixels, which is used to determine the underlying barcode using a deep convolutional neural network classifier. Our method, DeePlexiCon, implements a 20-layer residual neural network model that can demultiplex 93% of the reads with 95.1% specificity, or 60% of reads with 99.9% specificity. The availability of an efficient and simple barcoding strategy for native RNA sequencing will enhance the use of direct RNA sequencing by making it more cost-effective to the entire community. Moreover, it will facilitate the applicability of direct RNA sequencing to samples where the RNA amounts are limited, such as patient-derived samples.

Publication

Chromosome-length genome assembly and structural variations of the primal Basenji dog (Canis lupus familiaris) genome

Publisher: Research Square Platform LLC

Date: 28-12-2020

DOI: 10.21203/RS.3.RS-135125/V1

Abstract: Background: Basenjis are considered an ancient dog breed of central African origins that still live and hunt with tribesmen in the African Congo. Nicknamed the barkless dog, Basenjis possess unique phylogeny, geographical origins and traits, making their genome structure of great interest. The increasing number of available canid reference genomes allows us to examine the impact the choice of reference genome makes with regard to reference genome quality and breed relatedness. Results: Here, we report two high quality de novo Basenji genome assemblies: a female, China (CanFam_Bas), and a male, Wags. We conduct pairwise comparisons and report structural variations between assembled genomes of three dog breeds: Basenji (CanFam_Bas), Boxer (CanFam3.1) and German Shepherd Dog (GSD) (CanFam_GSD). CanFam_Bas is superior to CanFam3.1 in terms of genome contiguity and comparable overall to the high quality CanFam_GSD assembly. By aligning short read data from 58 representative dog breeds to three reference genomes, we demonstrate how the choice of reference genome significantly impacts both read mapping and variant detection. Conclusions: The growing number of high-quality canid reference genomes means the choice of reference genome is an increasingly critical decision in subsequent canid variant analyses. The basal position of the Basenji makes it suitable for variant analysis for targeted applications of specific dog breeds. However, we believe more comprehensive analyses across the entire family of canids is more suited to a pangenome approach. Collectively this work highlights the importance the choice of reference genome makes in all variation studies.

Publication

Chromosome-length genome assembly and structural variations of the primal Basenji dog (Canis lupus familiaris) genome

Publisher: Cold Spring Harbor Laboratory

Date: 11-11-2020

DOI: 10.1101/2020.11.11.379073

Abstract: Basenjis are considered an ancient dog breed of central African origins that still live and hunt with tribesmen in the African Congo. Nicknamed the barkless dog, Basenjis possess unique phylogeny, geographical origins and traits, making their genome structure of great interest. The increasing number of available canid reference genomes allows us to examine the impact the choice of reference genome makes with regard to reference genome quality and breed relatedness. Here, we report two high quality de novo Basenji genome assemblies: a female, China (CanFam_Bas), and a male, Wags. We conduct pairwise comparisons and report structural variations between assembled genomes of three dog breeds: Basenji (CanFam_Bas), Boxer (CanFam3.1) and German Shepherd Dog (GSD) (CanFam_GSD). CanFam_Bas is superior to CanFam3.1 in terms of genome contiguity and comparable overall to the high quality CanFam_GSD assembly. By aligning short read data from 58 representative dog breeds to three reference genomes, we demonstrate how the choice of reference genome significantly impacts both read mapping and variant detection. The growing number of high-quality canid reference genomes means the choice of reference genome is an increasingly critical decision in subsequent canid variant analyses. The basal position of the Basenji makes it suitable for variant analysis for targeted applications of specific dog breeds. However, we believe more comprehensive analyses across the entire family of canids is more suited to a pangenome approach. Collectively this work highlights the importance the choice of reference genome makes in all variation studies.

Publication

High-throughput targeted long-read single cell sequencing reveals the clonal and transcriptional landscape of lymphocytes

Publisher: Cold Spring Harbor Laboratory

Date: 24-09-2018

DOI: 10.1101/424945

Abstract: High-throughput single-cell RNA-Sequencing is a powerful technique for gene expression profiling of complex and heterogeneous cellular populations such as the immune system. However, these methods only provide short-read sequence from one end of a cDNA template, making them poorly suited to the investigation of gene-regulatory events such as mRNA splicing, adaptive immune responses or somatic genome evolution. To address this challenge, we have developed a method that combines targeted long-read sequencing with short-read based transcriptome profiling of barcoded single cell libraries generated by droplet-based partitioning. We use Repertoire And Gene Expression sequencing (RAGE-seq) to accurately characterize full-length T cell (TCR) and B cell (BCR) receptor sequences and transcriptional profiles of more than 7,138 lymphocytes sampled from the primary tumour and draining lymph node of a breast cancer patient. With this method we show that somatic mutation, alternate splicing and clonal evolution of T and B lymphocytes can be tracked across these tissue compartments. Our results demonstrate that RAGE-Seq is an accessible and cost-effective method for high-throughput deep single cell profiling, applicable to a wide range of biological challenges.

Publication

Genopo: a nanopore sequencing analysis toolkit for portable Android devices

Publisher: Springer Science and Business Media LLC

Date: 29-09-2020

DOI: 10.1038/S42003-020-01270-Z

Abstract: The advent of portable nanopore sequencing devices has enabled DNA and RNA sequencing to be performed in the field or the clinic. However, advances in in situ genomics require parallel development of portable, offline solutions for the computational analysis of sequencing data. Here we introduce Genopo, a mobile toolkit for nanopore sequencing analysis. Genopo compacts popular bioinformatics tools to an Android application, enabling fully portable computation. To demonstrate its utility for in situ genome analysis, we use Genopo to determine the complete genome sequence of the human coronavirus SARS-CoV-2 in nine patient isolates sequenced on a nanopore device, with Genopo executing this workflow in less than 30 min per sample on a range of popular smartphones. We further show how Genopo can be used to profile DNA methylation in a human genome sample, illustrating a flexible, efficient architecture that is suitable to run many popular bioinformatics tools and accommodate small or large genomes. As the first ever smartphone application for nanopore sequencing analysis, Genopo enables the genomics community to harness this cheap, ubiquitous computational resource.

Publication

Accelerated nanopore basecalling with SLOW5 data format

Publisher: Oxford University Press (OUP)

Date: 30-05-2023

DOI: 10.1093/BIOINFORMATICS/BTAD352

Abstract: Nanopore sequencing is emerging as a key pillar in the genomic technology landscape but computational constraints limiting its scalability remain to be overcome. The translation of raw current signal data into DNA or RNA sequence reads, known as ‘basecalling’, is a major friction in any nanopore sequencing workflow. Here, we exploit the advantages of the recently developed signal data format ‘SLOW5’ to streamline and accelerate nanopore basecalling on high-performance computing (HPC) and cloud environments. SLOW5 permits highly efficient sequential data access, eliminating a potential analysis bottleneck. To take advantage of this, we introduce Buttery-eel, an open-source wrapper for Oxford Nanopore’s Guppy basecaller that enables SLOW5 data access, resulting in performance improvements that are essential for scalable, affordable basecalling. Buttery-eel is available at github.com/Psy-Fer/buttery-eel.

Publication

InterARTIC: an interactive web application for whole-genome nanopore sequencing analysis of SARS-CoV-2 and other viruses

Publisher: Cold Spring Harbor Laboratory

Date: 22-04-2021

DOI: 10.1101/2021.04.21.440861

Abstract: InterARTIC is an interactive web application for the analysis of viral whole-genome sequencing (WGS) data generated on Oxford Nanopore Technologies (ONT) devices. A graphical interface enables users with no bioinformatics expertise to analyse WGS experiments and reconstruct consensus genome sequences from individual isolates of viruses, such as SARS-CoV-2. InterARTIC is intended to facilitate widespread adoption and standardisation of ONT sequencing for viral surveillance and molecular epidemiology. We demonstrate the use of InterARTIC for the analysis of ONT viral WGS data from SARS-CoV-2 and Ebola virus, using a laptop computer or the internal computer on an ONT GridION sequencing device. We showcase the intuitive graphical interface, workflow customisation capabilities and job-scheduling system that facilitate execution of small- and large-scale WGS projects on any common virus. InterARTIC is a free, open-source web application implemented in Python. The application can be downloaded as a set of pre-compiled binaries that are compatible with all common Ubuntu distributions, or built from source. For further details please visit: github.com/Psy-Fer/interARTIC/.

Publication

Analytical validity of nanopore sequencing for rapid SARS-CoV-2 genome analysis

Publisher: Cold Spring Harbor Laboratory

Date: 04-08-2020

DOI: 10.1101/2020.08.04.236893

Abstract: Viral whole-genome sequencing (WGS) provides critical insight into the transmission and evolution of Severe Acute Respiratory Syndrome Coronavirus 2 (SARS-CoV-2). Long-read sequencing devices from Oxford Nanopore Technologies (ONT) promise significant improvements in turnaround time, portability and cost, compared to established short-read sequencing platforms for viral WGS (e.g., Illumina). However, adoption of ONT sequencing for SARS-CoV-2 surveillance has been limited due to common concerns around sequencing accuracy. To address this, we performed viral WGS with ONT and Illumina platforms on 157 matched SARS-CoV-2-positive patient specimens and synthetic RNA controls, enabling rigorous evaluation of analytical performance. Despite the elevated error rates observed in ONT sequencing reads, highly accurate consensus-level sequence determination was achieved, with single nucleotide variants (SNVs) detected at % sensitivity and % precision above a minimum ~60-fold coverage depth, thereby ensuring suitability for SARS-CoV-2 genome analysis. ONT sequencing also identified a surprising diversity of structural variation within SARS-CoV-2 specimens that were supported by evidence from short-read sequencing on matched samples. However, ONT sequencing failed to accurately detect short indels and variants at low read-count frequencies. This systematic evaluation of analytical performance for SARS-CoV-2 WGS will facilitate widespread adoption of ONT sequencing within local, national and international COVID-19 public health initiatives.

Publication

High-throughput targeted long-read single cell sequencing reveals the clonal and transcriptional landscape of lymphocytes

Publisher: Springer Science and Business Media LLC

Date: 17-02-2019

DOI: 10.1038/S41467-019-11049-4

Abstract: High-throughput single-cell RNA sequencing is a powerful technique but only generates short reads from one end of a cDNA template, limiting the reconstruction of highly diverse sequences such as antigen receptors. To overcome this limitation, we combined targeted capture and long-read sequencing of T-cell-receptor (TCR) and B-cell-receptor (BCR) mRNA transcripts with short-read transcriptome profiling of barcoded single-cell libraries generated by droplet-based partitioning. We show that Repertoire and Gene Expression by Sequencing (RAGE-Seq) can generate accurate full-length antigen receptor sequences at nucleotide resolution, infer B-cell clonal evolution and identify alternatively spliced BCR transcripts. We apply RAGE-Seq to 7138 cells sampled from the primary tumor and draining lymph node of a breast cancer patient to track transcriptome profiles of expanded lymphocyte clones across tissues. Our results demonstrate that RAGE-Seq is a powerful method for tracking the clonal evolution from large numbers of lymphocytes applicable to the study of immunity, autoimmunity and cancer.

Publication

Accelerated nanopore basecalling with SLOW5 data format

Publisher: Cold Spring Harbor Laboratory

Date: 07-02-2023

DOI: 10.1101/2023.02.06.527365

Abstract: Nanopore sequencing is emerging as a key pillar in the genomic technology landscape but computational constraints limiting its scalability remain to be overcome. The translation of raw current signal data into DNA or RNA sequence reads, known as ‘basecalling’, is a major friction in any nanopore sequencing workflow. Here, we exploit the advantages of the recently developed signal data format ‘SLOW5’ to streamline and accelerate nanopore basecalling on high-performance computer (HPC) and cloud environments. SLOW5 permits highly efficient sequential data access, eliminating a significant analysis bottleneck. To take advantage of this, we introduce Buttery-eel, an open-source wrapper for Oxford Nanopore’s Guppy basecaller that enables SLOW5 data access, resulting in performance improvements that are essential for scalable, affordable basecalling.

Publication

Analytical validity of nanopore sequencing for rapid SARS-CoV-2 genome analysis

Publisher: Springer Science and Business Media LLC

Date: 09-12-2020

DOI: 10.1038/S41467-020-20075-6

Abstract: Viral whole-genome sequencing (WGS) provides critical insight into the transmission and evolution of Severe Acute Respiratory Syndrome Coronavirus 2 (SARS-CoV-2). Long-read sequencing devices from Oxford Nanopore Technologies (ONT) promise significant improvements in turnaround time, portability and cost, compared to established short-read sequencing platforms for viral WGS (e.g., Illumina). However, adoption of ONT sequencing for SARS-CoV-2 surveillance has been limited due to common concerns around sequencing accuracy. To address this, here we perform viral WGS with ONT and Illumina platforms on 157 matched SARS-CoV-2-positive patient specimens and synthetic RNA controls, enabling rigorous evaluation of analytical performance. We report that, despite the elevated error rates observed in ONT sequencing reads, highly accurate consensus-level sequence determination was achieved, with single nucleotide variants (SNVs) detected at % sensitivity and % precision above a minimum ~60-fold coverage depth, thereby ensuring suitability for SARS-CoV-2 genome analysis. ONT sequencing also identified a surprising diversity of structural variation within SARS-CoV-2 specimens that were supported by evidence from short-read sequencing on matched samples. However, ONT sequencing failed to accurately detect short indels and variants at low read-count frequencies. This systematic evaluation of analytical performance for SARS-CoV-2 WGS will facilitate widespread adoption of ONT sequencing within local, national and international COVID-19 public health initiatives.

Publication

Evaluation of antioxidant compounds, antioxidant activities and capsaicinoid compounds of Chili (Capsicum sp.) germplasms available in Malaysia

Publisher: Elsevier BV

Date: 05-2018

DOI: 10.1016/J.JARMAP.2018.02.001

Publication

Flexible and efficient handling of nanopore sequencing signal data with slow5tools

Publisher: Cold Spring Harbor Laboratory

Date: 20-06-2022

DOI: 10.1101/2022.06.19.496732

Abstract: Nanopore sequencing is an emerging technology that is being rapidly adopted in research and clinical genomics. We recently developed SLOW5, a new file format for storage and analysis of raw data from nanopore sequencing experiments. SLOW5 is a community-centric, open source format that offers considerable performance benefits over the existing nanopore data format, known as FAST5. Here we introduce slow5tools, a simple, intuitive toolkit for handling nanopore raw signal data in SLOW5 format. Slow5tools enables lossless FAST5-to-SLOW5 and SLOW5-to-FAST5 data conversion, and a range of tools for structuring, indexing, viewing and querying SLOW5 files. Slow5tools uses multi-threading, multi-processing and other engineering strategies to achieve fast data conversion and manipulation, including live FAST5-to-SLOW5 conversion during sequencing. We outline a series of examples and benchmarking experiments to illustrate slow5tools usage, and describe the engineering principles underpinning its high performance. Slow5tools is an essential toolkit for handling nanopore signal data, which was developed to support adoption of SLOW5 by the nanopore community. Slow5tools is written in C/C++ with minimal dependencies and is freely available as an open-source program under an MIT licence: asindu2008/slow5tools .

Publication

HIVepsilon-seq—scalable characterization of intact persistent proviral HIV reservoirs in women

Publisher: American Society for Microbiology

Date: 16-10-2023

DOI: 10.1128/JVI.00705-23

Publication

SquiggleKit: A toolkit for manipulating nanopore signal data

Publisher: Cold Spring Harbor Laboratory

Date: 16-02-2019

DOI: 10.1101/549741

Abstract: The management of raw nanopore sequencing data poses a challenge that must be overcome to accelerate the development of new bioinformatics algorithms predicated on signal analysis. SquiggleKit is a toolkit for manipulating and interrogating nanopore data that simplifies file handling, data extraction, visualisation, and signal processing. Its modular tools can be used to reduce file numbers and memory footprint, identify poly-A tails, target barcodes, adapters, and find nucleotide sequence motifs in raw nanopore signal, amongst other applications. SquiggleKit serves as a bioinformatics portal into signal space, for novice and experienced users alike. It is comprehensively documented, simple to use, cross-platform compatible and freely available from (github.com/Psy-Fer/SquiggleKit).

Publication

Comprehensive genetic diagnosis of tandem repeat expansion disorders with programmable targeted nanopore sequencing

Publisher: Cold Spring Harbor Laboratory

Date: 10-2021

DOI: 10.1101/2021.09.27.21263187

Abstract: Short-tandem repeat (STR) expansions are an important class of pathogenic genetic variants. Over forty neurological and neuromuscular diseases are caused by STR expansions, with 37 different genes implicated to date. Here we describe the use of programmable targeted long-read sequencing with Oxford Nanopore’s ReadUntil function for parallel genotyping of all known neuropathogenic STRs in a single, simple assay. Our approach enables accurate, haplotype-resolved assembly and DNA methylation profiling of expanded and non-expanded STR sites. In doing so, the assay correctly diagnoses all individuals in a cohort of patients (n = 27) with various neurogenetic diseases, including Huntington’s disease, fragile X syndrome and cerebellar ataxia (CANVAS) and others. Targeted long-read sequencing solves large and complex STR expansions that confound established molecular tests and short-read sequencing, and identifies non-canonical STR motif conformations and internal sequence interruptions. Even in our relatively small cohort, we observe a wide diversity of STR alleles of known and unknown pathogenicity, suggesting that long-read sequencing will redefine the genetic landscape of STR expansion disorders. Finally, we show how the flexible inclusion of pharmacogenomics (PGx) genes as secondary ReadUntil targets can identify clinically actionable PGx genotypes to further inform patient care, at no extra cost. Our study addresses the need for improved techniques for genetic diagnosis of STR expansion disorders and illustrates the broad utility of programmable long-read sequencing for clinical genomics. This study describes the development and validation of a programmable targeted nanopore sequencing assay for parallel genetic diagnosis of all known pathogenic short-tandem repeats (STRs) in a single, simple test.

Publication

Comprehensive genetic diagnosis of tandem repeat expansion disorders with programmable targeted nanopore sequencing

Publisher: American Association for the Advancement of Science (AAAS)

Date: 04-03-2022

DOI: 10.1126/SCIADV.ABM5386

Abstract: More than 50 neurological and neuromuscular diseases are caused by short tandem repeat (STR) expansions, with 37 different genes implicated to date. We describe the use of programmable targeted long-read sequencing with Oxford Nanopore’s ReadUntil function for parallel genotyping of all known neuropathogenic STRs in a single assay. Our approach enables accurate, haplotype-resolved assembly and DNA methylation profiling of STR sites, from a list of predetermined candidates. This correctly diagnoses all individuals in a small cohort (n = 37) including patients with various neurogenetic diseases (n = 25). Targeted long-read sequencing solves large and complex STR expansions that confound established molecular tests and short-read sequencing and identifies noncanonical STR motif conformations and internal sequence interruptions. We observe a diversity of STR alleles of known and unknown pathogenicity, suggesting that long-read sequencing will redefine the genetic landscape of repeat disorders. Last, we show how the inclusion of pharmacogenomic genes as secondary ReadUntil targets can further inform patient care.

Publication

DNA methylation is required to maintain both DNA replication timing precision and 3D genome organization integrity

Publisher: Elsevier BV

Date: 09-2021

DOI: 10.1016/J.CELREP.2021.109722

Abstract: DNA replication timing and three-dimensional (3D) genome organization are associated with distinct epigenome patterns across large domains. However, whether alterations in the epigenome, in particular cancer-related DNA hypomethylation, affects higher-order levels of genome architecture is still unclear. Here, using Repli-Seq, single-cell Repli-Seq, and Hi-C, we show that genome-wide methylation loss is associated with both concordant loss of replication timing precision and deregulation of 3D genome organization. Notably, we find distinct disruption in 3D genome compartmentalization, striking gains in cell-to-cell replication timing heterogeneity and loss of allelic replication timing in cancer hypomethylation models, potentially through the gene deregulation of DNA replication and genome organization pathways. Finally, we identify ectopic H3K4me3-H3K9me3 domains from across large hypomethylated domains, where late replication is maintained, which we purport serves to protect against catastrophic genome reorganization and aberrant gene transcription. Our results highlight a potential role for the methylome in the maintenance of 3D genome regulation.

Publication

Hypoparathyroidism: Genetics and Diagnosis

Publisher: Wiley

Date: 14-11-2022

DOI: 10.1002/JBMR.4667

Abstract: This narrative report summarizes diagnostic criteria for hypoparathyroidism and describes the clinical presentation and underlying genetic causes of the nonsurgical forms. We conducted a comprehensive literature search from January 2000 to January 2021 and included landmark articles before 2000, presenting a comprehensive update of these topics and suggesting a research agenda to improve diagnosis and, eventually, the prognosis of the disease. Hypoparathyroidism, which is characterized by insufficient secretion of parathyroid hormone (PTH) leading to hypocalcemia, is diagnosed on biochemical grounds. Low albumin-adjusted calcium or ionized calcium with concurrent inappropriately low serum PTH concentration are the hallmarks of the disease. In this review, we discuss the characteristics and pitfalls in measuring calcium and PTH. We also undertook a systematic review addressing the utility of measuring calcium and PTH within 24 hours after total thyroidectomy to predict long-term hypoparathyroidism. A summary of the findings is presented here; results of the detailed systematic review are published separately in this issue of JBMR. Several genetic disorders can present with hypoparathyroidism, either as an isolated disease or as part of a syndrome. A positive family history and, in the case of complex diseases, characteristic comorbidities raise the clinical suspicion of a genetic disorder. In addition to these disorders' phenotypic characteristics, which include autoimmune diseases, we discuss approaches for the genetic diagnosis. © 2022 The Authors. Journal of Bone and Mineral Research published by Wiley Periodicals LLC on behalf of American Society for Bone and Mineral Research (ASBMR).

Publication

Squigulator: simulation of nanopore sequencing signal data with tunable noise parameters

Publisher: Cold Spring Harbor Laboratory

Date: 10-05-2023

DOI: 10.1101/2023.05.09.539953

Abstract: In silico simulation of next-generation sequencing data is a technique used widely in the genomics field. However, there is currently a lack of optimal tools for creating simulated data from ‘third-generation’ nanopore sequencing devices, which measure DNA or RNA molecules in the form of time-series current signal data. Here, we introduce Squigulator, a fast and simple tool for simulation of realistic nanopore signal data. Squigulator takes a reference genome, transcriptome or read sequences and generates corresponding raw nanopore signal data. This is compatible with basecalling software from Oxford Nanopore Technologies (ONT) and other third-party tools, thereby providing a useful substrate for testing, debugging, validation and optimisation of nanopore analysis methods. The user may generate noise-free ‘ideal’ data, realistic data with noise profiles emulating specific ONT protocols, or they may deterministically modify noise parameters and other variables to shape the data to their needs. To highlight its utility, we use Squigulator to model the degree to which different types of noise impact the accuracy of ONT basecalling and downstream variant detection, revealing new insights into the properties of ONT data. We provide Squigulator as an open-source tool for the nanopore community: asindu2008/squigulator

Publication

SLOW5: a new file format enables massive acceleration of nanopore sequencing data analysis

Publisher: Research Square Platform LLC

Date: 13-07-2021

DOI: 10.21203/RS.3.RS-668517/V1

Abstract: Nanopore sequencing is an emerging genomic technology with great potential. However, the storage and analysis of nanopore sequencing data have become major bottlenecks preventing more widespread adoption in research and clinical genomics. Here, we elucidate an inherent limitation in the file format used to store raw nanopore data, known as FAST5, that prevents efficient analysis on high-performance computing (HPC) systems. To overcome this we have developed SLOW5, an alternative file format that permits efficient parallelisation and, thereby, acceleration of nanopore data analysis. For example, we show that using SLOW5 format, instead of FAST5, reduces the time and cost of genome-wide DNA methylation profiling by an order of magnitude on common HPC systems, and delivers consistent improvements on a wide range of different architectures. With a simple, accessible file structure and a ~25% reduction in size compared to FAST5, SLOW5 format will deliver substantial benefits to all areas of the nanopore community.

James M. Ferguson

Researcher

Publications

Fast nanopore sequencing data analysis with SLOW5

SquiggleKit: a toolkit for manipulating nanopore signal data

InterARTIC: an interactive web application for whole-genome nanopore sequencing analysis of SARS-CoV-2 and other viruses

Barcoding and demultiplexing Oxford Nanopore native RNA sequencing reads with deep residual learning

Chromosome-length genome assembly and structural variations of the primal Basenji dog (Canis lupus familiaris) genome

Chromosome-length genome assembly and structural variations of the primal Basenji dog (Canis lupus familiaris) genome

High-throughput targeted long-read single cell sequencing reveals the clonal and transcriptional landscape of lymphocytes

Genopo: a nanopore sequencing analysis toolkit for portable Android devices

Accelerated nanopore basecalling with SLOW5 data format

InterARTIC: an interactive web application for whole-genome nanopore sequencing analysis of SARS-CoV-2 and other viruses

Analytical validity of nanopore sequencing for rapid SARS-CoV-2 genome analysis

High-throughput targeted long-read single cell sequencing reveals the clonal and transcriptional landscape of lymphocytes

Accelerated nanopore basecalling with SLOW5 data format

Analytical validity of nanopore sequencing for rapid SARS-CoV-2 genome analysis

Evaluation of antioxidant compounds, antioxidant activities and capsaicinoid compounds of Chili (Capsicum sp.) germplasms available in Malaysia

Flexible and efficient handling of nanopore sequencing signal data with slow5tools

HIVepsilon-seq—scalable characterization of intact persistent proviral HIV reservoirs in women

SquiggleKit: A toolkit for manipulating nanopore signal data

Comprehensive genetic diagnosis of tandem repeat expansion disorders with programmable targeted nanopore sequencing

Comprehensive genetic diagnosis of tandem repeat expansion disorders with programmable targeted nanopore sequencing

DNA methylation is required to maintain both DNA replication timing precision and 3D genome organization integrity

Hypoparathyroidism: Genetics and Diagnosis

Squigulator: simulation of nanopore sequencing signal data with tunable noise parameters

SLOW5: a new file format enables massive acceleration of nanopore sequencing data analysis

Related Organisations

Universiti Putra Malaysia

Universiti Sultan Zainal Abidin - Kampus Besut

Bangladesh Agricultural University

Patuakhali Science And Technology University

Universiti Malaysia Sabah

Baruna Bazar P. D. C. High School

Cantonment College

Garvan Institute Of Medical Research

Related Funding Activities
