ARDC Research Link Australia

ORCID Profile
0000-0002-3480-3819

Current Organisations
University of Oxford, University of Manchester

Publications

Publication

The Utility of Data Transformation for Alignment, De Novo Assembly and Classification of Short Read Virus Sequences

Publisher: MDPI AG

Date: 26-04-2019

DOI: 10.3390/V11050394

Abstract: Advances in DNA sequencing technology are facilitating genomic analyses of unprecedented scope and scale, widening the gap between our abilities to generate and fully exploit biological sequence data. Comparable analytical challenges are encountered in other data-intensive fields involving sequential data, such as signal processing, in which dimensionality reduction (i.e., compression) methods are routinely used to lessen the computational burden of analyses. In this work, we explored the application of dimensionality reduction methods to numerically represent high-throughput sequence data for three important biological applications of virus sequence data: reference-based mapping, short sequence classification and de novo assembly. Leveraging highly compressed sequence transformations to accelerate sequence comparison, our approach yielded comparable accuracy to existing approaches, further demonstrating its suitability for sequences originating from diverse virus populations. We assessed the application of our methodology using both synthetic and real viral pathogen sequences. Our results show that the use of highly compressed sequence approximations can provide accurate results, with analytical performance retained and even enhanced through appropriate dimensionality reduction of sequence data.

Publication

Detecting macroecological patterns in bacterial communities across independent studies of global soils

Publisher: Springer Science and Business Media LLC

Date: 20-11-2017

DOI: 10.1038/S41564-017-0062-X

Abstract: The emergence of high-throughput DNA sequencing methods provides unprecedented opportunities to further unravel bacterial biodiversity and its worldwide role from human health to ecosystem functioning. However, despite the abundance of sequencing studies, combining data from multiple individual studies to address macroecological questions of bacterial diversity remains methodically challenging and plagued with biases. Here, using a machine-learning approach that accounts for differences among studies and complex interactions among taxa, we merge 30 independent bacterial data sets comprising 1,998 soil samples from 21 countries. Whereas previous meta-analysis efforts have focused on bacterial diversity measures or abundances of major taxa, we show that disparate amplicon sequence data can be combined at the taxonomy-based level to assess bacterial community structure. We find that rarer taxa are more important for structuring soil communities than abundant taxa, and that these rarer taxa are better predictors of community structure than environmental factors, which are often confounded across studies. We conclude that combining data from independent studies can be used to explore bacterial community dynamics, identify potential 'indicator' taxa with an important role in structuring communities, and propose hypotheses on the factors that shape bacterial biogeography that have been overlooked in the past.

Publication

The Utility of Data Transformation for Alignment, de novo Assembly and Classification of Short Read Virus Sequences

Publisher: MDPI AG

Date: 04-2019

DOI: 10.20944/PREPRINTS201904.0014.V1

Abstract: Advances in DNA sequencing technology are facilitating genomic analyses of unprecedented scope and scale, widening the gap between our abilities to generate and fully exploit biological sequence data. Comparable analytical challenges are encountered in other data-intensive fields involving sequential data, such as signal processing, in which dimensionality reduction (i.e., compression) methods are routinely used to lessen the computational burden of analyses. In this work we explore the application of dimensionality reduction methods to numerically represent high-throughput sequence data for three important biological applications of virus sequence data: reference-based mapping, short sequence classification and de novo assembly. Despite using highly compressed sequence transformations to accelerate the processes, our sequence processing approach yielded comparable accuracy to existing approaches, and is ideally suited for sequences originating from highly diverse virus populations. We demonstrate the application of our methodology to both synthetic and real viral pathogen sequence data. Our results show that the use of highly compressed sequence approximations can provide accurate results and that useful analytical performance can be retained and even enhanced through appropriate dimensionality reduction of sequence data.

Publication

The khmer software package: enabling efficient nucleotide sequence analysis

Publisher: F1000 Research Ltd

Date: 25-09-2015

DOI: 10.12688/F1000RESEARCH.6924.1

Abstract: The khmer package is a freely available software library for working efficiently with fixed length DNA words, or k-mers. khmer provides implementations of a probabilistic k-mer counting data structure, a compressible De Bruijn graph representation, De Bruijn graph partitioning, and digital normalization. khmer is implemented in C++ and Python, and is freely available under the BSD license at ib-lab/khmer/ .

Related Organisations

Organisation

University of Oxford

Location: United Kingdom of Great Britain and Northern Ireland

Organisation

University of Manchester

Location: United Kingdom of Great Britain and Northern Ireland

Organisation

University of Leeds

Location: United Kingdom of Great Britain and Northern Ireland

Related Funding Activities

No related grants have been discovered for Bede Constantinides.

Bede Constantinides

Researcher
