ARDC Research Link Australia

ORCID Profile
0000-0002-2482-5336

Current Organisation
James Cook University

Publications

Publication

Benchmarks in antimicrobial peptide prediction are biased due to the selection of negative data

Publisher: Oxford University Press (OUP)

Date: 21-08-2022

DOI: 10.1093/BIB/BBAC343

Abstract: Antimicrobial peptides (AMPs) are a heterogeneous group of short polypeptides that target not only microorganisms but also viruses and cancer cells. Due to their lower selection for resistance compared with traditional antibiotics, AMPs have been attracting the ever-growing attention from researchers, including bioinformaticians. Machine learning represents the most cost-effective method for novel AMP discovery and consequently many computational tools for AMP prediction have been recently developed. In this article, we investigate the impact of negative data sampling on model performance and benchmarking. We generated 660 predictive models using 12 machine learning architectures, a single positive data set and 11 negative data sampling methods; the architectures and methods were defined on the basis of published AMP prediction software. Our results clearly indicate that similar training and benchmark data set, i.e. produced by the same or a similar negative data sampling method, positively affect model performance. Consequently, all the benchmark analyses that have been performed for AMP prediction models are significantly biased and, moreover, we do not know which model is the most accurate. To provide researchers with reliable information about the performance of AMP predictors, we also created a web server AMPBenchmark for fair model benchmarking. AMPBenchmark is available at BioGenies.info/AMPBenchmark.

Publication

Shotgun Proteomics Analysis of Saliva and Salivary Gland Tissue from the Common Octopus Octopus vulgaris

Publisher: American Chemical Society (ACS)

Date: 16-09-2018

DOI: 10.1021/ACS.JPROTEOME.8B00525

Abstract: The salivary apparatus of the common octopus (Octopus vulgaris) has been the subject of biochemical study for over a century. A combination of bioassays, behavioral studies and molecular analysis on O. vulgaris and related species suggests that its proteome should contain a mixture of highly potent neurotoxins and degradative proteins. However, a lack of genomic and transcriptomic data has meant that the amino acid sequences of these proteins remain almost entirely unknown. To address this, we assembled the posterior salivary gland transcriptome of O. vulgaris and combined it with high resolution mass spectrometry data from the posterior and anterior salivary glands of two adults, the posterior salivary glands of six paralarvae and the saliva from a single adult. We identified a total of 2810 protein groups from across this range of salivary tissues and age classes, including 84 with homology to known venom protein families. Additionally, we found 21 short secreted cysteine rich protein groups of which 12 were specific to cephalopods. By combining protein expression data with phylogenetic analysis we demonstrate that serine proteases expanded dramatically within the cephalopod lineage and that cephalopod specific proteins are strongly associated with the salivary apparatus.

Publication

ampir: an R package for fast genome-wide prediction of antimicrobial peptides

Publisher: Oxford University Press (OUP)

Date: 19-07-2020

DOI: 10.1093/BIOINFORMATICS/BTAA653

Abstract: Antimicrobial peptides (AMPs) are the key components of the innate immune system that protect against pathogens, regulate the microbiome and are promising targets for pharmaceutical research. Computational tools based on machine learning have the potential to aid discovery of genes encoding novel AMPs but existing approaches are not designed for genome-wide scans. To facilitate such genome-wide discovery of AMPs we developed a fast and accurate AMP classification framework, ampir. ampir is designed for high throughput, integrates well with existing bioinformatics pipelines, and has much higher classification accuracy than existing methods when applied to whole genome data. ampir is implemented primarily in R with core feature calculation methods written in C++. Release versions are available via CRAN and work on all major operating systems. The development version is maintained at egana/ampir. Supplementary data are available at Bioinformatics online.

Publication

ampir: an R package for fast genome-wide prediction of antimicrobial peptides

Publisher: Cold Spring Harbor Laboratory

Date: 08-05-2020

DOI: 10.1101/2020.05.07.082412

Abstract: Antimicrobial peptides (AMPs) are key components of the innate immune system that protect against pathogens, regulate the microbiome, and are promising targets for pharmaceutical research. Computational tools based on machine learning have the potential to aid discovery of genes encoding novel AMPs but existing approaches are not designed for genome-wide scans. To facilitate such genome-wide discovery of AMPs we developed a fast and accurate AMP classification framework, ampir. ampir is designed for high throughput, integrates well with existing bioinformatics pipelines, and has much higher classification accuracy than existing methods when applied to whole genome data. ampir is implemented primarily in R with core feature calculation methods written in C++. Release versions are available via CRAN and work on all major operating systems. The development version is maintained at egana/ampir. Contact: legana.fingerhut@my.jcu.edu.au, ira.cooke@jcu.edu.au. Supplementary data are available at egana/amp_pub.

Publication

Benchmarks in antimicrobial peptide prediction are biased due to the selection of negative data

Publisher: Cold Spring Harbor Laboratory

Date: 30-05-2022

DOI: 10.1101/2022.05.30.493946

Abstract: Antimicrobial peptides (AMPs) are a heterogeneous group of short polypeptides that target microorganisms but also viruses and cancer cells. Due to their lower selection for resistance compared to traditional antibiotics, AMPs have been attracting the ever-growing attention from researchers, including bioinformaticians. Machine learning represents the most cost-effective method for novel AMP discovery and consequently many computational tools for AMP prediction have been recently developed. In this article, we investigate the impact of negative data sampling on model performance and benchmarking. We generated 660 predictive models using 12 machine learning architectures, a single positive data set and 11 negative data sampling methods; the architectures and methods were defined on the basis of published AMP prediction software. Our results clearly indicate that similar training and benchmark data set, i.e. produced by the same or a similar negative data sampling method, positively affect model performance. Consequently, all the benchmark analyses that have been performed for AMP prediction models are significantly biased and, moreover, we do not know which model is the most accurate. To provide researchers with reliable information about the performance of AMP predictors, we also created a web server AMPBenchmark for fair model benchmarking. AMPBenchmark is available at BioGenies.info/AMPBenchmark.

Related Organisations

Organisation

James Cook University

Location: Australia

Related Funding Activities

No related grants have been discovered for Legana Fingerhut.

Legana Fingerhut

Researcher
