ORCID Profile
0000-0002-2482-5336
Current Organisation
James Cook University
Publisher: Oxford University Press (OUP)
Date: 21-08-2022
DOI: 10.1093/BIB/BBAC343
Abstract: Antimicrobial peptides (AMPs) are a heterogeneous group of short polypeptides that target not only microorganisms but also viruses and cancer cells. Due to their lower selection for resistance compared with traditional antibiotics, AMPs have been attracting ever-growing attention from researchers, including bioinformaticians. Machine learning represents the most cost-effective method for novel AMP discovery and consequently many computational tools for AMP prediction have been recently developed. In this article, we investigate the impact of negative data sampling on model performance and benchmarking. We generated 660 predictive models using 12 machine learning architectures, a single positive data set and 11 negative data sampling methods; the architectures and methods were defined on the basis of published AMP prediction software. Our results clearly indicate that similar training and benchmark data sets, i.e. those produced by the same or a similar negative data sampling method, positively affect model performance. Consequently, all the benchmark analyses that have been performed for AMP prediction models are significantly biased and, moreover, we do not know which model is the most accurate. To provide researchers with reliable information about the performance of AMP predictors, we also created a web server, AMPBenchmark, for fair model benchmarking. AMPBenchmark is available at BioGenies.info/AMPBenchmark.
Publisher: American Chemical Society (ACS)
Date: 16-09-2018
DOI: 10.1021/ACS.JPROTEOME.8B00525
Abstract: The salivary apparatus of the common octopus (Octopus vulgaris) has been the subject of biochemical study for over a century. A combination of bioassays, behavioral studies and molecular analysis on O. vulgaris and related species suggests that its proteome should contain a mixture of highly potent neurotoxins and degradative proteins. However, a lack of genomic and transcriptomic data has meant that the amino acid sequences of these proteins remain almost entirely unknown. To address this, we assembled the posterior salivary gland transcriptome of O. vulgaris and combined it with high-resolution mass spectrometry data from the posterior and anterior salivary glands of two adults, the posterior salivary glands of six paralarvae and the saliva from a single adult. We identified a total of 2810 protein groups from across this range of salivary tissues and age classes, including 84 with homology to known venom protein families. Additionally, we found 21 short secreted cysteine-rich protein groups, of which 12 were specific to cephalopods. By combining protein expression data with phylogenetic analysis, we demonstrate that serine proteases expanded dramatically within the cephalopod lineage and that cephalopod-specific proteins are strongly associated with the salivary apparatus.
Publisher: Oxford University Press (OUP)
Date: 19-07-2020
DOI: 10.1093/BIOINFORMATICS/BTAA653
Abstract: Antimicrobial peptides (AMPs) are key components of the innate immune system that protect against pathogens, regulate the microbiome and are promising targets for pharmaceutical research. Computational tools based on machine learning have the potential to aid discovery of genes encoding novel AMPs but existing approaches are not designed for genome-wide scans. To facilitate such genome-wide discovery of AMPs we developed a fast and accurate AMP classification framework, ampir. ampir is designed for high throughput, integrates well with existing bioinformatics pipelines, and has much higher classification accuracy than existing methods when applied to whole genome data. ampir is implemented primarily in R with core feature calculation methods written in C++. Release versions are available via CRAN and work on all major operating systems. The development version is maintained at egana/ ir. Supplementary data are available at Bioinformatics online.
Publisher: Cold Spring Harbor Laboratory
Date: 08-05-2020
DOI: 10.1101/2020.05.07.082412
Abstract: Antimicrobial peptides (AMPs) are key components of the innate immune system that protect against pathogens, regulate the microbiome, and are promising targets for pharmaceutical research. Computational tools based on machine learning have the potential to aid discovery of genes encoding novel AMPs but existing approaches are not designed for genome-wide scans. To facilitate such genome-wide discovery of AMPs we developed a fast and accurate AMP classification framework, ampir. ampir is designed for high throughput, integrates well with existing bioinformatics pipelines, and has much higher classification accuracy than existing methods when applied to whole genome data. ampir is implemented primarily in R with core feature calculation methods written in C++. Release versions are available via CRAN and work on all major operating systems. The development version is maintained at egana/ ir. Contact: legana.fingerhut@my.jcu.edu.au, ira.cooke@jcu.edu.au. Supplementary data are available at egana/ _pub.
Publisher: Cold Spring Harbor Laboratory
Date: 30-05-2022
DOI: 10.1101/2022.05.30.493946
Abstract: Antimicrobial peptides (AMPs) are a heterogeneous group of short polypeptides that target not only microorganisms but also viruses and cancer cells. Due to their lower selection for resistance compared to traditional antibiotics, AMPs have been attracting ever-growing attention from researchers, including bioinformaticians. Machine learning represents the most cost-effective method for novel AMP discovery and consequently many computational tools for AMP prediction have been recently developed. In this article, we investigate the impact of negative data sampling on model performance and benchmarking. We generated 660 predictive models using 12 machine learning architectures, a single positive data set and 11 negative data sampling methods; the architectures and methods were defined on the basis of published AMP prediction software. Our results clearly indicate that similar training and benchmark data sets, i.e. those produced by the same or a similar negative data sampling method, positively affect model performance. Consequently, all the benchmark analyses that have been performed for AMP prediction models are significantly biased and, moreover, we do not know which model is the most accurate. To provide researchers with reliable information about the performance of AMP predictors, we also created a web server, AMPBenchmark, for fair model benchmarking. AMPBenchmark is available at BioGenies.info/AMPBenchmark.
No related grants have been discovered for Legana Fingerhut.