ORCID Profile
0000-0003-1315-4896
Current Organisations
Macquarie University, Fundação Oswaldo Cruz, CSIRO
Publisher: American Chemical Society (ACS)
Date: 25-05-2022
DOI: 10.1021/ACS.JPROTEOME.1C00968
Abstract: Alternative splicing can lead to distinct protein isoforms. These can have different functions in specific cells and tissues or in different developmental stages. In this study, we explored whether transcripts assembled from long read, nanopore-based, direct RNA-sequencing (RNA-seq) could improve the identification of protein isoforms in human K562 cells. By comparing with Illumina-based short read RNA-seq, we showed that a large proportion of Ensembl transcripts (5949/14,326) and genes expressing alternatively spliced transcripts (486/2981) identified with long direct reads were missed by short paired-end reads. By co-analyzing proteomic and transcriptomic data, we also showed that some peptides (826/35,976), proteins (262/3215), and protein isoforms arising from distinct transcript variants (574/1212) identified with isoform-specific peptides via custom long-read-based databases were missed in Illumina-derived databases. Finally, we generated unequivocal peptide evidence for a set of protein isoforms and showed that long read, direct RNA-seq allows the discovery of novel protein isoforms not already in reference databases or custom databases built from short read RNA-seq data. Our analysis highlights the benefits of long read RNA-seq data in the generation of reference databases to increase tandem mass spectrometry (MS/MS) identification of protein isoforms.
Publisher: American Chemical Society (ACS)
Date: 10-11-2015
DOI: 10.1021/ACS.JPROTEOME.5B00734
Abstract: In recent years, proteomic data have contributed to genome annotation efforts, most notably in humans and mice, and spawned a field termed "proteogenomics". Yeast, in contrast with higher eukaryotes, has a small genome, which has lent itself to simpler ORF prediction. Despite this, continual advances in mass spectrometry suggest that proteomics should be able to improve genome annotation even in this well-characterized species. Here we applied a proteogenomics workflow to yeast to identify novel protein-coding genes. Specific databases were generated from intergenic regions of the genome, which were then queried with MS/MS data. This suggested the existence of several putative novel ORFs of <100 codons, one of which we chose to validate. Synthetic peptides, RNA-Seq analysis, and evidence of evolutionary conservation allowed for the unequivocal definition of a new protein of 78 amino acids encoded on chromosome X, which we dub YJR107C-A. It encodes a new type of domain, which ab initio modeling suggests is predominantly α-helical. We show that this gene is nonessential for growth; however, deletion increases sensitivity to osmotic stress. Finally, from the above discovery process, we discuss a generalizable strategy for the identification of short ORFs and small proteins, many of which are likely to be undiscovered.
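The idea of scanning intergenic sequence for short ORFs, as described in the abstract above, can be illustrated with a minimal Python sketch. This is not the authors' workflow: the function name `find_short_orfs`, the `min_codons` and `max_codons` parameters, and the toy sequence in the usage line are assumptions made purely for illustration. Only the standard genetic code is used.

```python
from typing import Set

# Standard genetic code, built from the conventional TCAG ordering.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: AMINO[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}


def reverse_complement(dna: str) -> str:
    """Return the reverse complement of an uppercase DNA string."""
    return dna.translate(str.maketrans("ACGT", "TGCA"))[::-1]


def find_short_orfs(dna: str, min_codons: int = 2, max_codons: int = 99) -> Set[str]:
    """Translate every ATG...stop ORF on both strands and keep the short ones.

    Hypothetical thresholds: an ORF is kept if it has between `min_codons`
    and `max_codons` sense codons (stop codon not counted). Nested starts in
    the same frame are reported separately.
    """
    orfs: Set[str] = set()
    for strand in (dna, reverse_complement(dna)):
        for start in range(len(strand) - 2):
            if strand[start:start + 3] != "ATG":
                continue
            protein = []
            for pos in range(start, len(strand) - 2, 3):
                aa = CODON_TABLE.get(strand[pos:pos + 3], "X")  # X = unknown base
                if aa == "*":
                    if min_codons <= len(protein) <= max_codons:
                        orfs.add("".join(protein))
                    break
                protein.append(aa)
    return orfs


# Toy sequence (not real yeast data): one complete ORF, ATG GCT AAA TGA.
print(find_short_orfs("CCATGGCTAAATGACC"))
```

In a proteogenomics setting, the translated sequences would then be written to a FASTA database and searched with MS/MS data; that step is outside this sketch.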
Publisher: Wiley
Date: 08-08-2019
Abstract: High-resolution MS/MS spectra of peptides can be deisotoped to identify monoisotopic masses of peptide fragments. The use of such masses should improve protein identification rates. However, deisotoping is not universally used and its benefits have not been fully explored. Here, MS2-Deisotoper, a tool for use prior to database search, is used to identify monoisotopic peaks in centroided MS/MS spectra. MS2-Deisotoper works by comparing the mass and relative intensity of each peptide fragment peak to every other peak of greater mass, and by applying a set of rules concerning mass and intensity differences. After comprehensive parameter optimization, it is shown that MS2-Deisotoper can improve the number of peptide spectrum matches (PSMs) identified by up to 8.2% and proteins by up to 2.8%. It is effective with SILAC and non-SILAC MS/MS data. The identification of unique peptide sequences is also improved, increasing the number of human proteoforms by 3.7%. Detailed investigation of results shows that deisotoping increases Mascot ion scores, improves FDR estimation for PSMs, and leads to greater protein sequence coverage. At a peptide level, it is found that the efficacy of deisotoping is affected by peptide mass and charge. MS2-Deisotoper can be used via a user interface or as a command-line tool.
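The pairwise comparison described in the abstract above (each peak against every heavier peak, with rules on mass and intensity differences) can be sketched as follows. This is a simplified illustration and not the MS2-Deisotoper rule set: the function name `deisotope`, the tolerance `tol`, the intensity rule `max_ratio`, and the example peaks are all assumptions.

```python
from typing import List, Tuple

Peak = Tuple[float, float]  # (m/z, intensity)

C13_SPACING = 1.00335  # approximate 13C - 12C mass difference in Da


def deisotope(peaks: List[Peak], max_charge: int = 2,
              tol: float = 0.01, max_ratio: float = 1.5) -> List[Peak]:
    """Remove putative isotope peaks from a centroided spectrum.

    Hypothetical rules: a heavier peak is flagged as an isotope of a lighter
    one when their m/z gap matches C13_SPACING / z for some charge z (within
    `tol`) and the heavier peak is at most `max_ratio` times as intense.
    """
    ordered = sorted(peaks)
    is_isotope = [False] * len(ordered)
    for i, (mz_i, inten_i) in enumerate(ordered):
        for j in range(i + 1, len(ordered)):
            mz_j, inten_j = ordered[j]
            gap = mz_j - mz_i
            if gap > C13_SPACING + tol:
                break  # peaks are sorted, so no later peak can match
            spacing_ok = any(
                abs(gap - C13_SPACING / z) <= tol
                for z in range(1, max_charge + 1)
            )
            if spacing_ok and inten_j <= max_ratio * inten_i:
                is_isotope[j] = True
    return [p for p, flagged in zip(ordered, is_isotope) if not flagged]


# Made-up spectrum: a singly charged cluster at 500 and a doubly charged one at 650.3.
spectrum = [(500.0, 100.0), (501.00335, 55.0), (502.0067, 15.0),
            (650.3, 80.0), (650.80168, 30.0)]
print(deisotope(spectrum))
```

Flagged peaks are allowed to flag further peaks, so a chain M, M+1, M+2 collapses to its monoisotopic peak without needing a separate rule for two-step gaps.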
Publisher: Mary Ann Liebert Inc
Date: 04-2021
Publisher: Public Library of Science (PLoS)
Date: 27-07-2020
Publisher: American Chemical Society (ACS)
Date: 03-11-2018
DOI: 10.1021/ACS.JPROTEOME.7B00601
Abstract: The study of post-translational methylation is hampered by the fact that large-scale LC-MS/MS experiments produce high methylpeptide false discovery rates (FDRs). The use of heavy-methyl stable isotope labeling by amino acids in cell culture (heavy-methyl SILAC) can drastically reduce these FDRs; however, this approach is limited by a lack of heavy-methyl SILAC compatible software. To fill this gap, we recently developed MethylQuant. Here, using an updated version of MethylQuant, we demonstrate its methylpeptide validation and quantification capabilities and provide guidelines for its best use. Using reference heavy-methyl SILAC data sets, we show that MethylQuant predicts with statistical significance the true or false positive status of methylpeptides in samples of varying complexity, degree of methylpeptide enrichment, and heavy to light mixing ratios. We introduce methylpeptide confidence indicators, MethylQuant Confidence and MethylQuant Score, and demonstrate their strong performance in complex samples characterized by a lack of methylpeptide enrichment. For these challenging data sets, MethylQuant identifies 882 of 1165 true positive methylpeptide spectrum matches (i.e., >75% sensitivity) at high specificity (<2% FDR) and achieves near-perfect specificity at 41% sensitivity. We also demonstrate that MethylQuant produces high accuracy relative quantification data that are tolerant of interference from coeluting peptide ions. Together, MethylQuant's capabilities provide a path toward routine, accurate characterizations of the methylproteome using heavy-methyl SILAC.
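The sensitivity figure quoted in the abstract above can be checked with a few lines of Python. The definitions used here (sensitivity as recovered true positives over all true positives, FDR as false positives over all accepted matches) are the conventional ones and are assumed rather than taken from the paper. The function names are illustrative.

```python
def sensitivity(true_positives: int, all_true: int) -> float:
    """Fraction of all true matches that were recovered."""
    return true_positives / all_true


def false_discovery_rate(false_positives: int, true_positives: int) -> float:
    """Fraction of accepted matches that are false (0.0 if nothing accepted)."""
    accepted = false_positives + true_positives
    return false_positives / accepted if accepted else 0.0


# 882 of 1165 true positive methylpeptide spectrum matches, as stated in the abstract.
print(f"{sensitivity(882, 1165):.1%}")  # prints 75.7%
```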
Publisher: American Chemical Society (ACS)
Date: 20-05-2015
DOI: 10.1021/PR5011394
Abstract: Human proteome analysis now requires an understanding of protein isoforms. We recently published the PG Nexus pipeline, which facilitates high confidence validation of exons and splice junctions by integrating genomics and proteomics data. Here we comprehensively explore how RNA-seq transcriptomics data, and proteomic analysis of the same sample, can identify protein isoforms. RNA-seq data from human mesenchymal (hMSC) stem cells were analyzed with our new TranscriptCoder tool to generate a database of protein isoform sequences. MS/MS data from matching hMSC samples were then matched against the TranscriptCoder-derived database, along with Ensembl and the neXtProt database. Querying the TranscriptCoder-derived or Ensembl database could unambiguously identify ∼450 protein isoforms, with isoform-specific proteotypic peptides, including candidate hMSC-specific isoforms for the genes DPYSL2 and FXR1. Where isoform-specific peptides did not exist, groups of nonisoform-specific proteotypic peptides could specifically identify many isoforms. In both the above cases, isoforms will be detectable with targeted MS/MS assays. Unfortunately, our analysis also revealed that some isoforms will be difficult to identify unambiguously as they do not have peptides that are sufficiently distinguishing. We covisualize mRNA isoforms and peptides in a genome browser to illustrate the above situations. Mass spectrometry data is available via ProteomeXchange (PXD001449).
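The notion of an isoform-specific peptide in the abstract above can be made concrete with a small sketch. It digests each isoform in silico with a simple trypsin rule (cleave after K or R unless the next residue is P) and reports peptides found in exactly one isoform. This is not TranscriptCoder or the PG Nexus: the function names, the `min_len` cutoff, and the two toy sequences are hypothetical.

```python
from collections import defaultdict
from typing import Dict, Set


def tryptic_digest(seq: str, min_len: int = 6) -> Set[str]:
    """Fully tryptic peptides of at least `min_len` residues, no missed cleavages."""
    peptides: Set[str] = set()
    start = 0
    for i, aa in enumerate(seq):
        at_end = i == len(seq) - 1
        if at_end or (aa in "KR" and seq[i + 1] != "P"):
            peptide = seq[start:i + 1]
            if len(peptide) >= min_len:
                peptides.add(peptide)
            start = i + 1
    return peptides


def isoform_specific_peptides(isoforms: Dict[str, str]) -> Dict[str, Set[str]]:
    """Map each isoform name to the peptides that occur in no other isoform."""
    owners = defaultdict(set)
    for name, seq in isoforms.items():
        for peptide in tryptic_digest(seq):
            owners[peptide].add(name)
    specific: Dict[str, Set[str]] = {name: set() for name in isoforms}
    for peptide, names in owners.items():
        if len(names) == 1:
            specific[next(iter(names))].add(peptide)
    return specific


# Toy isoforms (invented sequences): iso2 lacks the middle segment of iso1.
toy = {
    "iso1": "MAGTLLEKSDFGHIPLRAAVNNQWER",
    "iso2": "MAGTLLEKAAVNNQWER",
}
print(isoform_specific_peptides(toy))
```

In this toy case only iso1 has a peptide of its own, while iso2 has none. That mirrors the abstract's point that some isoforms lack sufficiently distinguishing peptides.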
Publisher: Wiley
Date: 17-01-2019
DOI: 10.1002/CPBI.71
Abstract: Post‐translational modifications (PTMs) of proteins act as key regulators of protein activity, including the regulation of protein‐protein interactions (PPIs). However, exploring functional links between PTMs and PPIs can be difficult. PTMOracle is a Cytoscape app that facilitates the co‐visualization and co‐analysis of PTMs in the context of PPI networks. PTMOracle also allows extensive data to be integrated and co‐analyzed, allowing the role of domains, motifs, and disordered regions to be considered. Here, we describe several PTMOracle protocols investigating complex PTM‐associated relationships and their role in PPIs. This is assisted by OraclePainter for coloring proteins by the modifications present and visualizing these in the context of networks, by OracleTools for cross‐matching PTMs with sequence features for all nodes in the network, and by OracleResults for exploring specific proteins and visualizing their PTMs in the context of protein sequences. This unit aims to demonstrate how PTMOracle can be used to systematically explore network visualizations and generate testable hypotheses regarding the functional role of PTMs in PPIs, and how the results can be analyzed to better understand the regulatory role of PTMs in PPIs. © 2019 by John Wiley & Sons, Inc.
Publisher: Elsevier BV
Date: 03-2016
Publisher: Hindawi Limited
Date: 25-05-2020
DOI: 10.1111/TBED.13588
Publisher: Springer Science and Business Media LLC
Date: 28-06-2021
Publisher: American Chemical Society (ACS)
Date: 12-11-2014
DOI: 10.1021/PR400820P
Abstract: Direct links between proteomic and genomic/transcriptomic data are not frequently made, partly because of lack of appropriate bioinformatics tools. To help address this, we have developed the PG Nexus pipeline. The PG Nexus allows users to covisualize peptides in the context of genomes or genomic contigs, along with RNA-seq reads. This is done in the Integrated Genome Viewer (IGV). A Results Analyzer reports the precise base position where LC-MS/MS-derived peptides cover genes or gene isoforms, on the chromosomes or contigs where this occurs. In prokaryotes, the PG Nexus pipeline facilitates the validation of genes, where annotation or gene prediction is available, or the discovery of genes using a "virtual protein"-based unbiased approach. We illustrate this with a comprehensive proteogenomics analysis of two strains of Campylobacter concisus. For higher eukaryotes, the PG Nexus facilitates gene validation and supports the identification of mRNA splice junction boundaries and splice variants that are protein-coding. This is illustrated with an analysis of splice junctions covered by human phosphopeptides, and other examples of relevance to the Chromosome-Centric Human Proteome Project. The PG Nexus is open-source and available from github.com/IntersectAustralia/ap11_Samifier. It has been integrated into Galaxy and made available in the Galaxy tool shed.
Publisher: American Chemical Society (ACS)
Date: 06-04-2017
DOI: 10.1021/ACS.JPROTEOME.6B01052
Abstract: Post-translational modifications of proteins (PTMs) act as key regulators of protein activity and of protein-protein interactions (PPIs). To date, it has been difficult to comprehensively explore functional links between PTMs and PPIs. To address this, we developed PTMOracle, a Cytoscape app for coanalyzing PTMs within PPI networks. PTMOracle also allows extensive data to be integrated and coanalyzed with PPI networks, allowing the role of domains, motifs, and disordered regions to be considered. For proteins of interest, or a whole proteome, PTMOracle can generate network visualizations to reveal complex PTM-associated relationships. This is assisted by OraclePainter for coloring proteins by modifications, OracleTools for network analytics, and OracleResults for exploring tabulated findings. To illustrate the use of PTMOracle, we investigate PTM-associated relationships and their role in PPIs in four case studies. In the yeast interactome and its rich set of PTMs, we construct and explore histone-associated and domain-domain interaction networks and show how integrative approaches can predict kinases involved in phosphodegrons. In the human interactome, a phosphotyrosine-associated network is analyzed but highlights the sparse nature of human PPI networks and lack of PTM-associated data. PTMOracle is open source and available at the Cytoscape app store: pps tmoracle .
Publisher: Springer Science and Business Media LLC
Date: 17-06-2021
DOI: 10.1038/S41587-021-00936-1
Abstract: Existing compendia of non-coding RNA (ncRNA) are incomplete, in part because they are derived almost exclusively from small and polyadenylated RNAs. Here we present a more comprehensive atlas of the human transcriptome, which includes small and polyA RNA as well as total RNA from 300 human tissues and cell lines. We report thousands of previously uncharacterized RNAs, increasing the number of documented ncRNAs by approximately 8%. To infer functional regulation by known and newly characterized ncRNAs, we exploited pre-mRNA abundance estimates from total RNA sequencing, revealing 316 microRNAs and 3,310 long non-coding RNAs with multiple lines of evidence for roles in regulating protein-coding genes and pathways. Our study both refines and expands the current catalog of human ncRNAs and their regulatory interactions. All data, analyses and results are available for download and interrogation in the R2 web portal, serving as a basis for future exploration of RNA biology and function.
Publisher: American Society for Microbiology
Date: 28-02-2013
Abstract: Campylobacter showae UNSWCD was isolated from a patient with Crohn's disease. Here we present a 2.1 Mb draft assembly of its genome.
Location: Australia
No related grants have been discovered for Aidan P. Tay.