ARDC Research Link Australia

ORCID Profile
0000-0002-9207-0385

Current Organisations
Beijing Institute of Technology, Monash University, Alfred Health

Publications

Publication

A simple, scalable approach to building a cross-platform transcriptome atlas

Publisher: Cold Spring Harbor Laboratory

Date: 11-03-2020

DOI: 10.1101/2020.03.09.984468

Abstract: Gene expression atlases have transformed our understanding of the development, composition and function of human tissues. New technologies promise improved cellular or molecular resolution, and have led to the identification of new cell types, or better defined cell states. But as new technologies emerge, information derived on old platforms becomes obsolete. We demonstrate that it is possible to combine a large number of different profiling experiments summarised from dozens of laboratories and representing hundreds of donors, to create an integrated molecular map of human tissue. As an example, we combine 850 samples from 38 platforms to build an integrated atlas of human blood cells. We achieve robust and unbiased cell type clustering using a variance partitioning method, selecting genes with low platform bias relative to biological variation. Other than an initial rescaling, no other transformation to the primary data is applied through batch correction or renormalisation. Additional data, including single-cell datasets, can be projected for comparison, classification and annotation. The resulting atlas provides a multi-scaled approach to visualise and analyse the relationships between sets of genes and blood cell lineages, including the maturation and activation of leukocytes in vivo and in vitro. In allowing for data integration across hundreds of studies, we address a key reproducibility challenge which is faced by any new technology. This allows us to draw on the deep phenotypes and functional annotations that accompany traditional profiling methods, and provide important context to the high cellular resolution of single cell profiling. Here, we have implemented the blood atlas in the open access Stemformatics.org platform, drawing on its extensive collection of curated transcriptome data. The method is simple, scalable and amenable for rapid deployment in other biological systems or computational workflows.
Recursive approach to generating a multi-scaled atlas. Top panel: The method integrates data from all cell types in the Stemformatics database, and shows clear division of samples into global categories of stromal, pluripotent or blood (inset) cell types. Bottom panel: Integration of only the blood cell subsets provides a blood atlas. Projection of external samples (green) onto the blood atlas. Samples are coloured by curated annotations derived from the original studies, and can be viewed at Stemformatics.org
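
The gene-selection idea in the abstract above (keeping genes whose platform-driven variance is small relative to biological variation) can be illustrated with a minimal sketch. This is not the authors' implementation: it assumes a simple one-way decomposition in which the share of a gene's variance attributed to platform is the between-platform sum of squares divided by the total sum of squares. The function names and the 0.2 threshold are hypothetical.

```python
import numpy as np


def platform_variance_fraction(expr, platforms):
    """Per-gene fraction of variance explained by platform.

    expr:      genes x samples array of (rescaled) expression values.
    platforms: one platform label per sample.
    Assumption: one-way decomposition, between-platform SS / total SS.
    """
    expr = np.asarray(expr, dtype=float)
    platforms = np.asarray(platforms)
    grand_mean = expr.mean(axis=1, keepdims=True)
    total_ss = ((expr - grand_mean) ** 2).sum(axis=1)

    between_ss = np.zeros(expr.shape[0])
    for label in np.unique(platforms):
        cols = expr[:, platforms == label]
        # Weight each platform's squared mean offset by its sample count.
        between_ss += cols.shape[1] * (cols.mean(axis=1) - grand_mean[:, 0]) ** 2

    # Genes with zero total variance are assigned a fraction of 0.
    return np.divide(between_ss, total_ss,
                     out=np.zeros_like(between_ss), where=total_ss > 0)


def select_low_bias_genes(expr, platforms, threshold=0.2):
    """Indices of genes whose platform variance fraction is <= threshold."""
    fraction = platform_variance_fraction(expr, platforms)
    return np.flatnonzero(fraction <= threshold)


# Toy data: gene 0 tracks platform exactly, gene 1 varies within platforms only.
toy_expr = [[1, 1, 5, 5],
            [1, 5, 1, 5]]
toy_platforms = ["a", "a", "b", "b"]
kept = select_low_bias_genes(toy_expr, toy_platforms)
```

In the toy data, the first gene is fully explained by platform and is dropped, while the second gene shows no platform effect and is kept.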

Publication

Integrative computational epigenomics to build data-driven gene regulation hypotheses

Publisher: Oxford University Press (OUP)

Date: 06-2020

DOI: 10.1093/GIGASCIENCE/GIAA064

Abstract: Diseases are complex phenotypes often arising as an emergent property of a non-linear network of genetic and epigenetic interactions. To translate this resulting state into a causal relationship with a subset of regulatory features, many experiments deploy an array of laboratory assays from multiple modalities. Often, each of these resulting datasets is large, heterogeneous, and noisy. Thus, it is non-trivial to unify these complex datasets into an interpretable phenotype. Although recent methods address this problem with varying degrees of success, they are constrained by their scopes or limitations. Therefore, an important gap in the field is the lack of a universal data harmonizer with the capability to arbitrarily integrate multi-modal datasets. In this review, we perform a critical analysis of methods with the explicit aim of harmonizing data, as opposed to case-specific integration. This revealed that matrix factorization, latent variable analysis, and deep learning are potent strategies. Finally, we describe the properties of an ideal universal data harmonization framework. A sufficiently advanced universal harmonizer has major medical implications: (i) identifying dysregulated biological pathways responsible for a disease is a powerful diagnostic tool; (ii) investigating these pathways further allows the biological community to better understand a disease’s mechanisms; and (iii) precision medicine also benefits from developments in this area, particularly in the context of the growing field of selective epigenome editing, which can suppress or induce a desired phenotype.

Publication

Stemformatics: Easy visualisation platform for well-curated stem cell data

Publisher: F1000 Research Limited

Date: 2018

DOI: 10.7490/F1000RESEARCH.1115766.1

Publication

Stemformatics – visualise and download curated stem cell data

Publisher: F1000 Research Limited

Date: 2018

DOI: 10.7490/F1000RESEARCH.1116311.1

Publication

Tyrone Chen: Data Fluency Digital Toolkit Poster - Minor Prize

Publisher: Monash University

Date: 2019

DOI: 10.26180/5DE98972A2661

Publication

Stemformatics: a visualisation platform for well-curated biological sequence data

Publisher: F1000Research

Date: 2016

DOI: 10.7490/F1000RESEARCH.1113160.1

Publication

Integrated Photo-supercapacitor Based on Bi-polar TiO₂ Nanotube Arrays with Selective One-Side Plasma-Assisted Hydrogenation

Publisher: Wiley

Date: 18-11-2013

DOI: 10.1002/ADFM.201303042

Publication

Multi-omics data harmonisation for the discovery of COVID-19 drug targets

Publisher: F1000 Research Limited

Date: 2020

DOI: 10.7490/F1000RESEARCH.1118362.1

Publication

Multi-omics data harmonisation for the discovery of COVID-19 drug targets

Publisher: Zenodo

Date: 2021

DOI: 10.5281/ZENODO.4562010

Publication

An annotation-free format for representing multimodal data features

Publisher: F1000 Research Limited

Date: 2021

DOI: 10.7490/F1000RESEARCH.1118642.1

Publication

genomicBERT and data-free deep-learning model evaluation

Publisher: Cold Spring Harbor Laboratory

Date: 06-2023

DOI: 10.1101/2023.05.31.542682

Abstract: The emerging field of Genome-NLP (Natural Language Processing) aims to analyse biological sequence data using machine learning (ML), offering significant advancements in data-driven diagnostics. Three key challenges exist in Genome-NLP. First, long biomolecular sequences require “tokenisation” into smaller subunits, which is non-trivial since many biological “words” remain unknown. Second, ML methods are highly nuanced, reducing interoperability and usability. Third, comparing models and reproducing results are difficult due to the large volume and poor quality of biological data. To tackle these challenges, we developed the first automated Genome-NLP workflow that integrates feature engineering and ML techniques. The workflow is designed to be species and sequence agnostic. In this workflow: (a) We introduce a new transformer-based model for genomes called genomicBERT, which empirically tokenises sequences while retaining biological context. This approach minimises manual preprocessing, reduces vocabulary sizes, and effectively handles out-of-vocabulary “words”. (b) We enable the comparison of ML model performance even in the absence of raw data. To facilitate widespread adoption and collaboration, we have made genomicBERT available as part of the publicly accessible conda package called genomeNLP. We have successfully demonstrated the application of genomeNLP on multiple case studies, showcasing its effectiveness in the field of Genome-NLP. We provide a comprehensive classification of genomic data tokenisation and representation approaches for ML applications along with their pros and cons. We infer k-mers directly from the data and handle out-of-vocabulary words. At the same time, we achieve a significantly reduced vocabulary size compared to the conventional k-mer approach, reducing the computational complexity drastically. Our method is agnostic to species or biomolecule type as it is data-driven.
We enable comparison of trained model performance without requiring original input data, metadata or hyperparameter settings. We present the first publicly available, high-level toolkit that infers the grammar of genomic data directly through artificial neural networks. Preprocessing, hyperparameter sweeps, cross validations, metrics and interactive visualisations are automated but can be adjusted by the user as needed.
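
For context, the conventional fixed-length k-mer approach that the abstract above compares against can be sketched as follows. This is only an illustration of that baseline, not of genomicBERT or the genomeNLP package: the function names, the `[UNK]` placeholder and the DNA alphabet are assumptions made for the example.

```python
from itertools import product


def kmer_tokenise(seq, k=3, stride=1):
    """Split a sequence into overlapping fixed-length k-mers (sliding window)."""
    return [seq[i:i + k] for i in range(0, len(seq) - k + 1, stride)]


def full_vocabulary(k, alphabet="ACGT"):
    """Every possible k-mer over the alphabet: len(alphabet) ** k entries."""
    return {"".join(letters) for letters in product(alphabet, repeat=k)}


def encode(tokens, vocab, unk="[UNK]"):
    """Replace tokens missing from the vocabulary with a placeholder."""
    return [tok if tok in vocab else unk for tok in tokens]


tokens = kmer_tokenise("ACGTAC", k=3)
vocab = full_vocabulary(3)
encoded = encode(kmer_tokenise("ACNGT", k=3), vocab)
```

The exhaustive vocabulary grows as 4^k for DNA (64 entries at k = 3, 4096 at k = 6), and any window containing an unexpected symbol such as `N` falls outside it, which is the kind of cost a data-driven tokeniser aims to reduce.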

Publication

Bioinformatics training with a little help from my friends

Publisher: Zenodo

Date: 2022

DOI: 10.5281/ZENODO.7232796

Publication

Stemformatics: visualize and download curated stem cell data.

Publisher: Oxford University Press (OUP)

Date: 08-11-2019

DOI: 10.1093/NAR/GKY1064

Publication

multiomics: A user-friendly multi-omics data harmonisation R pipeline

Publisher: F1000 Research Ltd

Date: 06-07-2021

DOI: 10.12688/F1000RESEARCH.53453.1

Abstract: Data from multiple omics layers of a biological system is growing in quantity, heterogeneity and dimensionality. Simultaneous multi-omics data integration is a growing field of research as it has strong potential to unlock information on previously hidden biological relationships leading to early diagnosis, prognosis and expedited treatments. Many tools for multi-omics data integration are being developed. However, these tools are often restricted to highly specific experimental designs, and types of omics data. While some general methods do exist, they require specific data formats and experimental conditions. A major limitation in the field is a lack of a single or multi-omics pipeline which can accept data in an unrefined, information-rich form pre-integration and subsequently generate output for further investigation. There is an increasing demand for a generic multi-omics pipeline to facilitate general-purpose data exploration and analysis of heterogeneous data. Therefore, we present our R multiomics pipeline as an easy to use and flexible pipeline that takes unrefined multi-omics data as input, sample information and user-specified parameters to generate a list of output plots and data tables for quality control and downstream analysis. We have demonstrated application of the pipeline on two separate COVID-19 case studies. We enabled limited checkpointing where intermediate output is staged to allow continuation after errors or interruptions in the pipeline and generate a script for reproducing the analysis to improve reproducibility. A seamless integration with the mixOmics R package is achieved, as the R data object can be loaded and manipulated with mixOmics functions. Our pipeline can be installed as an R package or from the git repository, and is accompanied by detailed documentation with walkthroughs on two case studies. The pipeline is also available as Docker and Singularity containers.

Publication

multiomics: A user-friendly multi-omics data harmonisation R pipeline

Publisher: F1000 Research Ltd

Date: 02-08-2023

DOI: 10.12688/F1000RESEARCH.53453.2

Abstract: Data from multiple omics layers of a biological system is growing in quantity, heterogeneity and dimensionality. Simultaneous multi-omics data integration is of immense interest to researchers as it has potential to unlock previously hidden biomolecular relationships leading to early diagnosis, prognosis, and expedited treatments. Many tools for multi-omics data integration are developed. However, these tools are often restricted to highly specific experimental designs, types of omics data, and specific data formats. A major limitation of the field is the lack of a pipeline that can accept data in unrefined form to preserve maximum biology in an individual dataset prior to integration. We fill this gap by developing a flexible, generic multi-omics pipeline called multiomics, to facilitate general-purpose data exploration and analysis of heterogeneous data. The pipeline takes unrefined multi-omics data as input, sample information and user-specified parameters to generate a list of output plots and data tables for quality control and downstream analysis. We have demonstrated its application on a sepsis case study. We enabled limited checkpointing functionality where intermediate output is staged to allow continuation after errors or interruptions in the pipeline and generate a script for reproducing the analysis to improve reproducibility. Our pipeline can be installed as an R package or manually from the git repository, and is accompanied by detailed documentation with walkthroughs on three case studies.

Publication

Multipotent RAG1+ progenitors emerge directly from haemogenic endothelium in human pluripotent stem cell-derived haematopoietic organoids.

Publisher: Springer Science and Business Media LLC

Date: 2020

DOI: 10.1038/S41556-019-0445-8

Abstract: Defining the ontogeny of the human adaptive immune system during embryogenesis has implications for understanding childhood diseases including leukaemias and autoimmune conditions. Using RAG1:GFP human pluripotent stem cell reporter lines, we examined human T-cell genesis from pluripotent-stem-cell-derived haematopoietic organoids. Under conditions favouring T-cell development, RAG1+ cells progressively upregulated a cohort of recognized T-cell-associated genes, arresting development at the CD4+CD8+ stage. Sort and re-culture experiments showed that early RAG1+ cells also possessed B-cell, myeloid and erythroid potential. Flow cytometry and single-cell-RNA-sequencing data showed that early RAG1+ cells co-expressed the endothelial/haematopoietic progenitor markers CD34, VECAD and CD90, whereas imaging studies identified RAG1+ cells within CD31+ endothelial structures that co-expressed SOX17+ or the endothelial marker CAV1. Collectively, these observations provide evidence for a wave of human T-cell development that originates directly from haemogenic endothelium via a RAG1+ intermediate with multilineage potential.

Publication

Mesenchymal Stromal Cells are Readily Recoverable from Lung Tissue, but not the Alveolar Space, in Healthy Humans

Publisher: Oxford University Press (OUP)

Date: 04-07-2016

DOI: 10.1002/STEM.2419

Abstract: Stromal support is critical for lung homeostasis and the maintenance of an effective epithelial barrier. Despite this, previous studies have found a positive association between the number of mesenchymal stromal cells (MSCs) isolated from the alveolar compartment and human lung diseases associated with epithelial dysfunction. We hypothesised that bronchoalveolar lavage derived MSCs (BAL-MSCs) are dysfunctional and distinct from resident lung tissue MSCs (LT-MSCs). In this study, we comprehensively interrogated the phenotype and transcriptome of human BAL-MSCs and LT-MSCs. We found that MSCs were rarely recoverable from the alveolar space in healthy humans, but could be readily isolated from lung transplant recipients by bronchoalveolar lavage. BAL-MSCs exhibited a CD90Hi, CD73Hi, CD45Neg, CD105Lo immunophenotype and were bipotent, lacking adipogenic potential. In contrast, MSCs were readily recoverable from healthy human lung tissue and were CD90Hi or Lo, CD73Hi, CD45Neg, CD105Int and had full tri-lineage potential. Transcriptional profiling of the two populations confirmed their status as bona fide MSCs and revealed a high degree of similarity between each other and the archetypal bone-marrow MSC. 105 genes were differentially expressed, 76 of which were increased in BAL-MSCs, including genes involved in fibroblast activation, extracellular matrix deposition and tissue remodelling. Finally, we found the fibroblast markers collagen 1A1 and α-smooth muscle actin were increased in BAL-MSCs. Our data suggests that in healthy humans, lung MSCs reside within the tissue, but in disease can differentiate to acquire a profibrotic phenotype and migrate from their in-tissue niche into the alveolar space.

Publication

A simple, scalable approach to building a cross-platform transcriptome atlas

Publisher: Public Library of Science (PLoS)

Date: 28-09-2020

DOI: 10.1371/JOURNAL.PCBI.1008219

Publication

Abstracts

Publisher: Wiley

Date: 04-2016

DOI: 10.1111/BJH.14019

Publication

Deep learning has potential for harmonising multi-omics data to discover weak regulatory features

Publisher: F1000 Research Limited

Date: 2019

DOI: 10.7490/F1000RESEARCH.1117749.1

Publication

Mapping the Blood Cell Landscape in Stemformatics

Publisher: F1000 Research Limited

Date: 2019

DOI: 10.7490/F1000RESEARCH.1116449.1

Publication

Multi-omics data integration for the discovery of COVID-19 drug targets

Publisher: F1000 Research Limited

Date: 2020

DOI: 10.7490/F1000RESEARCH.1118023.1

Publication

Mapping the blood cell landscape in Stemformatics

Publisher: F1000 Research Limited

Date: 2018

DOI: 10.7490/F1000RESEARCH.1116323.1

Publication

Integrative omics identifies conserved and pathogen-specific responses of sepsis-causing bacteria

Publisher: Springer Science and Business Media LLC

Date: 18-03-2023

DOI: 10.1038/S41467-023-37200-W

Abstract: Even in the setting of optimal resuscitation in high-income countries severe sepsis and septic shock have a mortality of 20–40%, with antibiotic resistance dramatically increasing this mortality risk. To develop a reference dataset enabling the identification of common bacterial targets for therapeutic intervention, we applied a standardized genomic, transcriptomic, proteomic and metabolomic technological framework to multiple clinical isolates of four sepsis-causing pathogens: Escherichia coli, Klebsiella pneumoniae species complex, Staphylococcus aureus and Streptococcus pyogenes. Exposure to human serum generated a sepsis molecular signature containing global increases in fatty acid and lipid biosynthesis and metabolism, consistent with cell envelope remodelling and nutrient adaptation for osmoprotection. In addition, acquisition of cholesterol was identified across the bacterial species. This detailed reference dataset has been established as an open resource to support discovery and translational research.

Publication

Author Reply to Peer Reviews of Single cell analysis of lymphatic endothelial cell fate specification and differentiation during zebrafish development

Publisher: EMBO

Date: 13-09-2022

DOI: 10.15252/RC.2022603105

Publication

Performance enhancement of thin-film amorphous silicon solar cells with low cost nanodent plasmonic substrates

Publisher: Royal Society of Chemistry (RSC)

Date: 2013

DOI: 10.1039/C3EE41139G

Publication

Single-cell analysis of lymphatic endothelial cell fate specification and differentiation during zebrafish development

Publisher: EMBO

Date: 13-03-2023

DOI: 10.15252/EMBJ.2022112590

Abstract: During development, the lymphatic vasculature forms as a second network derived chiefly from blood vessels. The transdifferentiation of embryonic venous endothelial cells (VECs) into lymphatic endothelial cells (LECs) is a key step in this process. Specification, differentiation and maintenance of LEC fate are all driven by the transcription factor Prox1, yet the downstream mechanisms remain to be elucidated. We here present a single‐cell transcriptomic atlas of lymphangiogenesis in zebrafish, revealing new markers and hallmarks of LEC differentiation over four developmental stages. We further profile single‐cell transcriptomic and chromatin accessibility changes in zygotic prox1a mutants that are undergoing a LEC‐VEC fate shift. Using maternal and zygotic prox1a/prox1b mutants, we determine the earliest transcriptomic changes directed by Prox1 during LEC specification. This work altogether reveals new downstream targets and regulatory regions of the genome controlled by Prox1 and presents evidence that Prox1 specifies LEC fate primarily by limiting blood vascular and haematopoietic fate. This extensive single‐cell resource provides new mechanistic insights into the enigmatic role of Prox1 and the control of LEC differentiation in development.

Publication

Using intermolecular interactions to crosslink PIM-1 and modify its gas sorption properties

Publisher: Royal Society of Chemistry (RSC)

Date: 2015

DOI: 10.1039/C4TA06070A

Abstract: The attractive intermolecular interactions between PIM-1 and polycyclic aromatic hydrocarbons were used to produce films with higher CO₂/N₂ gas sorption selectivity and reduced ageing of permeability.

Publication

A survey of current resources to study lncRNA-protein interactions

Publisher: MDPI AG

Date: 08-06-2021

DOI: 10.3390/NCRNA7020033

Abstract: Phenotypes are driven by regulated gene expression, which in turn are mediated by complex interactions between diverse biological molecules. Protein–DNA interactions such as histone and transcription factor binding are well studied, along with RNA–RNA interactions in short RNA silencing of genes. In contrast, lncRNA-protein interaction (LPI) mechanisms are comparatively unknown, likely directed by the difficulties in studying LPI. However, LPI are emerging as key interactions in epigenetic mechanisms, playing a role in development and disease. Their importance is further highlighted by their conservation across kingdoms. Hence, interest in LPI research is increasing. We therefore review the current state of the art in lncRNA-protein interactions. We specifically surveyed recent computational methods and databases which researchers can exploit for LPI investigation. We discovered that algorithm development is heavily reliant on a few generic databases containing curated LPI information. Additionally, these databases house information at gene-level as opposed to transcript-level annotations. We show that early methods predict LPI using molecular docking, have limited scope and are slow, creating a data processing bottleneck. Recently, machine learning has become the strategy of choice in LPI prediction, likely due to the rapid growth in machine learning infrastructure and expertise. While many of these methods have notable limitations, machine learning is expected to be the basis of modern LPI prediction algorithms.

Publication

A molecular classification of human mesenchymal stromal cells

Publisher: PeerJ

Date: 24-03-2016

DOI: 10.7717/PEERJ.1845

Abstract: Mesenchymal stromal cells (MSC) are widely used for the study of mesenchymal tissue repair, and increasingly adopted for cell therapy, despite the lack of consensus on the identity of these cells. In part this is due to the lack of specificity of MSC markers. Distinguishing MSC from other stromal cells such as fibroblasts is particularly difficult using standard analysis of surface proteins, and there is an urgent need for improved classification approaches. Transcriptome profiling is commonly used to describe and compare different cell types; however, efforts to identify specific markers of rare cellular subsets may be confounded by the small sample sizes of most studies. Consequently, it is difficult to derive reproducible, and therefore useful markers. We addressed the question of MSC classification with a large integrative analysis of many public MSC datasets. We derived a sparse classifier (The Rohart MSC test) that accurately distinguished MSC from non-MSC samples with % accuracy on an internal training set of 635 samples from 41 studies derived on 10 different microarray platforms. The classifier was validated on an external test set of 1,291 samples from 65 studies derived on 15 different platforms, with % accuracy. The genes that contribute to the MSC classifier formed a protein-interaction network that included known MSC markers. Further evidence of the relevance of this new MSC panel came from the high number of Mendelian disorders associated with mutations in more than 65% of the network. These result in mesenchymal defects, particularly impacting on skeletal growth and function. The Rohart MSC test is a simple in silico test that accurately discriminates MSC from fibroblasts, other adult stem/progenitor cell types or differentiated stromal cells. It has been implemented in the www.stemformatics.org resource, to assist researchers wishing to benchmark their own MSC datasets or data from the public domain.
The code is available from the CRAN repository and all data used to generate the MSC test is available to download via the Gene Expression Omnibus or the Stemformatics resource.

Related Organisations

Organisation

Beijing Institute Of Technology

Location: China

Organisation

Anhui Normal University

Location: China

Organisation

Huazhong University Of Science And Technology

Location: China

Organisation

Hanyang University

Location: Korea, Republic of

Organisation

National Institute For Materials Science

Location: Japan

Organisation

Australian National University

Location: Australia

Organisation

Monash University

Location: Australia

Organisation

University Of Queensland

Location: Australia

Organisation

Institute Of Semiconductors, Chinese Academy Of Sciences

Location: China

Organisation

University Of Southern California

Location: United States of America

Organisation

University Of Science And Technology Of China

Location: China

Organisation

Peter MacCallum Cancer Centre

Location: Australia

Organisation

University Of Melbourne

Location: Australia

Organisation

Alfred Health

Location: Australia

Related Funding Activities

No related grants have been discovered for Tyrone Chen.

Tyrone Chen

Researcher
