ARDC Research Link Australia

Publication

A Pseudo-Temporal Causality Approach to Identifying miRNA-mRNA Interactions During Biological Processes

Publisher: Cold Spring Harbor Laboratory

Date: 08-07-2020

DOI: 10.1101/2020.07.07.192724

Abstract: microRNAs (miRNAs) are important gene regulators and they are involved in many biological processes, including cancer progression. Therefore, correctly identifying miRNA-mRNA interactions is a crucial task. To this end, a huge number of computational methods has been developed, but they mainly use the data at one snapshot and ignore the dynamics of a biological process. The recent development of single cell data and the booming of the exploration of cell trajectories using “pseudo-time” concept have inspired us to develop a pseudo-time based method to infer the miRNA-mRNA relationships characterising a biological process by taking into account the temporal aspect of the process. We have developed a novel approach, called pseudo-time causality (PTC), to find the causal relationships between miRNAs and mRNAs during a biological process. We have applied the proposed method to both single cell and bulk sequencing datasets for Epithelia to Mesenchymal Transition (EMT), a key process in cancer metastasis. The evaluation results show that our method significantly outperforms existing methods in finding miRNA-mRNA interactions in both single cell and bulk data. The results suggest that utilising the pseudo-temporal information from the data helps reveal the gene regulation in a biological process much better than using the static information. R scripts and datasets can be found at github.com/AndresMCB/PTC

Publication

Stable breast cancer prognosis

Publisher: Cold Spring Harbor Laboratory

Date: 15-09-2021

DOI: 10.1101/2021.09.13.460002

Abstract: Predicting breast cancer prognosis helps improve the treatment and management of the disease. In the last decades, many prediction models have been developed for breast cancer prognosis based on transcriptomic data. A common assumption made by these models is that the test and training data follow the same distribution. However, in practice, due to the heterogeneity of breast cancer and the different environments (e.g. hospitals) where data are collected, the distribution of the test data may shift from that of the training data. For example, new patients likely have different breast cancer stage distribution from those in the training dataset. Thus these existing methods may not provide stable prediction performance for breast cancer prognosis in situations with the shift of data distribution. In this paper, we present a novel stable prediction method for reliable breast cancer prognosis under data distribution shift. Our model, known as Deep Global Balancing Cox regression (DGBCox), is based on the causal inference theory. In DGBCox, firstly high-dimensional gene expression data is transferred to latent network-based representations by a deep auto-encoder neural network. Then after balancing the latent representations using a proposed causality-based approach, causal latent features are selected for breast cancer prognosis. Causal features have persistent relationships with survival outcomes even under distribution shift across different environments according to the causal inference theory. Therefore, the proposed DGBCox method is robust and stable for breast cancer prognosis. We apply DGBCox to 12 test datasets from different breast cancer studies. The results show that DGBCox outperforms benchmark methods in terms of both prediction accuracy and stability. We also propose a permutation importance algorithm to rank the genes in the DGBCox model. 
The top 50 ranked genes suggest that the cell cycle and the organelle organisation could be the most relevant biological processes for stable breast cancer prognosis. Various prediction models have been proposed for breast cancer prognosis. The prediction models usually train on a dataset and predict the survival outcomes of patients in new test datasets. The majority of these models share a common assumption that the test and training data follow the same distribution. However, as breast cancer is a heterogeneous disease, the assumption may be violated in practice. In this study, we propose a novel method for reliable breast cancer prognosis when the test data distribution shifts from that of the training data. The proposed model has been trained on one dataset and applied to twelve test datasets from different breast cancer studies. In comparison with the benchmark methods in breast cancer prognosis, our model shows better prediction accuracy and stability. The top 50 important genes in our model provide clues to the relationship between several biological mechanisms and clinical outcomes of breast cancer. Our proposed method in breast cancer can potentially be adapted to apply to other cancer types.

Publication

A novel single-cell based method for breast cancer prognosis

Publisher: Cold Spring Harbor Laboratory

Date: 28-04-2020

DOI: 10.1101/2020.04.26.062794

Abstract: Breast cancer prognosis is challenging due to the heterogeneity of the disease. Various computational methods using bulk RNA-seq data have been proposed for breast cancer prognosis. However, these methods suffer from limited performances or ambiguous biological relevance, as a result of the neglect of intra-tumor heterogeneity. Recently, single cell RNA-sequencing (scRNA-seq) has emerged for studying tumor heterogeneity at cellular levels. In this paper, we propose a novel method, scPrognosis, to improve breast cancer prognosis with scRNA-seq data. scPrognosis uses the scRNA-seq data of the biological process Epithelial-to-Mesenchymal Transition (EMT). It firstly infers the EMT pseudotime and a dynamic gene co-expression network, then uses an integrative model to select genes important in EMT based on their expression variation and differentiation in different stages of EMT, and their roles in the dynamic gene co-expression network. To validate and apply the selected signatures to breast cancer prognosis, we use them as the features to build a prediction model with bulk RNA-seq data. The experimental results show that scPrognosis outperforms other benchmark breast cancer prognosis methods that use bulk RNA-seq data. Moreover, the dynamic changes in the expression of the selected signature genes in EMT may provide clues to the link between EMT and clinical outcomes of breast cancer. scPrognosis will also be useful when applied to scRNA-seq datasets of different biological processes other than EMT. Various computational methods have been developed for breast cancer prognosis. However, those methods mainly use the gene expression data generated by the bulk RNA sequencing techniques, which average the expression level of a gene across different cell types. As breast cancer is a heterogenous disease, the bulk gene expression may not be the ideal resource for cancer prognosis. 
In this study, we propose a novel method to improve breast cancer prognosis using scRNA-seq data. The proposed method has been applied to the EMT scRNA-seq dataset for identifying breast cancer signatures for prognosis. In comparison with existing bulk expression data based methods in breast cancer prognosis, our method shows a better performance. Our single-cell-based signatures provide clues to the relation between EMT and clinical outcomes of breast cancer. In addition, the proposed method can also be useful when applied to scRNA-seq datasets of different biological processes other than EMT.

Publication

Stabilising Job Survival Analysis for Disability Employment Services in Unseen Environments

Publisher: ACM

Date: 04-08-2023

DOI: 10.1145/3580305.3599908

Publication

A pseudotemporal causality approach to identifying miRNA–mRNA interactions during biological processes

Publisher: Oxford University Press (OUP)

Date: 18-10-2021

DOI: 10.1093/BIOINFORMATICS/BTAA899

Abstract: microRNAs (miRNAs) are important gene regulators and they are involved in many biological processes, including cancer progression. Therefore, correctly identifying miRNA–mRNA interactions is a crucial task. To this end, a huge number of computational methods has been developed, but they mainly use the data at one snapshot and ignore the dynamics of a biological process. The recent development of single cell data and the booming of the exploration of cell trajectories using ‘pseudotime’ concept have inspired us to develop a pseudotime-based method to infer the miRNA–mRNA relationships characterizing a biological process by taking into account the temporal aspect of the process. We have developed a novel approach, called pseudotime causality, to find the causal relationships between miRNAs and mRNAs during a biological process. We have applied the proposed method to both single cell and bulk sequencing datasets for Epithelia to Mesenchymal Transition, a key process in cancer metastasis. The evaluation results show that our method significantly outperforms existing methods in finding miRNA–mRNA interactions in both single cell and bulk data. The results suggest that utilizing the pseudotemporal information from the data helps reveal the gene regulation in a biological process much better than using the static information. R scripts and datasets can be found at github.com/AndresMCB/PTC. Supplementary data are available at Bioinformatics online.

Publication

Identifying preeclampsia-associated genes using a control theory method

Publisher: Oxford University Press (OUP)

Date: 28-04-2022

DOI: 10.1093/BFGP/ELAC006

Abstract: Preeclampsia is a pregnancy-specific disease that can have serious effects on the health of both mothers and their offspring. Predicting which women will develop preeclampsia in early pregnancy with high accuracy will allow for improved management. The clinical symptoms of preeclampsia are well recognized, however, the precise molecular mechanisms leading to the disorder are poorly understood. This is compounded by the heterogeneous nature of preeclampsia onset, timing and severity. Indeed a multitude of poorly defined causes including genetic components implicates etiologic factors, such as immune maladaptation, placental ischemia and increased oxidative stress. Large datasets generated by microarray and next-generation sequencing have enabled the comprehensive study of preeclampsia at the molecular level. However, computational approaches to simultaneously analyze the preeclampsia transcriptomic and network data and identify clinically relevant information are currently limited. In this paper, we proposed a control theory method to identify potential preeclampsia-associated genes based on both transcriptomic and network data. First, we built a preeclampsia gene regulatory network and analyzed its controllability. We then defined two types of critical preeclampsia-associated genes that play important roles in the constructed preeclampsia-specific network. Benchmarking against differential expression, betweenness centrality and hub analysis we demonstrated that the proposed method may offer novel insights compared with other standard approaches. Next, we investigated subtype specific genes for early and late onset preeclampsia. This control theory approach could contribute to a further understanding of the molecular mechanisms contributing to preeclampsia.

Publication

miRspongeR 2.0: an enhanced R package for exploring miRNA sponge regulation

Publisher: Oxford University Press (OUP)

Date: 2022

DOI: 10.1093/BIOADV/VBAC063

Abstract: MicroRNA (miRNA) sponges influence the capability of miRNA-mediated gene silencing by competing for shared miRNA response elements and play significant roles in many physiological and pathological processes. It has been proved that computational or dry-lab approaches are useful to guide wet-lab experiments for uncovering miRNA sponge regulation. However, all of the existing tools only allow the analysis of miRNA sponge regulation regarding a group of samples, rather than the miRNA sponge regulation unique to individual samples. Furthermore, most existing tools do not allow parallel computing for the fast identification of miRNA sponge regulation. Here, we present an enhanced version of our R/Bioconductor package, miRspongeR 2.0. Compared with the original version introduced in 2019, this package extends the resolution of miRNA sponge regulation from the multi-sample level to the single-sample level. Moreover, it supports the identification of miRNA sponge networks using parallel computing, and the construction of sample–sample correlation networks. It also provides more computational methods to infer miRNA sponge regulation and expands the ground truth for validation. With these new features, we anticipate that miRspongeR 2.0 will further accelerate the research on miRNA sponges with higher resolution and more utilities. ackages/miRspongeR/. Supplementary data are available at Bioinformatics Advances online.

Publication

Multi-Group Transfer Learning on Multiple Latent Spaces for Text Classification

Publisher: Institute of Electrical and Electronics Engineers (IEEE)

Date: 2020

DOI: 10.1109/ACCESS.2020.2984571

Publication

Gene selection for optimal prediction of cell position in tissues from single-cell transcriptomics data

Publisher: Life Science Alliance, LLC

Date: 24-09-2020

DOI: 10.26508/LSA.202000867

Abstract: Single-cell RNA-sequencing (scRNAseq) technologies are rapidly evolving. Although very informative, in standard scRNAseq experiments, the spatial organization of the cells in the tissue of origin is lost. Conversely, spatial RNA-seq technologies designed to maintain cell localization have limited throughput and gene coverage. Mapping scRNAseq to genes with spatial information increases coverage while providing spatial location. However, methods to perform such mapping have not yet been benchmarked. To fill this gap, we organized the DREAM Single-Cell Transcriptomics challenge focused on the spatial reconstruction of cells from the Drosophila embryo from scRNAseq data, leveraging as silver standard, genes with in situ hybridization data from the Berkeley Drosophila Transcription Network Project reference atlas. The 34 participating teams used diverse algorithms for gene selection and location prediction, while being able to correctly localize clusters of cells. Selection of predictor genes was essential for this task. Predictor genes showed a relatively high expression entropy, high spatial clustering and included prominent developmental genes such as gap and pair-rule genes and tissue markers. Application of the top 10 methods to a zebra fish embryo dataset yielded similar performance and statistical properties of the selected genes than in the Drosophila data. This suggests that methods developed in this challenge are able to extract generalizable properties of genes that are useful to accurately reconstruct the spatial arrangement of cells in tissues.

Publication

PAN: Personalized Annotation-Based Networks for the Prediction of Breast Cancer Relapse

Publisher: Institute of Electrical and Electronics Engineers (IEEE)

Date: 11-2021

DOI: 10.1109/TCBB.2021.3076422

Publication

A novel single-cell based method for breast cancer prognosis

Publisher: Public Library of Science (PLoS)

Date: 24-08-2020

DOI: 10.1371/JOURNAL.PCBI.1008133

Publication

The winning methods for predicting cellular position in the DREAM single-cell transcriptomics challenge

Publisher: Oxford University Press (OUP)

Date: 25-08-2021

DOI: 10.1093/BIB/BBAA181

Abstract: Predicting cell locations is important since with the understanding of cell locations, we may estimate the function of cells and their integration with the spatial environment. Thus, the DREAM challenge on single-cell transcriptomics required participants to predict the locations of single cells in the Drosophila embryo using single-cell transcriptomic data. We have developed over 50 pipelines by combining different ways of preprocessing the RNA-seq data, selecting the genes, predicting the cell locations and validating predicted cell locations, resulting in the winning methods which were ranked second in sub-challenge 1, first in sub-challenge 2 and third in sub-challenge 3. In this paper, we present an R package, SCTCwhatateam, which includes all the methods we developed and the Shiny web application to facilitate the research on single-cell spatial reconstruction. All the data and the example use cases are available in the Supplementary data.

Publication

Uncovering the roles of microRNAs/lncRNAs in characterising breast cancer subtypes and prognosis

Publisher: Springer Science and Business Media LLC

Date: 04-06-2021

DOI: 10.1186/S12859-021-04215-3

Abstract: Accurate prognosis and identification of cancer subtypes at molecular level are important steps towards effective and personalised treatments of breast cancer. To this end, many computational methods have been developed to use gene (mRNA) expression data for breast cancer subtyping and prognosis. Meanwhile, microRNAs (miRNAs) and long non-coding RNAs (lncRNAs) have been extensively studied in the last 2 decades and their associations with breast cancer subtypes and prognosis have been evidenced. However, it is not clear whether using miRNA and/or lncRNA expression data helps improve the performance of gene expression based subtyping and prognosis methods, and this raises challenges as to how and when to use these data and methods in practice. In this paper, we conduct a comparative study of 35 methods, including 12 breast cancer subtyping methods and 23 breast cancer prognosis methods, on a collection of 19 independent breast cancer datasets. We aim to uncover the roles of miRNAs and lncRNAs in breast cancer subtyping and prognosis from the systematic comparison. In addition, we created an R package, CancerSubtypesPrognosis, including all the 35 methods to facilitate the reproducibility of the methods and streamline the evaluation. The experimental results show that integrating miRNA expression data helps improve the performance of the mRNA-based cancer subtyping methods. However, miRNA signatures are not as good as mRNA signatures for breast cancer prognosis. In general, lncRNA expression data does not help improve the mRNA-based methods in both cancer subtyping and cancer prognosis. These results suggest that the prognostic roles of miRNA/lncRNA signatures in the improvement of breast cancer prognosis needs to be further verified.

Publication

Role of adipose tissue in the pathogenesis of cardiac arrhythmias

Publisher: Elsevier BV

Date: 2016

DOI: 10.1016/J.HRTHM.2015.08.016

Abstract: Epicardial adipose tissue is present in normal healthy individuals. It is a unique fat depot that, under physiologic conditions, plays a cardioprotective role. However, excess epicardial adipose tissue has been shown to be associated with prevalence and severity of atrial fibrillation. In arrhythmogenic right ventricular cardiomyopathy and myotonic dystrophy, fibrofatty infiltration of the myocardium is associated with ventricular arrhythmias. In the ovine model of ischemic cardiomyopathy, the presence of intramyocardial adipose or lipomatous metaplasia has been associated with increased propensity to ventricular tachycardia. These observations suggest a role of adipose tissue in the pathogenesis of cardiac arrhythmias. In this article, we review the role of cardiac adipose tissue in various cardiac arrhythmias and discuss the possible pathophysiologic mechanisms.

Xiaomei Li

Researcher

Publications

A Pseudo-Temporal Causality Approach to Identifying miRNA-mRNA Interactions During Biological Processes

Stable breast cancer prognosis

A novel single-cell based method for breast cancer prognosis

Stabilising Job Survival Analysis for Disability Employment Services in Unseen Environments

A pseudotemporal causality approach to identifying miRNA–mRNA interactions during biological processes

Identifying preeclampsia-associated genes using a control theory method

miRspongeR 2.0: an enhanced R package for exploring miRNA sponge regulation

Multi-Group Transfer Learning on Multiple Latent Spaces for Text Classification

Gene selection for optimal prediction of cell position in tissues from single-cell transcriptomics data

PAN: Personalized Annotation-Based Networks for the Prediction of Breast Cancer Relapse

A novel single-cell based method for breast cancer prognosis

The winning methods for predicting cellular position in the DREAM single-cell transcriptomics challenge

Uncovering the roles of microRNAs/lncRNAs in characterising breast cancer subtypes and prognosis

Role of adipose tissue in the pathogenesis of cardiac arrhythmias

Related Organisations

University Of South Australia

Commonwealth Scientific And Industrial Research Organisation

CSIRO

Related Funding Activities
