ORCID Profile
0000-0002-9023-1878
Current Organisations
Griffith University, University of South Australia, Yunnan University
In Research Link Australia (RLA), "Research Topics" refer to ANZSRC FOR and SEO codes. These topics are either sourced from ANZSRC FOR and SEO codes listed in researchers' related grants or generated by a large language model (LLM) based on their publications.
Artificial Intelligence and Image Processing | Pattern Recognition and Data Mining | Information Storage, Retrieval And Management | Artificial Intelligence and Image Processing not elsewhere classified | Bioinformatics Software | Expert Systems | Machine learning not elsewhere classified | Library and Information Studies | Data mining and knowledge discovery | Data Security | Data management and data science
Computer Software and Services not elsewhere classified | Information processing services | Information and Communication Services not elsewhere classified | Application packages | Computer software and services not elsewhere classified | Information Services not elsewhere classified
Publisher: Elsevier BV
Date: 05-2019
Publisher: Springer Berlin Heidelberg
Date: 2012
Publisher: ACM
Date: 14-08-2022
Publisher: Springer Science and Business Media LLC
Date: 12-2019
DOI: 10.1186/S12859-019-3215-5
Abstract: Studying multiple microRNAs (miRNAs) synergism in gene regulation could help to understand the regulatory mechanisms of complicated human diseases caused by miRNAs. Several existing methods have been presented to infer miRNA synergism. Most of the current methods assume that miRNAs with shared targets at the sequence level are working synergistically. However, it is unclear if miRNAs with shared targets are working in concert to regulate the targets or they individually regulate the targets at different time points or different biological processes. A standard method to test the synergistic activities is to knock-down multiple miRNAs at the same time and measure the changes in the target genes. However, this approach may not be practical as we would have too many sets of miRNAs to test. In this paper, we present a novel framework called miRsyn for inferring miRNA synergism by using a causal inference method that mimics the multiple-intervention experiments, e.g. knocking-down multiple miRNAs, with observational data. Our results show that several miRNA-miRNA pairs that have shared targets at the sequence level are not working synergistically at the expression level. Moreover, the identified miRNA synergistic network is small-world and biologically meaningful, and a number of miRNA synergistic modules are significantly enriched in breast cancer. Our further analyses also reveal that most of the synergistic miRNA-miRNA pairs show the same expression patterns. The comparison results indicate that the proposed multiple-intervention causal inference method performs better than the single-intervention causal inference method in identifying the miRNA synergistic network. Taken together, the results imply that miRsyn is a promising framework for identifying miRNA synergism, and it could enhance the understanding of miRNA synergism in breast cancer.
Publisher: International Joint Conferences on Artificial Intelligence Organization
Date: 07-2022
Abstract: Unobserved confounding is the main obstacle to causal effect estimation from observational data. Instrumental variables (IVs) are widely used for causal effect estimation when there exist latent confounders. With the standard IV method, when a given IV is valid, unbiased estimation can be obtained, but the validity requirement on a standard IV is strict and untestable. Conditional IVs have been proposed to relax the requirement of standard IVs by conditioning on a set of observed variables (known as a conditioning set for a conditional IV). However, the criterion for finding a conditioning set for a conditional IV needs a directed acyclic graph (DAG) representing the causal relationships of both observed and unobserved variables. This makes it challenging to discover a conditioning set directly from data. In this paper, by leveraging maximal ancestral graphs (MAGs) for causal inference with latent variables, we study the graphical properties of ancestral IVs, a type of conditional IVs using MAGs, and develop the theory to support data-driven discovery of the conditioning set for a given ancestral IV in data under the pretreatment variable assumption. Based on the theory, we develop an algorithm for unbiased causal effect estimation with a given ancestral IV and observational data. Extensive experiments on synthetic and real-world datasets demonstrate the performance of the algorithm in comparison with existing IV methods.
Publisher: Institute of Electrical and Electronics Engineers (IEEE)
Date: 12-2018
Publisher: Public Library of Science (PLoS)
Date: 30-12-2015
Publisher: Springer Science and Business Media LLC
Date: 10-10-2017
Publisher: American Astronomical Society
Date: 05-06-2017
Publisher: Springer Science and Business Media LLC
Date: 08-2018
Publisher: Elsevier BV
Date: 12-2018
DOI: 10.1016/J.IJMEDINF.2018.09.002
Abstract: Adverse drug events (ADEs) are among the top causes of hospitalization and death. Social media is a promising open data source for the timely detection of potential ADEs. In this paper, we study the problem of detecting signals of ADEs from social media. Detecting ADEs whose drug and AE may be reported in different posts of a user leads to major concerns regarding the content authenticity and user credibility, which have not been addressed in previous studies. Content authenticity concerns whether a post mentions drugs or adverse events that are actually consumed or experienced by the writer. User credibility indicates the degree to which chronological evidence from a user's sequence of posts should be trusted in the ADE detection. We propose AC-SPASM, a Bayesian model for the authenticity and credibility aware detection of ADEs from social media. The model exploits the interaction between content authenticity, user credibility and ADE signal quality. In particular, we argue that the credibility of a user correlates with the user's consistency in reporting authentic content. We conduct experiments on a real-world Twitter dataset containing 1.2 million posts from 13,178 users. Our benchmark set contains 22 drugs and 8089 AEs. AC-SPASM recognizes authentic posts with F Our study demonstrates that taking into account the content authenticity and user credibility improves the detection of ADEs from social media. Our work generates hypotheses to reduce experts' guesswork in identifying unknown potential ADEs.
Publisher: Hogrefe Publishing Group
Date: 04-2016
DOI: 10.1027/1015-5759/A000242
Abstract: Studies on the construct validity of the Self-Description Questionnaire II (SDQII) have not compared the factor structure between the English and Chinese versions of the SDQII. By using rigorous multiple group comparison procedures based upon confirmatory factor analysis (CFA) of measurement invariance, the present study examined the responses of Australian high school students (N = 302) and Chinese high school students (N = 322) using the English and Chinese versions of the SDQII, respectively. CFA provided strong evidence that the factor structure (factor loading and item intercept) of the Chinese version of the SDQII in comparison to responses to the English version of the SDQII is invariant; it therefore allows researchers to confidently utilize both the English and Chinese versions of the SDQII with Chinese and Australian samples separately and cross-culturally.
Publisher: arXiv
Date: 2022
Publisher: Springer Science and Business Media LLC
Date: 30-10-2023
Publisher: CRC Press
Date: 06-11-2013
DOI: 10.1201/B15618
Publisher: Institute of Electrical and Electronics Engineers (IEEE)
Date: 2023
Publisher: Springer Science and Business Media LLC
Date: 04-06-2021
DOI: 10.1186/S12859-021-04215-3
Abstract: Accurate prognosis and identification of cancer subtypes at molecular level are important steps towards effective and personalised treatments of breast cancer. To this end, many computational methods have been developed to use gene (mRNA) expression data for breast cancer subtyping and prognosis. Meanwhile, microRNAs (miRNAs) and long non-coding RNAs (lncRNAs) have been extensively studied in the last 2 decades and their associations with breast cancer subtypes and prognosis have been evidenced. However, it is not clear whether using miRNA and/or lncRNA expression data helps improve the performance of gene expression based subtyping and prognosis methods, and this raises challenges as to how and when to use these data and methods in practice. In this paper, we conduct a comparative study of 35 methods, including 12 breast cancer subtyping methods and 23 breast cancer prognosis methods, on a collection of 19 independent breast cancer datasets. We aim to uncover the roles of miRNAs and lncRNAs in breast cancer subtyping and prognosis from the systematic comparison. In addition, we created an R package, CancerSubtypesPrognosis, including all the 35 methods to facilitate the reproducibility of the methods and streamline the evaluation. The experimental results show that integrating miRNA expression data helps improve the performance of the mRNA-based cancer subtyping methods. However, miRNA signatures are not as good as mRNA signatures for breast cancer prognosis. In general, lncRNA expression data does not help improve the mRNA-based methods in either cancer subtyping or cancer prognosis. These results suggest that the prognostic roles of miRNA/lncRNA signatures in the improvement of breast cancer prognosis need to be further verified.
Publisher: IEEE
Date: 11-2011
Publisher: Elsevier BV
Date: 02-2012
Publisher: Oxford University Press (OUP)
Date: 07-12-2017
Publisher: Scientific Societies
Date: 12-2013
DOI: 10.1094/PHYTO-01-13-0023-R
Abstract: The online community resource Phytophthora database (PD) was developed to support accurate and rapid identification of Phytophthora and to help characterize and catalog the diversity and evolutionary relationships within the genus. Since its release in 2008, the sequence database has grown to cover 1 to 12 loci for ≈2,600 isolates (representing 138 described and provisional species). Sequences of multiple mitochondrial loci were added to complement nuclear loci-based phylogenetic analyses and diagnostic tool development. Key characteristics of most newly described and provisional species have been summarized. Other additions to improve the PD functionality include: (i) geographic information system tools that enable users to visualize the geographic origins of chosen isolates on a global-scale map, (ii) a tool for comparing genetic similarity between isolates via microsatellite markers to support population genetic studies, (iii) a comprehensive review of molecular diagnostics tools and relevant references, (iv) sequence alignments used to develop polymerase chain reaction-based diagnostics tools to support their utilization and new diagnostic tool development, and (v) an online community forum for sharing and preserving experience and knowledge accumulated in the global Phytophthora community. Here we present how these improvements can support users and discuss the PD's future direction.
Publisher: Elsevier BV
Date: 12-2018
DOI: 10.1016/J.IJMEDINF.2018.10.003
Abstract: Adverse drug events (ADEs) are among the top causes of hospitalization and death. Social media is a promising open data source for the timely detection of potential ADEs. In this paper, we study the problem of detecting signals of ADEs from social media. Detecting ADEs whose drug and AE may be reported in different posts of a user leads to major concerns regarding the content authenticity and user credibility, which have not been addressed in previous studies. Content authenticity concerns whether a post mentions drugs or adverse events that are actually consumed or experienced by the writer. User credibility indicates the degree to which chronological evidence from a user's sequence of posts should be trusted in the ADE detection. We propose AC-SPASM, a Bayesian model for the authenticity and credibility aware detection of ADEs from social media. The model exploits the interaction between content authenticity, user credibility and ADE signal quality. In particular, we argue that the credibility of a user correlates with the user's consistency in reporting authentic content. We conduct experiments on a real-world Twitter dataset containing 1.2 million posts from 13,178 users. Our benchmark set contains 22 drugs and 8089 AEs. AC-SPASM recognizes authentic posts with F Our study demonstrates that taking into account the content authenticity and user credibility improves the detection of ADEs from social media. Our work generates hypotheses to reduce experts' guesswork in identifying unknown potential ADEs.
Publisher: IEEE
Date: 04-2008
Publisher: Springer Science and Business Media LLC
Date: 15-03-2019
Publisher: Springer International Publishing
Date: 2015
Publisher: Institute of Electrical and Electronics Engineers (IEEE)
Date: 10-2023
Publisher: Public Library of Science (PLoS)
Date: 02-12-2019
Publisher: Oxford University Press (OUP)
Date: 05-06-2016
DOI: 10.1093/BIB/BBW042
Abstract: Recent findings show that coding genes are not the only targets that miRNAs interact with. In fact, there is a pool of different RNAs competing with each other to attract miRNAs for interactions, thus acting as competing endogenous RNAs (ceRNAs). The ceRNAs indirectly regulate each other via the titration mechanism, i.e. the increasing concentration of a ceRNA will decrease the number of miRNAs that are available for interacting with other targets. The cross-talks between ceRNAs, i.e. their interactions mediated by miRNAs, have been identified as the drivers in many disease conditions, including cancers. In recent years, some computational methods have emerged for identifying ceRNA-ceRNA interactions. However, there remain great challenges and opportunities for developing computational methods to provide new insights into ceRNA regulatory mechanisms. In this paper, we review the publicly available databases of ceRNA-ceRNA interactions and the computational methods for identifying ceRNA-ceRNA interactions (also known as miRNA sponge interactions). We also conduct a comparison study of the methods with a breast cancer dataset. Our aim is to provide a current snapshot of the advances of the computational methods in identifying miRNA sponge interactions and to discuss the remaining challenges.
Publisher: IEEE
Date: 2009
Publisher: Oxford University Press (OUP)
Date: 27-04-2021
DOI: 10.1093/BIOINFORMATICS/BTAB262
Abstract: Unravelling cancer driver genes is important in cancer research. Although computational methods have been developed to identify cancer drivers, most of them detect cancer drivers at population level. However, two patients who have the same cancer type and receive the same treatment may have different outcomes because each patient has a different genome and their disease might be driven by different driver genes. Therefore new methods are being developed for discovering cancer drivers at individual level, but existing personalized methods only focus on coding drivers while microRNAs (miRNAs) have been shown to drive cancer progression as well. Thus, novel methods are required to discover both coding and miRNA cancer drivers at individual level. We propose the novel method, pDriver, to discover personalized cancer drivers. pDriver includes two stages: (i) constructing gene networks for each cancer patient and (ii) discovering cancer drivers for each patient based on the constructed gene networks. To demonstrate the effectiveness of pDriver, we have applied it to five TCGA cancer datasets and compared it with the state-of-the-art methods. The result indicates that pDriver is more effective than other methods. Furthermore, pDriver can also detect miRNA cancer drivers and most of them have been confirmed to be associated with cancer by literature. We further analyze the predicted personalized drivers for breast cancer patients and the result shows that they are significantly enriched in many GO processes and KEGG pathways involved in breast cancer. pDriver is available at vvhoang Driver. Supplementary data are available at Bioinformatics online.
Publisher: Springer Netherlands
Date: 19-12-2012
DOI: 10.1007/978-94-007-5590-1_14
Abstract: microRNAs (miRNAs) are small non-coding RNAs that cause mRNA degradation and translation inhibition. They are pivotal regulators of development and cellular homeostasis through their control of diverse processes. Recently, great efforts have been made to elucidate many targets that are affected by miRNAs, but the functions of most miRNAs and their precise regulatory mechanisms remain elusive. With more and more matched expression profiles of miRNAs and mRNAs having been made available, it is of great interest to utilize both expression profiles and sequence information to discover the functional regulatory networks of miRNAs and their target mRNAs for potential biological processes that they may participate in. In this chapter, we first briefly review the computational methods for discovering miRNA targets and miRNA-mRNA regulatory modules, and then focus on a method of identifying functional miRNA-mRNA regulatory modules by integrating multiple data sets from different sources.
Publisher: Springer International Publishing
Date: 2018
Publisher: Springer Science and Business Media LLC
Date: 03-2021
Publisher: ACM
Date: 23-07-2002
Publisher: Institute of Electrical and Electronics Engineers (IEEE)
Date: 2021
Publisher: Oxford University Press (OUP)
Date: 24-06-2012
Publisher: Institute of Electrical and Electronics Engineers (IEEE)
Date: 08-2017
Publisher: Elsevier BV
Date: 10-2022
Publisher: Wiley
Date: 03-08-2021
DOI: 10.1002/WRNA.1686
Abstract: Inferring competing endogenous RNA (ceRNA) or microRNA (miRNA) sponge modules is a challenging and meaningful task for revealing ceRNA regulation mechanism at the module level. Modules in this context refer to groups of miRNA sponges which have mutual competitions and act as functional units for achieving biological processes. The recent development of computational methods based on heterogeneous data provides a novel way to discern the competitive effects of miRNA sponges on human complex diseases. This article aims to provide a comprehensive perspective of miRNA sponge module discovery methods. We first review the publicly available databases of cancer‐related miRNA sponges, as the miRNA sponges involved in human cancers contribute to the discovery of cancer‐associated modules. Then we review the existing computational methods for inferring miRNA sponge modules. Furthermore, we conduct an assessment on the performance of the module discovery methods with the pan‐cancer dataset, and the comparison study indicates that it is useful to infer biologically meaningful miRNA sponge modules by directly mapping heterogeneous data to the competitive modules. Finally, we discuss the future directions and associated challenges in developing in silico methods to infer miRNA sponge modules. This article is categorized under: RNA Interactions with Proteins and Other Molecules > Small Molecule‐RNA Interactions; Regulatory RNAs/RNAi/Riboswitches > Regulatory RNAs.
Publisher: Informa UK Limited
Date: 16-03-2018
Publisher: Springer Berlin Heidelberg
Date: 2013
Publisher: Springer Science and Business Media LLC
Date: 12-2018
Publisher: Institute of Electrical and Electronics Engineers (IEEE)
Date: 08-2023
Publisher: Institute of Electrical and Electronics Engineers (IEEE)
Date: 09-2008
DOI: 10.1109/TKDE.2008.52
Publisher: Springer International Publishing
Date: 2015
Publisher: Elsevier BV
Date: 2009
DOI: 10.1016/J.ARTMED.2008.07.008
Abstract: This paper studies a problem of efficiently discovering risk patterns in medical data. Risk patterns are defined by a statistical metric, relative risk, which has been widely used in epidemiological research. To avoid fruitless search in the complete exploration of risk patterns, we define optimal risk pattern set to exclude superfluous patterns, i.e. complicated patterns with lower relative risk than their corresponding simpler form patterns. We prove that mining optimal risk pattern sets conforms to an anti-monotone property that supports an efficient mining algorithm. We propose an efficient algorithm for mining optimal risk pattern sets based on this property. We also propose a hierarchical structure to present discovered patterns for easy perusal by domain experts. The proposed approach is compared with two well-known rule discovery methods, decision tree and association rule mining approaches, on benchmark data sets and applied to a real world application. The proposed method discovers more and better quality risk patterns than a decision tree approach. The decision tree method is not designed for such applications and is inadequate for pattern exploring. The proposed method does not discover a large number of uninteresting superfluous patterns as an association mining approach does. The proposed method is more efficient than an association rule mining method. A real world case study shows that the method reveals some interesting risk patterns to medical practitioners. The proposed method is an efficient approach to explore risk patterns. It quickly identifies cohorts of patients that are vulnerable to a risk outcome from a large data set. The proposed method is useful for exploratory study on large medical data to generate and refine hypotheses. The method is also useful for designing medical surveillance systems.
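The relative risk metric and the "superfluous pattern" idea mentioned in the abstract above can be illustrated with a minimal sketch. This is not the authors' mining algorithm; the function names `relative_risk` and `is_superfluous` and the cohort counts are hypothetical and chosen only for illustration.

```python
def relative_risk(pattern_cases, pattern_total, other_cases, other_total):
    """Relative risk: outcome rate among records matching a pattern divided by
    the outcome rate among records that do not match it (standard definition)."""
    if pattern_total <= 0 or other_total <= 0:
        raise ValueError("both cohorts must be non-empty")
    risk_pattern = pattern_cases / pattern_total
    risk_other = other_cases / other_total
    if risk_other == 0:
        # No outcomes in the comparison cohort: the ratio is unbounded
        # (or undefined when the pattern cohort has no outcomes either).
        return float("inf") if risk_pattern > 0 else float("nan")
    return risk_pattern / risk_other


def is_superfluous(rr_complex, rr_simpler):
    """Assumption taken from the abstract's wording: a more complicated pattern
    is superfluous when its relative risk is lower than that of its simpler form."""
    return rr_complex < rr_simpler


# Hypothetical counts: 20 of 100 matching records have the outcome,
# versus 10 of 100 non-matching records.
rr_simple = relative_risk(20, 100, 10, 100)
rr_complex = relative_risk(3, 20, 27, 180)
```

With these made-up counts the simpler pattern has a relative risk of 2.0, while the more specific pattern adds no risk beyond the baseline and would be treated as superfluous.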
Publisher: Springer Berlin Heidelberg
Date: 2010
Publisher: ACM
Date: 04-08-2023
Publisher: Public Library of Science (PLoS)
Date: 26-06-2015
Publisher: Oxford University Press (OUP)
Date: 2022
Abstract: MicroRNA (miRNA) sponges influence the capability of miRNA-mediated gene silencing by competing for shared miRNA response elements and play significant roles in many physiological and pathological processes. It has been proved that computational or dry-lab approaches are useful to guide wet-lab experiments for uncovering miRNA sponge regulation. However, all of the existing tools only allow the analysis of miRNA sponge regulation regarding a group of samples, rather than the miRNA sponge regulation unique to individual samples. Furthermore, most existing tools do not allow parallel computing for the fast identification of miRNA sponge regulation. Here, we present an enhanced version of our R/Bioconductor package, miRspongeR 2.0. Compared with the original version introduced in 2019, this package extends the resolution of miRNA sponge regulation from the multi-sample level to the single-sample level. Moreover, it supports the identification of miRNA sponge networks using parallel computing, and the construction of sample–sample correlation networks. It also provides more computational methods to infer miRNA sponge regulation and expands the ground truth for validation. With these new features, we anticipate that miRspongeR 2.0 will further accelerate the research on miRNA sponges with higher resolution and more utilities. ackages/miRspongeR/. Supplementary data are available at Bioinformatics Advances online.
Publisher: Oxford University Press (OUP)
Date: 02-11-2016
Publisher: Association for Computing Machinery (ACM)
Date: 24-11-2015
DOI: 10.1145/2746410
Abstract: Randomised controlled trials (RCTs) are the most effective approach to causal discovery, but in many circumstances it is impossible to conduct RCTs. Therefore, observational studies based on passively observed data are widely accepted as an alternative to RCTs. However, in observational studies, prior knowledge is required to generate the hypotheses about the cause-effect relationships to be tested, and hence they can only be applied to problems with available domain knowledge and a handful of variables. In practice, many datasets are of high dimensionality, which leaves observational studies out of the opportunities for causal discovery from such a wealth of data sources. In another direction, many efficient data mining methods have been developed to identify associations among variables in large datasets. The problem is that causal relationships imply associations, but the reverse is not always true. However, we can see the synergy between the two paradigms here. Specifically, association rule mining can be used to deal with the high-dimensionality problem, whereas observational studies can be utilised to eliminate noncausal associations. In this article, we propose the concept of causal rules (CRs) and develop an algorithm for mining CRs in large datasets. We use the idea of retrospective cohort studies to detect CRs based on the results of association rule mining. Experiments with both synthetic and real-world datasets have demonstrated the effectiveness and efficiency of CR mining. In comparison with the commonly used causal discovery methods, the proposed approach generally is faster and has better or competitive performance in finding correct or sensible causes. It is also capable of finding a cause consisting of multiple variables—a feature that other causal discovery methods do not possess.
Publisher: Springer Berlin Heidelberg
Date: 2010
Publisher: Elsevier BV
Date: 10-2006
Publisher: ACM
Date: 26-10-2021
Publisher: Springer International Publishing
Date: 2020
Publisher: Oxford University Press (OUP)
Date: 19-02-2016
DOI: 10.1093/MNRAS/STW395
Publisher: IEEE
Date: 17-12-2022
Publisher: Elsevier BV
Date: 07-2013
Publisher: Springer Science and Business Media LLC
Date: 12-01-2019
Publisher: Elsevier BV
Date: 12-2014
DOI: 10.1016/J.JBI.2014.08.005
Abstract: Discovering the regulatory relationships between microRNAs (miRNAs) and mRNAs is an important problem that interests many biologists and medical researchers. A number of computational methods have been proposed to infer miRNA-mRNA regulatory relationships, and are mostly based on the statistical associations between miRNAs and mRNAs discovered in observational data. The miRNA-mRNA regulatory relationships identified by these methods can be both direct and indirect regulations. However, differentiating direct regulatory relationships from indirect ones is important for biologists in experimental designs. In this paper, we present a causal discovery based framework (called DirectTarget) to infer direct miRNA-mRNA causal regulatory relationships in heterogeneous data, including expression profiles of miRNAs and mRNAs, and miRNA target information. DirectTarget is applied to the Epithelial to Mesenchymal Transition (EMT) datasets. The validation by experimentally confirmed target databases suggests that the proposed method can effectively identify direct miRNA-mRNA regulatory relationships. To explore the upstream regulators of miRNA regulation, we further identify the causal feedforward patterns (CFFPs) of TF-miRNA-mRNA to provide insights into the miRNA regulation in EMT. DirectTarget has the potential to be applied to other datasets to elucidate the direct miRNA-mRNA causal regulatory relationships and to explore the regulatory patterns.
Publisher: Springer Science and Business Media LLC
Date: 03-2004
Publisher: ACM
Date: 02-12-2013
Publisher: Springer Berlin Heidelberg
Date: 2011
Publisher: IEEE
Date: 17-12-2022
Publisher: Oxford University Press (OUP)
Date: 25-08-2021
DOI: 10.1093/BIB/BBAA181
Abstract: Predicting cell locations is important since with the understanding of cell locations, we may estimate the function of cells and their integration with the spatial environment. Thus, the DREAM challenge on single-cell transcriptomics required participants to predict the locations of single cells in the Drosophila embryo using single-cell transcriptomic data. We have developed over 50 pipelines by combining different ways of preprocessing the RNA-seq data, selecting the genes, predicting the cell locations and validating predicted cell locations, resulting in the winning methods which were ranked second in sub-challenge 1, first in sub-challenge 2 and third in sub-challenge 3. In this paper, we present an R package, SCTCwhatateam, which includes all the methods we developed and the Shiny web application to facilitate the research on single-cell spatial reconstruction. All the data and the example use cases are available in the Supplementary data.
Publisher: Elsevier BV
Date: 02-2016
DOI: 10.1016/J.GENE.2015.11.023
Abstract: Recent studies have shown that transcription factors (TFs) and microRNAs (miRNAs), while independently regulating their downstream targets, collaborate with each other to regulate gene expression. However, their synergistic roles in protein-protein interactions (PPIs) remain mostly unknown. In this paper, we present a novel framework (called CoRePPI) for inferring TF and miRNA co-regulation of PPIs. Particularly, CoRePPI is aimed at discovering the co-regulation specific to a condition of interest, by using heterogeneous data, including miRNA and messenger RNA (mRNA) expression profiles, putative miRNA targets, TF targets and PPIs. CoRePPI firstly finds the network motifs indicating the co-regulation of PPIs by TFs and miRNAs in tumor and normal conditions separately. Then by identifying the differential motifs found in one condition but not in the other, it builds the networks consisting of TFs, miRNAs and their co-regulated PPIs specific to different conditions respectively. To validate CoRePPI, we apply it to the Pan-Cancer dataset which includes the expression profiles of 12 cancer types from TCGA. Through network topology analysis, we found that the tumor and normal CoRePPI networks are scale-free. Furthermore, the results of differential and intersected network analysis between the tumor and normal CoRePPI networks suggest that only a small fraction of the regulatory relationships between TFs and miRNAs are conserved in both conditions, but they co-regulate different downstream PPIs in tumor and normal conditions; in different conditions, the majority of the regulatory relationships between TFs and miRNAs are different, although they may regulate the same PPIs in their respective conditions. The CoRePPI sub-networks constructed for the three types of cancers (breast cancer, lung cancer and ovarian cancer) are all scale-free, and the intersection of these CoRePPI sub-networks can be utilized as the biomarker CoRePPI sub-network of the three types of cancers. The PPI enrichment analyses of the tumor and normal CoRePPI networks suggest that the co-regulating TFs and miRNAs are significantly associated with the specific biological processes, diseases and pathways. In addition, compared with the two non-condition-specific approaches, the tumor CoRePPI network is found to have the most enriched cancer-related PPIs. Altogether, the results uncover the combined regulatory patterns of TFs and miRNAs on the PPIs, and may provide new insights for research in cancer-associated TFs and miRNAs.
Publisher: Association for Computing Machinery (ACM)
Date: 20-02-2023
DOI: 10.1145/3532190
Abstract: Multi-label learning deals with the problem where an instance is associated with multiple labels simultaneously. Multi-label data is often of high dimensionality and has many noisy, irrelevant, and redundant features. As an important machine learning task, multi-label feature selection has received considerable attention in recent years due to its promising performance in dealing with high-dimensional multi-label data. Existing multi-label feature selection methods typically select the global features which are shared by all instances in a dataset. However, these multi-label feature selection methods may be suboptimal since they do not consider the specific characteristics of instances. In this paper, we propose a novel algorithm that integrates Global and Local Feature Selection (GLFS) to exploit both the global features and a subset of discriminative features shared only locally by a subgroup of instances in a multi-label dataset. Specifically, GLFS employs linear regression and ℓ2,1-norm on the regression parameters to achieve simultaneous global and local feature selection. Moreover, the proposed algorithm has an effective mechanism for utilizing label correlations to improve the feature selection. Experiments on real-world multi-label datasets show the superiority of GLFS over the state-of-the-art multi-label feature selection methods.
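For readers unfamiliar with the ℓ2,1-norm mentioned in the abstract above, the following minimal sketch shows the standard definition (the sum of the Euclidean norms of a matrix's rows) and why it is commonly used for feature selection. It is not an implementation of GLFS; the weight matrix `W`, with one row per feature and one column per label, and the helper names are hypothetical.

```python
import math


def row_norms(W):
    """Euclidean (l2) norm of each row of a matrix given as a list of lists."""
    return [math.sqrt(sum(w * w for w in row)) for row in W]


def l21_norm(W):
    """l2,1-norm: the sum of the l2 norms of the rows. Penalising it tends to
    drive whole rows (features) to zero at once."""
    return sum(row_norms(W))


def rank_features(W):
    """Rank feature indices by row norm, largest first; a common way to read a
    feature ranking off regression weights regularised with the l2,1-norm."""
    norms = row_norms(W)
    return sorted(range(len(W)), key=lambda i: norms[i], reverse=True)


# Hypothetical regression weights: 3 features (rows) x 2 labels (columns).
W = [[3.0, 4.0],
     [0.0, 0.0],
     [1.0, 0.0]]
norm_value = l21_norm(W)
ranking = rank_features(W)
```

In this toy matrix the second feature has an all-zero row, so it contributes nothing to any label and ranks last.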
Publisher: Springer International Publishing
Date: 2015
Publisher: ACM
Date: 02-11-2009
Publisher: Oxford University Press (OUP)
Date: 18-10-2021
DOI: 10.1093/BIOINFORMATICS/BTAA899
Abstract: microRNAs (miRNAs) are important gene regulators and they are involved in many biological processes, including cancer progression. Therefore, correctly identifying miRNA–mRNA interactions is a crucial task. To this end, a huge number of computational methods have been developed, but they mainly use the data at one snapshot and ignore the dynamics of a biological process. The recent development of single cell data and the booming of the exploration of cell trajectories using the ‘pseudotime’ concept have inspired us to develop a pseudotime-based method to infer the miRNA–mRNA relationships characterizing a biological process by taking into account the temporal aspect of the process. We have developed a novel approach, called pseudotime causality, to find the causal relationships between miRNAs and mRNAs during a biological process. We have applied the proposed method to both single cell and bulk sequencing datasets for Epithelial to Mesenchymal Transition, a key process in cancer metastasis. The evaluation results show that our method significantly outperforms existing methods in finding miRNA–mRNA interactions in both single cell and bulk data. The results suggest that utilizing the pseudotemporal information from the data helps reveal the gene regulation in a biological process much better than using the static information. R scripts and datasets can be found at github.com/AndresMCB/PTC. Supplementary data are available at Bioinformatics online.
Publisher: Elsevier BV
Date: 08-2009
DOI: 10.1016/J.JBI.2009.01.005
Abstract: The identification of miRNAs and their target mRNAs and the construction of their regulatory networks may give new insights into biological procedures. This study proposes a computational method to discover the functional miRNA-mRNA regulatory modules (FMRMs), that is, groups of miRNAs and their target mRNAs that are believed to participate cooperatively in post-transcriptional gene regulation under specific conditions. The proposed method identifies negatively regulated patterns of miRNAs and mRNAs which associate with cancer and normal conditions, respectively, in a prostate cancer data set. GO and the literature also suggest that they may be related to prostate cancer. It can potentially identify the biologically relevant chains of 'miRNA → target gene → condition'.
Publisher: IEEE
Date: 12-2018
Publisher: ACM
Date: 14-08-2021
Publisher: Public Library of Science (PLoS)
Date: 23-04-2020
Publisher: IEEE
Date: 04-2008
Publisher: Springer International Publishing
Date: 2017
Publisher: Oxford University Press (OUP)
Date: 09-01-2018
Publisher: Oxford University Press (OUP)
Date: 11-04-2016
Publisher: Institute of Electrical and Electronics Engineers (IEEE)
Date: 09-2023
Publisher: Springer Science and Business Media LLC
Date: 17-06-2020
DOI: 10.1038/S41467-020-16829-X
Abstract: Polygenic risk scores are emerging as a potentially powerful tool to predict future phenotypes of target individuals, typically using unrelated individuals, thereby devaluing information from relatives. Here, for 50 traits from the UK Biobank data, we show that a design of 5,000 individuals with first-degree relatives of target individuals can achieve a prediction accuracy similar to that of around 220,000 unrelated individuals (mean prediction accuracy = 0.26 vs. 0.24, mean fold-change = 1.06 (95% CI: 0.99-1.13), P-value = 0.08), despite a 44-fold difference in sample size. For lifestyle traits, the prediction accuracy with 5,000 individuals including first-degree relatives of target individuals is significantly higher than that with 220,000 unrelated individuals (mean prediction accuracy = 0.22 vs. 0.16, mean fold-change = 1.40 (1.17-1.62), P-value = 0.025). Our findings suggest that polygenic prediction integrating family information may help to accelerate precision health and clinical intervention.
Publisher: Oxford University Press (OUP)
Date: 08-2018
Publisher: EDP Sciences
Date: 12-2017
Publisher: Oxford University Press (OUP)
Date: 11-02-2019
DOI: 10.1093/MNRAS/STZ401
Publisher: Informa UK Limited
Date: 06-04-2021
Publisher: Springer Science and Business Media LLC
Date: 20-04-2022
DOI: 10.1007/S10618-022-00832-5
Abstract: A large number of covariates can have a negative impact on the quality of causal effect estimation since confounding adjustment becomes unreliable when the number of covariates is large relative to the number of samples. Propensity score is a common way to deal with a large covariate set, but the accuracy of propensity score estimation (normally done by logistic regression) is also challenged by the large number of covariates. In this paper, we prove that a large covariate set can be reduced to a lower dimensional representation which captures the complete information for adjustment in causal effect estimation. The theoretical result enables effective data-driven algorithms for causal effect estimation. Supported by the result, we develop an algorithm that employs a supervised kernel dimension reduction method to learn a lower dimensional representation from the original covariate space, and then utilises nearest neighbour matching in the reduced covariate space to impute the counterfactual outcomes to avoid the large sized covariate set problem. The proposed algorithm is evaluated on two semisynthetic and three real-world datasets and the results show the effectiveness of the proposed algorithm.
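The matching step described in the abstract above can be illustrated with a generic one-nearest-neighbour estimator: each unit's missing counterfactual outcome is borrowed from the closest unit in the opposite treatment group, with distance measured in the reduced covariate space. This is a minimal sketch under stated assumptions, not the authors' algorithm: the dimension reduction step is omitted (the reduced representation `z` is taken as given), and the function name `nn_matching_ate` and the toy data are hypothetical.

```python
import math


def nn_matching_ate(z, t, y):
    """Average treatment effect via 1-nearest-neighbour matching.

    z: reduced covariates, one coordinate list per unit (assumed given)
    t: treatment indicators (1 = treated, 0 = control)
    y: observed outcomes
    """
    effects = []
    for i in range(len(y)):
        # Candidate matches are units that received the other treatment.
        candidates = [j for j in range(len(y)) if t[j] != t[i]]
        # The nearest candidate supplies the imputed counterfactual outcome.
        j = min(candidates, key=lambda k: math.dist(z[k], z[i]))
        y1, y0 = (y[i], y[j]) if t[i] == 1 else (y[j], y[i])
        effects.append(y1 - y0)
    return sum(effects) / len(effects)


# Toy data: two tight clusters, each holding one treated and one control unit.
z = [[0.0], [0.1], [1.0], [1.1]]
t = [1, 0, 1, 0]
y = [3.0, 1.0, 5.0, 3.0]
ate = nn_matching_ate(z, t, y)
```

Matching in a low-dimensional space keeps nearest neighbours meaningful; with many raw covariates, distances become less informative, which is the problem the abstract sets out to avoid.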
Publisher: Institute of Electrical and Electronics Engineers (IEEE)
Date: 02-2012
Publisher: Wiley
Date: 19-09-2018
DOI: 10.1002/CPE.4923
Publisher: Elsevier BV
Date: 07-2018
DOI: 10.1016/J.CMPB.2018.03.021
Abstract: Adverse drug reactions (ADRs) are one of the leading causes of morbidity and mortality and thus should be detected early to reduce consequences on health outcomes. Medication dispensing data are comprehensive sources of information about medicine uses that can be utilized for the signal detection of ADRs. Sequence symmetry analysis (SSA) has been employed in previous studies to detect signals of ADRs from medication dispensing data, but it has a moderate sensitivity and tends to miss some ADR signals. With successful applications in various areas, supervised machine learning (SML) methods are promising in detecting ADR signals. Gold standards of known ADRs and non-ADRs from previous studies create opportunities to take into account additional domain knowledge to improve ADR signal detection with SML. We assess the utility of SML as a signal detection tool for ADRs in medication dispensing data with the consideration of domain knowledge from DrugBank and MedDRA. We compare the best performing SML method with SSA. We model the ADR signal detection problem as a supervised machine learning problem by linking medication dispensing data with domain knowledge bases. Suspected ADR signals are extracted from the Australian Pharmaceutical Benefit Scheme (PBS) medication dispensing data from 2013 to 2016. We construct predictive features for each signal candidate based on its occurrences in medication dispensing data as well as its pharmacological properties. Pharmaceutical knowledge bases including DrugBank and MedDRA are employed to provide pharmacological features for a signal candidate. Given a gold standard of known ADRs and non-ADRs, SML learns to differentiate between known ADRs and non-ADRs based on their combined predictive features from linked sources, and then predicts whether a new case is a potential ADR signal. We evaluate the performance of six widely used SML methods with two gold standards of known ADRs and non-ADRs from previous studies.
On average, gradient boosting classifier achieves the sensitivity of 77%, specificity of 81%, positive predictive value of 76%, negative predictive value of 82%, area under precision-recall curve of 81%, and area under receiver operating characteristic curve of 82%, most of which are higher than in other SML methods. In particular, gradient boosting classifier has 21% higher sensitivity than and comparable specificity with SSA. Furthermore, gradient boosting classifier detects 10% more unknown potential ADR signals than SSA. Our study demonstrates that gradient boosting classifier is a promising supervised signal detection tool for ADRs in medication dispensing data to complement SSA.
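The four threshold-based measures reported in the abstract above (sensitivity, specificity, positive and negative predictive value) all follow from a two-by-two confusion matrix of predicted versus gold-standard labels. The sketch below shows the standard definitions only; the function name `signal_detection_metrics` and the counts are hypothetical and are not taken from the study.

```python
def signal_detection_metrics(tp, fp, tn, fn):
    """Standard confusion-matrix measures for a binary signal detector.

    tp: known ADRs flagged as signals      fn: known ADRs missed
    tn: known non-ADRs correctly rejected  fp: known non-ADRs flagged
    """
    return {
        "sensitivity": tp / (tp + fn),  # share of true ADRs detected
        "specificity": tn / (tn + fp),  # share of non-ADRs rejected
        "ppv": tp / (tp + fp),          # share of flagged signals that are ADRs
        "npv": tn / (tn + fn),          # share of rejected cases that are non-ADRs
    }


# Hypothetical counts for illustration only.
metrics = signal_detection_metrics(tp=8, fp=1, tn=9, fn=2)
```

Sensitivity and specificity describe the detector itself, whereas the two predictive values also depend on how common true ADRs are in the gold standard.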
Publisher: Public Library of Science (PLoS)
Date: 24-08-2020
Publisher: Elsevier BV
Date: 04-2012
DOI: 10.1016/J.COMPBIOMED.2011.12.011
Abstract: MicroRNAs (miRNAs) play important roles in gene regulatory networks. In this paper, we propose a probabilistic topic model to infer regulatory networks of miRNAs and their target mRNAs for specific biological conditions at the post-transcriptional level, so-called functional miRNA-mRNA regulatory modules (FMRMs). The probabilistic model used in this paper can effectively capture the relationship between miRNAs and mRNAs in specific cellular conditions. Furthermore, the proposed method identifies negatively and positively correlated miRNA-mRNA pairs which are associated with epithelial, mesenchymal, and other conditions in an EMT (epithelial-mesenchymal transition) data set, respectively. Results on EMT data sets show that the inferred FMRMs can potentially construct the biological chain of 'miRNA→mRNA→condition' at the post-transcriptional level.
Publisher: Springer International Publishing
Date: 2020
Publisher: MDPI AG
Date: 25-06-2022
DOI: 10.3390/RS14133047
Abstract: Smoke plumes are the first things seen from space when wildfires occur. Thus, fire smoke detection is important for early fire detection. Deep Learning (DL) models have been used to detect fire smoke in satellite imagery for fire detection. However, previous DL-based research only considered lower spatial resolution sensors (e.g., Moderate-Resolution Imaging Spectroradiometer (MODIS)) and only used the visible (i.e., red, green, blue (RGB)) bands. To contribute towards solutions for early fire smoke detection, we constructed a six-band imagery dataset from Landsat 5 Thematic Mapper (TM) and Landsat 8 Operational Land Imager (OLI) with a 30-metre spatial resolution. The dataset consists of 1836 images in three classes, namely “Smoke”, “Clear”, and “Other_aerosol”. To prepare for potential on-board-of-small-satellite detection, we designed a lightweight Convolutional Neural Network (CNN) model named “Variant Input Bands for Smoke Detection (VIB_SD)”, which achieved competitive accuracy with the state-of-the-art model SAFA, with less than 2% of its number of parameters. We further investigated the impact of using additional Infra-Red (IR) bands on the accuracy of fire smoke detection with VIB_SD by training it with five different band combinations. The results demonstrated that adding the Near-Infra-Red (NIR) band improved prediction accuracy compared with only using the visible bands. Adding both Short-Wave Infra-Red (SWIR) bands can further improve the model performance compared with adding only one SWIR band. The case study showed that the model trained with multispectral bands could effectively detect fire smoke mixed with cloud over small geographic extents.
Publisher: Elsevier BV
Date: 05-2018
Publisher: IEEE
Date: 08-2017
Publisher: Springer Science and Business Media LLC
Date: 08-05-2017
Publisher: Public Library of Science (PLoS)
Date: 11-04-2016
Publisher: Oxford University Press (OUP)
Date: 17-08-2017
Publisher: Springer Science and Business Media LLC
Date: 02-08-2023
DOI: 10.1007/S10489-022-03860-2
Abstract: In personalised decision making, evidence is required to determine whether an action (treatment) is suitable for an individual. Such evidence can be obtained by modelling treatment effect heterogeneity in subgroups. The existing interpretable modelling methods take a top-down approach to search for subgroups with heterogeneous treatment effects and they may miss the most specific and relevant context for an individual. In this paper, we design a Treatment effect pattern (TEP) to represent treatment effect heterogeneity in data. To achieve an interpretable presentation of TEPs, we use a local causal structure around the outcome to explicitly show how those important variables are used in modelling. We also derive a formula for unbiasedly estimating the Conditional Average Causal Effect (CATE) using the local structure in our problem setting. In the discovery process, we aim at minimising heterogeneity within each subgroup represented by a pattern. We propose a bottom-up search algorithm to discover the most specific patterns fitting individual circumstances the best for personalised decision making. Experiments show that the proposed method models treatment effect heterogeneity better than three other existing tree based methods in synthetic and real world data sets.
Publisher: IEEE
Date: 12-2021
Publisher: Elsevier BV
Date: 12-2011
Publisher: Informa UK Limited
Date: 02-10-2021
Publisher: Springer Science and Business Media LLC
Date: 11-03-2013
Publisher: Elsevier BV
Date: 04-2020
Publisher: Scientific Societies
Date: 07-2021
DOI: 10.1094/PHYTO-08-20-0330-LE
Abstract: Scientific communication is facilitated by a data-driven, scientifically sound taxonomy that considers the end-user’s needs and established successful practice. In 2013, the Fusarium community voiced near unanimous support for a concept of Fusarium that represented a clade comprising all agriculturally and clinically important Fusarium species, including the F. solani species complex (FSSC). Subsequently, this concept was challenged in 2015 by one research group who proposed dividing the genus Fusarium into seven genera, including the FSSC described as members of the genus Neocosmospora, with subsequent justification in 2018 based on claims that the 2013 concept of Fusarium is polyphyletic. Here, we test this claim and provide a phylogeny based on exonic nucleotide sequences of 19 orthologous protein-coding genes that strongly support the monophyly of Fusarium including the FSSC. We reassert the practical and scientific argument in support of a genus Fusarium that includes the FSSC and several other basal lineages, consistent with the longstanding use of this name among plant pathologists, medical mycologists, quarantine officials, regulatory agencies, students, and researchers with a stake in its taxonomy. In recognition of this monophyly, 40 species described as genus Neocosmospora were recombined in genus Fusarium, and nine others were renamed Fusarium. Here the global Fusarium community voices strong support for the inclusion of the FSSC in Fusarium, as it remains the best scientific, nomenclatural, and practical taxonomic option available.
Publisher: Oxford University Press (OUP)
Date: 02-2018
DOI: 10.1093/BIB/BBY008
Abstract: It is known that noncoding RNAs (ncRNAs) cover ∼98% of the transcriptome, but do not encode proteins. Among ncRNAs, long noncoding RNAs (lncRNAs) are a large and diverse class of RNA molecules, and are thought to be a gold mine of potential oncogenes, anti-oncogenes and new biomarkers. Although only a minority of lncRNAs is functionally characterized, it is clear that they are important regulators that modulate gene expression and are involved in many biological functions. To reveal the functions and regulatory mechanisms of lncRNAs, it is vital to understand how lncRNAs regulate their target genes for implementing specific biological functions. In this article, we review the computational methods for inferring lncRNA–mRNA interactions and the third-party databases storing lncRNA–mRNA regulatory relationships. We have found that the existing methods are based on statistical correlations between the gene expression levels of lncRNAs and mRNAs, and may not reveal gene regulatory relationships which are causal relationships. Moreover, these methods do not consider the modularity of lncRNA–mRNA regulatory networks, and thus, the networks identified are not module-specific. To address the above two issues, we propose a novel method, MSLCRN, to infer and analyze module-specific lncRNA–mRNA causal regulatory networks. We have applied it to glioblastoma multiforme, lung squamous cell carcinoma, ovarian cancer and prostate cancer, respectively. The experimental results show that MSLCRN, as an expression-based method, could be a useful complementary method to study lncRNA regulations.
Publisher: IEEE
Date: 09-2010
Publisher: Cold Spring Harbor Laboratory
Date: 28-12-2018
DOI: 10.1101/507749
Abstract: A microRNA (miRNA) sponge is an RNA molecule with multiple tandem miRNA response elements that can sequester miRNAs from their target mRNAs. Despite growing appreciation of the importance of miRNA sponges, our knowledge of their complex functions remains limited. Moreover, there is still a lack of miRNA sponge research tools that help researchers to quickly compare their proposed methods with other methods, apply existing methods to new datasets, or select appropriate methods for assisting in subsequent experimental design. To fill the gap, we present an R/Bioconductor package, miRsponge, for simplifying the procedure of identifying and analyzing miRNA sponge interaction networks and modules. It provides seven popular methods and an integrative method to identify miRNA sponge interactions. Moreover, it supports the validation of miRNA sponge interactions and the identification of miRNA sponge modules, as well as functional enrichment and survival analysis of miRNA sponge modules. This package enables researchers to quickly evaluate their new methods, apply existing methods to new datasets, and consequently speed up miRNA sponge research.
Publisher: Springer Science and Business Media LLC
Date: 12-2018
Publisher: Springer Science and Business Media LLC
Date: 03-2017
Publisher: IEEE
Date: 12-2012
DOI: 10.1109/ICDM.2012.36
Publisher: Oxford University Press (OUP)
Date: 28-06-2018
DOI: 10.1093/BIOINFORMATICS/BTY525
Abstract: MicroRNAs (miRNAs) are small non-coding RNAs with the length of ∼22 nucleotides. miRNAs are involved in many biological processes including cancers. Recent studies show that long non-coding RNAs (lncRNAs) are emerging as miRNA sponges, playing important roles in cancer physiology and development. Despite accumulating appreciation of the importance of lncRNAs, the study of their complex functions is still in its preliminary stage. Based on the hypothesis of competing endogenous RNAs (ceRNAs), several computational methods have been proposed for investigating the competitive relationships between lncRNAs and miRNA target messenger RNAs (mRNAs). However, when the mRNAs are released from the control of miRNAs, it remains largely unknown as to how the sponge lncRNAs influence the expression levels of the endogenous miRNA targets. We propose a novel method to construct lncRNA related miRNA sponge regulatory networks (LncmiRSRNs) by integrating matched lncRNA and mRNA expression profiles with clinical information and putative miRNA-target interactions. Using the method, we have constructed the LncmiRSRNs for four human cancers (glioblastoma multiforme, lung cancer, ovarian cancer and prostate cancer). Based on the networks, we discover that after being released from miRNA control, the target mRNAs are normally up-regulated by the sponge lncRNAs, and only a fraction of sponge lncRNA-mRNA regulatory relationships and hub lncRNAs are shared by the four cancers. Moreover, most sponge lncRNA-mRNA regulatory relationships show a rewired mode between different cancers, and a minority of sponge lncRNA-mRNA regulatory relationships conserved (appearing) in different cancers may act as a common pivot across cancers. Besides, differential and conserved hub lncRNAs may act as potential cancer drivers to influence the cancerous state in cancers. 
Functional enrichment and survival analysis indicate that the identified differential and conserved LncmiRSRN network modules work as functional units in biological processes, and can distinguish metastasis risks of cancers. Our analysis demonstrates the potential of integrating expression profiles, clinical information and miRNA-target interactions for investigating lncRNA regulatory mechanism. LncmiRSRN is freely available (hangjunpeng411/LncmiRSRN). Supplementary data are available at Bioinformatics online.
Publisher: Springer International Publishing
Date: 2015
Publisher: Elsevier BV
Date: 05-2018
Publisher: Springer Nature Singapore
Date: 2023
Publisher: Elsevier BV
Date: 12-2020
Publisher: ACM
Date: 24-10-2011
Publisher: Elsevier BV
Date: 12-2013
Publisher: IOP Publishing
Date: 02-07-2019
Abstract: This paper presents a survey of microwave front-end receivers installed at radio telescopes throughout the world. This unprecedented analysis was conducted as part of a review of front-end developments for Italian radio telescopes, initiated by the Italian National Institute for Astrophysics in 2016. Fifteen international radio telescopes have been selected to be representative of the instrumentation used for radio astronomical observations in the frequency domain from 300 MHz to 116 GHz. A comprehensive description of the existing receivers is presented and their characteristics are compared and discussed. The observing performances of the complete receiving chains are also presented. An overview of ongoing developments illustrates and anticipates future trends in front-end projects to meet the most ambitious scientific research goals.
Publisher: Springer Science and Business Media LLC
Date: 19-01-2014
Publisher: Springer Nature Switzerland
Date: 2023
Publisher: Frontiers Media SA
Date: 08-06-2018
Publisher: ACM
Date: 18-02-2013
Publisher: Institute of Electrical and Electronics Engineers (IEEE)
Date: 06-2020
Publisher: Springer Science and Business Media LLC
Date: 08-01-2009
Publisher: Springer Berlin Heidelberg
Date: 2009
Publisher: Oxford University Press (OUP)
Date: 22-11-2012
DOI: 10.1093/BIB/BBS075
Publisher: Springer Berlin Heidelberg
Date: 2010
Publisher: Institute of Electrical and Electronics Engineers (IEEE)
Date: 06-2023
Publisher: Elsevier BV
Date: 09-2014
Publisher: Association for Computing Machinery (ACM)
Date: 19-09-2023
DOI: 10.1145/3624479
Publisher: Institute of Electrical and Electronics Engineers (IEEE)
Date: 06-2010
Publisher: Springer Science and Business Media LLC
Date: 12-2009
Publisher: Springer Berlin Heidelberg
Date: 2012
Publisher: Springer Berlin Heidelberg
Date: 2008
Publisher: International Academy Publishing (IAP)
Date: 03-06-2011
Publisher: Institute of Electrical and Electronics Engineers (IEEE)
Date: 09-2023
Publisher: Institute of Electrical and Electronics Engineers (IEEE)
Date: 2023
Publisher: Elsevier BV
Date: 06-2018
Publisher: ACM
Date: 07-09-2022
Publisher: IEEE
Date: 07-2008
Publisher: Cold Spring Harbor Laboratory
Date: 28-05-2019
DOI: 10.1101/652180
Abstract: Studying multiple microRNAs (miRNAs) synergism in gene regulation could help to understand the regulatory mechanisms of complicated human diseases caused by miRNAs. Several existing methods have been presented to infer miRNA synergism. Most of the current methods assume that miRNAs with shared targets at the sequence level are working synergistically. However, it is unclear if miRNAs with shared targets are working in concert to regulate the targets or they individually regulate the targets at different time points or different biological processes. A standard method to test the synergistic activities is to knock-down multiple miRNAs at the same time and measure the changes in the target genes. However, this approach may not be practical as we would have too many sets of miRNAs to test. In this paper, we present a novel framework called miRsyn for inferring miRNA synergism by using a causal inference method that mimics the multiple-intervention experiments, e.g. knocking-down multiple miRNAs, with observational data. Our results show that several miRNA-miRNA pairs that have shared targets at the sequence level are not working synergistically at the expression level. Moreover, the identified miRNA synergistic network is small-world and biologically meaningful, and a number of miRNA synergistic modules are significantly enriched in breast cancer. Our further analyses also reveal that most of synergistic miRNA-miRNA pairs show the same expression patterns. The comparison results indicate that the proposed multiple-intervention causal inference method performs better than the single-intervention causal inference method in identifying miRNA synergistic network. Taken together, the results imply that miRsyn is a promising framework for identifying miRNA synergism, and it could enhance the understanding of miRNA synergism in breast cancer.
Publisher: Elsevier BV
Date: 2016
Publisher: Elsevier BV
Date: 09-2018
DOI: 10.1016/J.JBI.2018.07.013
Abstract: Drug safety issues such as Adverse Drug Events (ADEs) can cause serious consequences for the public. The clinical trials that are undertaken to assess medicine efficacy and safety prior to marketing, generally, may provide sufficient samples for discovering common ADEs. However, more samples are needed to detect infrequent and rare events. Additionally, clinical trials may not include all subgroups of patients. For these reasons, post-marketing surveillance of medicines is necessary for identifying drug safety issues. Most regulatory agencies use the Spontaneous Reporting Systems to identify associations between medicines and suspected ADEs. Data mining with effective analytical frameworks and large-scale medical data is potentially an alternative method to discover and monitor ADEs. In the present paper, we aim to detect potential ADEs from prescription data by discovering ADE associated prescription sequences. In an ADE associated prescription sequence 〈D
Publisher: Springer Science and Business Media LLC
Date: 17-01-2020
Publisher: Oxford University Press (OUP)
Date: 12-07-2014
DOI: 10.1093/BIB/BBU023
Abstract: microRNAs (miRNAs) are important gene regulators. They control a wide range of biological processes and are involved in several types of cancers. Thus, exploring miRNA functions is important for diagnostics and therapeutics. To date, there are few feasible experimental techniques for discovering miRNA regulatory mechanisms. Alternatively, predictions of miRNA-mRNA regulatory relationships by computational methods have increasingly achieved promising results. Computational approaches are proving their ability as effective tools in reducing the number of biological experiments that must be conducted and to assist with the design of the experiments. In this review, we categorize and review different computational approaches to identify miRNA activities and functions, including the co-regulation of miRNAs and transcription factors. Our main focuses are on the recent approaches that use multiple data types for exploring miRNA functions. We discuss the remaining challenges in the evaluation and selection of models based on the results from a case study. Finally, we analyse the remaining challenges of each computational approach and suggest some future research directions.
Publisher: Springer Nature Switzerland
Date: 2023
Publisher: IEEE
Date: 12-2008
Publisher: Springer Science and Business Media LLC
Date: 02-12-2021
DOI: 10.1186/S12859-021-04498-6
Abstract: Existing computational methods for studying miRNA regulation are mostly based on bulk miRNA and mRNA expression data. However, bulk data only allows the analysis of miRNA regulation regarding a group of cells, rather than the miRNA regulation unique to individual cells. Recent advance in single-cell miRNA-mRNA co-sequencing technology has opened a way for investigating miRNA regulation at single-cell level. However, as currently single-cell miRNA-mRNA co-sequencing data is just emerging and only available at small-scale, there is a strong need of novel methods to exploit existing single-cell data for the study of cell-specific miRNA regulation. In this work, we propose a new method, CSmiR (Cell-Specific miRNA regulation) to combine single-cell miRNA-mRNA co-sequencing data and putative miRNA-mRNA binding information to identify miRNA regulatory networks at the resolution of individual cells. We apply CSmiR to the miRNA-mRNA co-sequencing data in 19 K562 single-cells to identify cell-specific miRNA-mRNA regulatory networks for understanding miRNA regulation in each K562 single-cell. By analyzing the obtained cell-specific miRNA-mRNA regulatory networks, we observe that the miRNA regulation in each K562 single-cell is unique. Moreover, we conduct detailed analysis on the cell-specific miRNA regulation associated with the miR-17/92 family as a case study. The comparison results indicate that CSmiR is effective in predicting cell-specific miRNA targets. Finally, through exploring cell–cell similarity matrix characterized by cell-specific miRNA regulation, CSmiR provides a novel strategy for clustering single-cells and helps to understand cell–cell crosstalk. To the best of our knowledge, CSmiR is the first method to explore miRNA regulation at a single-cell resolution level, and we believe that it can be a useful method to enhance the understanding of cell-specific miRNA regulation.
Publisher: Oxford University Press (OUP)
Date: 19-09-2022
DOI: 10.1093/BFGP/ELAC030
Abstract: The traditional way for discovering genes which drive cancer (namely cancer drivers) neglects the dynamic information of cancer development, even though it is well known that cancer progresses dynamically. To enhance cancer driver discovery, we expand cancer driver concept to dynamic cancer driver as a gene driving one or more bio-pathological transitions during cancer progression. Our method refers to the fact that cancer should not be considered as a single process but a compendium of altered biological processes causing the disease to develop over time. Reciprocally, different drivers of cancer can potentially be discovered by analysing different bio-pathological pathways. We propose a novel approach for causal inference of genes driving one or more core processes during cancer development (i.e. dynamic cancer driver). We use the concept of pseudotime for inferring the latent progression of samples along a biological transition during cancer and identifying a critical event when such a process is significantly deviated from normal to carcinogenic. We infer driver genes by assessing the causal effect they have on the process after such a critical event. We have applied our method to single-cell and bulk sequencing datasets of breast cancer. The evaluation results show that our method outperforms well-recognized cancer driver inference methods. These results suggest that including information of the underlying dynamics of cancer improves the inference process (in comparison with using static data), and allows us to discover different sets of driver genes from different processes in cancer. R scripts and datasets can be found at github.com/AndresMCB/DynamicCancerDriver
Publisher: Institute of Electrical and Electronics Engineers (IEEE)
Date: 2023
Publisher: Elsevier BV
Date: 11-2022
Publisher: Springer Berlin Heidelberg
Date: 2013
Publisher: Institute of Electrical and Electronics Engineers (IEEE)
Date: 09-2020
Publisher: Springer Science and Business Media LLC
Date: 30-08-2019
Publisher: Elsevier BV
Date: 08-2014
Publisher: Oxford University Press (OUP)
Date: 17-10-2010
DOI: 10.1093/BIOINFORMATICS/BTQ576
Abstract: Motivation: MicroRNAs (miRNAs) are small non-coding RNAs that cause mRNA degradation and translational inhibition. They are important regulators of development and cellular homeostasis through their control of diverse processes. Recently, great efforts have been made to elucidate their regulatory mechanism, but the functions of most miRNAs and their precise regulatory mechanisms remain elusive. With more and more matched expression profiles of miRNAs and mRNAs having been made available, it is of great interest to utilize both expression profiles to discover the functional regulatory networks of miRNAs and their target mRNAs for potential biological processes that they may participate in. Results: We present a probabilistic graphical model to discover functional miRNA regulatory modules at potential biological levels by integrating heterogeneous datasets, including expression profiles of miRNAs and mRNAs, with or without the prior target binding information. We applied this model to a mouse mammary dataset. It effectively captured several biological process specific modules involving miRNAs and their target mRNAs. Furthermore, without using prior target binding information, the identified miRNAs and mRNAs in each module show a large proportion of overlap with predicted miRNA target relationships, suggesting that expression profiles are crucial for both target identification and discovery of regulatory modules. Contact: bing.liu@unisa.edu.au jiuyong.li@unisa.edu.au Supplementary information: Supplementary data are available at Bioinformatics online.
Publisher: Springer Berlin Heidelberg
Date: 2013
Publisher: Institute of Electrical and Electronics Engineers (IEEE)
Date: 2010
Publisher: Springer New York
Date: 2010
DOI: 10.1007/978-1-4419-5913-3_17
Abstract: Apart from the dimensionality problem, the uncertainty of Microarray data quality is another major challenge of Microarray classification. Microarray data contain various levels of noise, quite often high levels, and these data lead to unreliable and low-accuracy analysis as well as to the high dimensionality problem. In this paper, we propose a new Microarray data classification method, based on diversified multiple trees. The new method contains features that (1) make most use of the information from the abundant genes in the Microarray data and (2) use a unique diversity measurement in the ensemble decision committee. The experimental results show that the proposed classification method (DMDT) and the well-known method (CS4), which diversifies trees by using distinct tree roots, are more accurate on average than other well-known ensemble methods, including Bagging, Boosting, and Random Forests. The experiments also indicate that using the diversity measurement of DMDT improves the classification accuracy of ensemble classification on Microarray data.
Publisher: Oxford University Press (OUP)
Date: 07-12-2018
Publisher: Springer International Publishing
Date: 2015
Publisher: Springer Berlin Heidelberg
Date: 2010
Publisher: Institute of Electrical and Electronics Engineers (IEEE)
Date: 09-2019
Publisher: Oxford University Press (OUP)
Date: 30-01-2013
DOI: 10.1093/BIOINFORMATICS/BTT048
Abstract: Motivation: microRNAs (miRNAs) are known to play an essential role in the post-transcriptional gene regulation in plants and animals. Currently, several computational approaches have been developed with a shared aim to elucidate miRNA–mRNA regulatory relationships. Although these existing computational methods discover the statistical relationships, such as correlations and associations between miRNAs and mRNAs at data level, such statistical relationships are not necessarily the real causal regulatory relationships that would ultimately provide useful insights into the causes of gene regulations. The standard method for determining causal relationships is randomized controlled perturbation experiments. In practice, however, such experiments are expensive and time consuming. Our motivation for this study is to discover the miRNA–mRNA causal regulatory relationships from observational data. Results: We present a causality discovery-based method to uncover the causal regulatory relationship between miRNAs and mRNAs, using expression profiles of miRNAs and mRNAs without taking into consideration the previous target information. We apply this method to the epithelial-to-mesenchymal transition (EMT) datasets and validate the computational discoveries by a controlled biological experiment for the miR-200 family. A significant portion of the regulatory relationships discovered in data is consistent with those identified by experiments. In addition, the top genes that are causally regulated by miRNAs are highly relevant to the biological conditions of the datasets. The results indicate that the causal discovery method effectively discovers miRNA regulatory relationships in data. Although computational predictions may not completely replace intervention experiments, the accurate and reliable discoveries in data are cost effective for the design of miRNA experiments and the understanding of miRNA–mRNA regulatory relationships. 
Availability: The R scripts are in the Supplementary material. Contact: thuc_duy.le@mymail.unisa.edu.au or jiuyong.li@unisa.edu.au Supplementary information: Supplementary data are available at Bioinformatics online.
Publisher: Springer Science and Business Media LLC
Date: 09-07-2014
Publisher: Oxford University Press (OUP)
Date: 12-2020
DOI: 10.1093/BIOINFORMATICS/BTAA797
Abstract: Identifying cancer driver genes is a key task in cancer informatics. Most existing methods are focused on individual cancer drivers which regulate biological processes leading to cancer. However, the effect of a single gene may not be sufficient to drive cancer progression. Here, we hypothesize that there are driver gene groups that work in concert to regulate cancer, and we develop a novel computational method to detect those driver gene groups. We develop a novel method named DriverGroup to detect driver gene groups by using gene expression and gene interaction data. The proposed method has three stages: (i) constructing the gene network, (ii) discovering critical nodes of the constructed network and (iii) identifying driver gene groups based on the discovered critical nodes. Before evaluating the performance of DriverGroup in detecting cancer driver groups, we firstly assess its performance in detecting the influence of gene groups, a key step of DriverGroup. The application of DriverGroup to DREAM4 data demonstrates that it is more effective than other methods in detecting the regulation of gene groups. We then apply DriverGroup to the BRCA dataset to identify driver groups for breast cancer. The identified driver groups are promising as several group members are confirmed to be related to cancer in literature. We further use the predicted driver groups in survival analysis and the results show that the survival curves of patient subpopulations classified using the predicted driver groups are significantly differentiated, indicating the usefulness of DriverGroup. DriverGroup is available at vvhoang/DriverGroup. Supplementary data are available at Bioinformatics online.
Publisher: Oxford University Press (OUP)
Date: 23-04-2015
DOI: 10.1093/JAMIA/OCV004
Abstract: Objective: The Health Insurance Portability and Accountability Act Privacy Rule enables healthcare organizations to share de-identified data via two routes. They can either 1) show re-identification risk is small (e.g., via a formal model, such as k-anonymity) with respect to an anticipated recipient or 2) apply a rule-based policy (i.e., Safe Harbor) that enumerates attributes to be altered (e.g., dates to years). The latter is often invoked because it is interpretable, but it fails to tailor protections to the capabilities of the recipient. The paper shows rule-based policies can be mapped to a utility (U) and re-identification risk (R) space, which can be searched for a collection, or frontier, of policies that systematically trade off between these goals. Methods: We extend an algorithm to efficiently compose an R-U frontier using a lattice of policy options. Risk is proportional to the number of patients to which a record corresponds, while utility is proportional to similarity of the original and de-identified distribution. We allow our method to search 20 000 rule-based policies (out of 2^700) and compare the resulting frontier with k-anonymous solutions and Safe Harbor using the demographics of 10 U.S. states. Results: The results demonstrate the rule-based frontier 1) consists, on average, of 5000 policies, 2% of which enable better utility with less risk than Safe Harbor and 2) the policies cover a broader spectrum of utility and risk than k-anonymity frontiers. Conclusions: R-U frontiers of de-identification policies can be discovered efficiently, allowing healthcare organizations to tailor protections to anticipated needs and trustworthiness of recipients.
Publisher: Oxford University Press (OUP)
Date: 17-03-2011
Publisher: IEEE
Date: 12-2017
Publisher: Springer International Publishing
Date: 2018
Publisher: Oxford University Press (OUP)
Date: 23-07-2014
DOI: 10.1093/BIOINFORMATICS/BTU489
Abstract: Motivation: MicroRNAs (miRNAs) play crucial roles in complex cellular networks by binding to the messenger RNAs (mRNAs) of protein coding genes. It has been found that miRNA regulation is often condition-specific. A number of computational approaches have been developed to identify miRNA activity specific to a condition of interest using gene expression data. However, most of the methods only use the data in a single condition, and thus, the activity discovered may not be unique to the condition of interest. Additionally, these methods are based on statistical associations between the gene expression levels of miRNAs and mRNAs, so they may not be able to reveal real gene regulatory relationships, which are causal relationships. Results: We propose a novel method to infer condition-specific miRNA activity by considering (i) the difference between the regulatory behavior that an miRNA has in the condition of interest and its behavior in the other conditions and (ii) the causal semantics of miRNA–mRNA relationships. The method is applied to the epithelial–mesenchymal transition (EMT) and multi-class cancer (MCC) datasets. The validation by the results of transfection experiments shows that our approach is effective in discovering significant miRNA–mRNA interactions. Functional and pathway analysis and literature validation indicate that the identified active miRNAs are closely associated with the specific biological processes, diseases and pathways. More detailed analysis of the activity of the active miRNAs implies that some active miRNAs show different regulation types in different conditions, while others have the same regulation types and their activity differs across conditions only in the strength of regulation. Availability and implementation: The R and Matlab scripts are in the Supplementary materials. Contact: jiuyong.li@unisa.edu.au Supplementary information: Supplementary data are available at Bioinformatics online.
Publisher: Association for Computing Machinery (ACM)
Date: 09-03-2022
DOI: 10.1145/3508071
Abstract: Learning partial Bayesian network (BN) structure is an interesting and challenging problem. It is computationally expensive to use global BN structure learning algorithms when only one part of a BN structure is of interest, yet local BN structure learning algorithms are not a favourable solution either, due to the issue of false edge orientation. To address the problem, this article first presents a detailed analysis of the false edge orientation issue with local BN structure learning algorithms and then proposes PSL, an efficient and accurate Partial BN Structure Learning algorithm. Specifically, PSL divides V-structures in a Markov blanket (MB) into two types: Type-C V-structures and Type-NC V-structures, then it starts from the given node of interest and recursively finds both types of V-structures in the MB of the current node until all edges in the partial BN structure are oriented. To further improve the efficiency of PSL, the PSL-FS algorithm is designed by incorporating Feature Selection (FS) into PSL. Extensive experiments with six benchmark BNs validate the efficiency and accuracy of the proposed algorithms.
Publisher: ACM
Date: 07-12-2017
Publisher: IEEE
Date: 08-2010
Publisher: Springer International Publishing
Date: 2014
Publisher: Springer Science and Business Media LLC
Date: 26-11-2010
Publisher: Public Library of Science (PLoS)
Date: 04-2016
Publisher: Oxford University Press (OUP)
Date: 28-04-2022
DOI: 10.1093/BFGP/ELAC006
Abstract: Preeclampsia is a pregnancy-specific disease that can have serious effects on the health of both mothers and their offspring. Predicting which women will develop preeclampsia in early pregnancy with high accuracy will allow for improved management. The clinical symptoms of preeclampsia are well recognized, however, the precise molecular mechanisms leading to the disorder are poorly understood. This is compounded by the heterogeneous nature of preeclampsia onset, timing and severity. Indeed a multitude of poorly defined causes including genetic components implicates etiologic factors, such as immune maladaptation, placental ischemia and increased oxidative stress. Large datasets generated by microarray and next-generation sequencing have enabled the comprehensive study of preeclampsia at the molecular level. However, computational approaches to simultaneously analyze the preeclampsia transcriptomic and network data and identify clinically relevant information are currently limited. In this paper, we proposed a control theory method to identify potential preeclampsia-associated genes based on both transcriptomic and network data. First, we built a preeclampsia gene regulatory network and analyzed its controllability. We then defined two types of critical preeclampsia-associated genes that play important roles in the constructed preeclampsia-specific network. Benchmarking against differential expression, betweenness centrality and hub analysis we demonstrated that the proposed method may offer novel insights compared with other standard approaches. Next, we investigated subtype specific genes for early and late onset preeclampsia. This control theory approach could contribute to a further understanding of the molecular mechanisms contributing to preeclampsia.
Publisher: Institute of Electrical and Electronics Engineers (IEEE)
Date: 08-2016
Publisher: Springer International Publishing
Date: 2022
Publisher: IEEE
Date: 12-2013
Publisher: Springer Berlin Heidelberg
Date: 2008
Publisher: American Scientific Publishers
Date: 07-2011
Publisher: Springer Science and Business Media LLC
Date: 03-2017
Publisher: Association for Computing Machinery (ACM)
Date: 18-04-2021
DOI: 10.1145/3436891
Abstract: In this article, we aim to develop a unified view of causal and non-causal feature selection methods. The unified view will fill in the gap in the research of the relation between the two types of methods. Based on the Bayesian network framework and information theory, we first show that causal and non-causal feature selection methods share the same objective. That is to find the Markov blanket of a class attribute, the theoretically optimal feature set for classification. We then examine the assumptions made by causal and non-causal feature selection methods when searching for the optimal feature set, and unify the assumptions by mapping them to the restrictions on the structure of the Bayesian network model of the studied problem. We further analyze in detail how the structural assumptions lead to the different levels of approximations employed by the methods in their search, which then result in the approximations in the feature sets found by the methods with respect to the optimal feature set. With the unified view, we can interpret the output of non-causal methods from a causal perspective and derive the error bounds of both types of methods. Finally, we present practical understanding of the relation between causal and non-causal methods using extensive experiments with synthetic data and various types of real-world data.
Publisher: Wiley
Date: 12-04-2022
DOI: 10.1111/BJET.13217
Abstract: With the widespread use of learning analytics (LA), ethical concerns about fairness have been raised. Research shows that LA models may be biased against students of certain demographic subgroups. Although fairness has gained significant attention in the broader machine learning (ML) community in the last decade, it is only recently that attention has been paid to fairness in LA. Furthermore, the decision on which unfairness mitigation algorithm or metric to use in a particular context remains largely unknown. On this premise, we performed a comparative evaluation of some selected unfairness mitigation algorithms regarded in the fair ML community to have shown promising results. Using a 3‐year program dropout data from an Australian university, we comparatively evaluated how the unfairness mitigation algorithms contribute to ethical LA by testing for some hypotheses across fairness and performance metrics. Interestingly, our results show how data bias does not always necessarily result in predictive bias. Perhaps not surprisingly, our test for fairness‐utility tradeoff shows how ensuring fairness does not always lead to drop in utility. Indeed, our results show that ensuring fairness might lead to enhanced utility under specific circumstances. Our findings may to some extent, guide fairness algorithm and metric selection for a given context. What is already known about this topic LA is increasingly being used to leverage actionable insights about students and drive student success. LA models have been found to make discriminatory decisions against certain student demographic subgroups—therefore, raising ethical concerns. Fairness in education is nascent. Only a few works have examined fairness in LA and consequently followed up with ensuring fair LA models. What this paper adds A juxtaposition of unfairness mitigation algorithms across the entire LA pipeline showing how they compare and how each of them contributes to fair LA. 
Ensuring ethical LA does not always lead to a dip in performance. Sometimes, it actually improves performance as well. Fairness in LA has only focused on some form of outcome equality, however equality of outcome may be possible only when the playing field is levelled. Implications for practice and/or policy Based on desired notion of fairness and which segment of the LA pipeline is accessible, a fairness‐minded decision maker may be able to decide which algorithm to use in order to achieve their ethical goals. LA practitioners can carefully aim for more ethical LA models without trading significant utility by selecting algorithms that find the right balance between the two objectives. Fairness enhancing technologies should be cautiously used as guides—not final decision makers. Human domain experts must be kept in the loop to handle the dynamics of transcending fair LA beyond equality to equitable LA.
Publisher: Springer International Publishing
Date: 2018
Publisher: Springer Berlin Heidelberg
Date: 2012
Publisher: Association for Computing Machinery (ACM)
Date: 28-09-2020
DOI: 10.1145/3409382
Abstract: Feature selection is a crucial preprocessing step in data analytics and machine learning. Classical feature selection algorithms select features based on the correlations between predictive features and the class variable and do not attempt to capture causal relationships between them. It has been shown that the knowledge about the causal relationships between features and the class variable has potential benefits for building interpretable and robust prediction models, since causal relationships imply the underlying mechanism of a system. Consequently, causality-based feature selection has gradually attracted greater attentions and many algorithms have been proposed. In this article, we present a comprehensive review of recent advances in causality-based feature selection. To facilitate the development of new algorithms in the research area and make it easy for the comparisons between new methods and existing ones, we develop the first open-source package, called CausalFS, which consists of most of the representative causality-based feature selection algorithms (available at uiy/CausalFS). Using CausalFS, we conduct extensive experiments to compare the representative algorithms with both synthetic and real-world datasets. Finally, we discuss some challenging problems to be tackled in future research.
Publisher: IEEE
Date: 10-12-2020
Publisher: Elsevier BV
Date: 11-2018
Publisher: Springer International Publishing
Date: 2016
Publisher: Association for Computing Machinery (ACM)
Date: 30-09-2019
DOI: 10.1145/3359995
Publisher: Association for the Advancement of Artificial Intelligence (AAAI)
Date: 26-06-2023
Abstract: Estimating direct and indirect causal effects from observational data is crucial to understanding the causal mechanisms and predicting the behaviour under different interventions. Causal mediation analysis is a method that is often used to reveal direct and indirect effects. Deep learning shows promise in mediation analysis, but the current methods only assume latent confounders that affect treatment, mediator and outcome simultaneously, and fail to identify different types of latent confounders (e.g., confounders that only affect the mediator or outcome). Furthermore, current methods are based on the sequential ignorability assumption, which is not feasible for dealing with multiple types of latent confounders. This work aims to circumvent the sequential ignorability assumption and applies the piecemeal deconfounding assumption as an alternative. We propose the Disentangled Mediation Analysis Variational AutoEncoder (DMAVAE), which disentangles the representations of latent confounders into three types to accurately estimate the natural direct effect, natural indirect effect and total effect. Experimental results show that the proposed method outperforms existing methods and has strong generalisation ability. We further apply the method to a real-world dataset to show its potential application.
Publisher: Springer International Publishing
Date: 2014
Publisher: Institute of Electrical and Electronics Engineers (IEEE)
Date: 11-2011
Publisher: Oxford University Press (OUP)
Date: 03-05-2018
Publisher: Institute of Electrical and Electronics Engineers (IEEE)
Date: 12-2018
Publisher: Elsevier BV
Date: 07-2016
DOI: 10.1016/J.ARTMED.2016.06.002
Abstract: Prescribing cascade (PC) occurs when an adverse drug reaction (ADR) is misinterpreted as a new medical condition, leading to further prescriptions for treatment. Additional prescriptions, however, may worsen the existing condition or introduce additional adverse effects (AEs). Timely detection and prevention of detrimental PCs is essential as drug AEs are among the leading causes of hospitalization and deaths. Identifying detrimental PCs would enable warnings and contraindications to be disseminated and assist the detection of unknown drug AEs. Nonetheless, the detection is difficult and has been limited to case reports or case assessment using administrative health claims data. Social media is a promising source for detecting signals of detrimental PCs due to the public availability of many discussions regarding treatments and drug AEs. In this paper, we investigate the feasibility of detecting detrimental PCs from social media. The detection, however, is challenging due to the data uncertainty and data rarity in social media. We propose a framework to mine sequences of drugs and AEs that signal detrimental PCs, taking into account the data uncertainty and data rarity. We conduct experiments on two real-world datasets collected from Twitter and Patient health forum. Our framework achieves encouraging results in the validation against known detrimental PCs (F1=78% for Twitter and 68% for Patient) and the detection of unknown potential detrimental PCs (Precision@50=72% and NDCG@50=95% for Twitter, Precision@50=86% and NDCG@50=98% for Patient). In addition, the framework is efficient and scalable to large datasets. Our study demonstrates the feasibility of generating hypotheses of detrimental PCs from social media to reduce pharmacists' guesswork.
Publisher: Oxford University Press (OUP)
Date: 12-06-2017
DOI: 10.1093/BIOINFORMATICS/BTX378
Abstract: Identifying molecular cancer subtypes from multi-omics data is an important step in personalized medicine. We introduce CancerSubtypes, an R package for identifying cancer subtypes using multi-omics data, including gene expression, miRNA expression and DNA methylation data. CancerSubtypes integrates four main computational methods which are highly cited for cancer subtype identification and provides a standardized framework for data pre-processing, feature selection, and result follow-up analyses, including results computing, biology validation and visualization. The input and output of each step in the framework are packaged in the same data format, making it convenient to compare different methods. The package is useful for inferring cancer subtypes from an input genomic dataset, comparing the predictions from different well-known methods and testing new subtype discovery methods, as shown with different application scenarios in the Supplementary Material. The package is implemented in R and available under GPL-2 license from the Bioconductor website (ackages/CancerSubtypes/). Supplementary data are available at Bioinformatics online.
Publisher: Elsevier BV
Date: 07-2011
Publisher: Springer International Publishing
Date: 2015
Publisher: Association for Computing Machinery (ACM)
Date: 10-08-2023
DOI: 10.1145/3604560
Abstract: In multi-label learning, each instance is associated with multiple labels simultaneously. Multi-label data often have noisy, irrelevant, and redundant features of high dimensionality. Multi-label feature selection has received considerable attention as an effective means for dealing with high-dimensional multi-label data. Many multi-label feature selection methods exploit label correlations to help select features. However, finding label correlations and selecting features in existing multi-label feature selection methods are often two separate processes, the existence of noises and outliers in training data makes the label correlations exploited from label space less reliable. Therefore, the learned label correlations may mislead the feature selection process and result in the selection of less informative features. This article proposes a novel algorithm named ROAD, i.e., multi-label featuRe selectiOn via ADaptive label correlation estimation. ROAD jointly performs adaptive label correlation exploration and feature selection with alternating optimization to obtain reliable estimation of label correlations, which can more effectively reveal the intrinsic manifold structure among labels and lead to the selection of a more proper feature subset. Comprehensive experiments on several frequently used datasets validate the superiority of ROAD against the state-of-the-art multi-label feature selection algorithms.
Publisher: Institute of Electrical and Electronics Engineers (IEEE)
Date: 02-2017
Publisher: Oxford University Press (OUP)
Date: 24-03-2017
DOI: 10.1093/BIOINFORMATICS/BTX174
Abstract: Cancer is not a single disease and involves different subtypes characterized by different sets of molecules. Patients with different subtypes of cancer often react heterogeneously towards the same treatment. Currently, clinical diagnoses rather than molecular profiles are used to determine the most suitable treatment. A molecular level approach will allow a more precise and informed way for making treatment decisions, leading to a better survival chance and less suffering of patients. Although many computational methods have been proposed to identify cancer subtypes at molecular level, to the best of our knowledge none of them are designed to discover subtypes with heterogeneous treatment responses. In this article we propose the Survival Causal Tree (SCT) method. SCT is designed to discover patient subgroups with heterogeneous treatment effects from censored observational data. Results on TCGA breast invasive carcinoma and glioma datasets have shown that for each subtype identified by SCT, the patients treated with radiotherapy exhibit significantly different relapse free survival pattern when compared to patients without the treatment. With the capability to identify cancer subtypes with heterogeneous treatment responses, SCT is useful in helping to choose the most suitable treatment for individual patients. Data and code are available at github.com/WeijiaZhang24/SurvivalCausalTree. Supplementary data are available at Bioinformatics online.
Publisher: ACM
Date: 20-07-2023
Publisher: Elsevier BV
Date: 11-2016
Publisher: IEEE
Date: 06-2010
Publisher: Springer Science and Business Media LLC
Date: 10-05-2019
Publisher: Royal Society of Chemistry (RSC)
Date: 2016
DOI: 10.1039/C5MB00562K
Abstract: We present a causality based framework called mirSRN to infer miRNA synergism in human molecular systems.
Publisher: IEEE
Date: 08-2018
Publisher: IEEE
Date: 12-2014
Publisher: ACM
Date: 14-08-2021
Publisher: Cold Spring Harbor Laboratory
Date: 06-06-2018
DOI: 10.1101/340638
Abstract: microRNAs (miRNAs) regulate gene expression at the post-transcriptional level and they play an important role in various biological processes in the human body. Therefore, identifying their regulation mechanisms is essential for the diagnostics and therapeutics for a wide range of diseases. There have been a large number of studies which use gene expression profiles to resolve this problem. However, the current methods have their own limitations. Some of them only identify the correlation of miRNA and mRNA expression levels instead of the causal or regulatory relationships while others infer the causality but with a high computational complexity. To overcome these issues, in this study, we propose a method to identify miRNA-mRNA regulatory relationships in breast cancer using the invariant causal prediction. The key idea of invariant causal prediction is that the cause miRNAs of their target mRNAs are the ones which have persistent causal relationships with the target mRNAs across different environments. In this research, we aim to find miRNA targets which are consistent across different breast cancer subtypes. Thus, first of all, we apply the Pam50 method to categorise BRCA samples into different "environment" groups based on different cancer subtypes. Then we use the invariant causal prediction method to find miRNA-mRNA regulatory relationships across subtypes. We validate the results with the miRNA-transfected experimental data and the results show that our method outperforms the state-of-the-art methods. In addition, we also integrate this new method with the Pearson correlation analysis method and Lasso in an ensemble method to take the advantages of these methods. We then validate the results of the ensemble method with the experimentally confirmed data and the ensemble method shows the best performance, even comparing to the proposed causal method.
Functional enrichment analyses show that miRNAs in the regulatory relationships predicted by the proposed causal method tend to synergistically regulate target genes, indicating the usefulness of these methods, and the identified miRNA targets could be used in the design of wet-lab experiments to discover the causes of breast cancer. Cancer is a disease of cells in the human body and it causes a high rate of deaths worldwide. There has been evidence that non-coding RNAs are key players in the development and progression of cancer. Among the different types of non-coding RNAs, miRNAs, which are short non-coding RNAs, regulate gene expression and play an important role in different biological processes as well as various cancer types. To design better diagnostic and therapeutic plans for cancer patients, we need to know the roles of miRNAs in cancer initialisation and development, and their regulation mechanisms in the human body. In this study, we propose algorithms to identify miRNA-mRNA regulatory relationships in breast cancer. Comparing our methods with existing methods in predicting miRNA targets, our methods show a better performance. The estimated miRNA targets from our methods could be a potential source for further wet-lab experiments to discover the causes of breast cancer.
Publisher: Institute of Electrical and Electronics Engineers (IEEE)
Date: 2023
Publisher: Association for Computing Machinery (ACM)
Date: 04-10-2021
DOI: 10.1145/3466818
Abstract: A central question in many fields of scientific research is to determine how an outcome is affected by an action, i.e., to estimate the causal effect or treatment effect of an action. In recent years, in areas such as personalised healthcare, sociology, and online marketing, a need has emerged to estimate heterogeneous treatment effects with respect to individuals of different characteristics. To meet this need, two major approaches have been taken: treatment effect heterogeneity modelling and uplifting modelling. Researchers and practitioners in different communities have developed algorithms based on these approaches to estimate the heterogeneous treatment effects. In this article, we present a unified view of these two seemingly disconnected yet closely related approaches under the potential outcome framework. We provide a structured survey of existing methods following either of the two approaches, emphasising their inherent connections and using unified notation to facilitate comparisons. We also review the main applications of the surveyed methods in personalised marketing, personalised medicine, and sociology. Finally, we summarise and discuss the available software packages and source codes in terms of their coverage of different methods and applicability to different datasets, and we provide general guidelines for method selection.
Publisher: IEEE
Date: 07-2008
Publisher: Ivyspring International Publisher
Date: 2021
DOI: 10.7150/THNO.52670
Publisher: IEEE
Date: 12-2013
DOI: 10.1109/ICDMW.2013.7
Publisher: Springer Science and Business Media LLC
Date: 21-06-2018
Publisher: Elsevier BV
Date: 09-2014
Publisher: ACM
Date: 21-08-2005
Publisher: Elsevier BV
Date: 12-2018
Publisher: Springer International Publishing
Date: 2020
Publisher: Springer International Publishing
Date: 2022
Publisher: Association for Computing Machinery (ACM)
Date: 09-01-2016
DOI: 10.1145/2840720
Publisher: Springer Berlin Heidelberg
Date: 2013
Publisher: Springer International Publishing
Date: 2014
Start Date: 2015
End Date: 2018
Funder: Cooperative Research Centres, Australian Government Department of Industry
Start Date: 2017
End Date: 2019
Funder: Australian Research Council
Start Date: 2023
End Date: 12-2025
Amount: $420,000.00
Funder: Australian Research Council
Start Date: 06-2017
End Date: 06-2021
Amount: $381,000.00
Funder: Australian Research Council
Start Date: 2011
End Date: 12-2014
Amount: $300,000.00
Funder: Australian Research Council
Start Date: 12-2020
End Date: 12-2024
Amount: $360,000.00
Funder: Australian Research Council
Start Date: 06-2007
End Date: 06-2012
Amount: $165,000.00
Funder: Australian Research Council
Start Date: 06-2005
End Date: 03-2010
Amount: $112,514.00
Funder: Australian Research Council
Start Date: 03-2013
End Date: 11-2016
Amount: $350,000.00
Funder: Australian Research Council
Start Date: 06-2014
End Date: 12-2017
Amount: $270,000.00
Funder: Australian Research Council