Privacy-preserving cloud data mining-as-a-service. This project aims to explore practical privacy-preserving solutions for cloud data mining-as-a-service based on the Intel Software Guard Extensions (SGX) technology. The research addresses the privacy concerns of users who outsource their data mining needs to the cloud. These concerns have increased as more businesses evaluate data mining as an outsourced service due to a lack of expertise or computation resources. The expected outcomes from the research will include new data privacy models, new privacy-preserving data mining algorithms, and a prototype of cloud data mining software. These will help businesses cut costs for data mining and privacy protection, provide significant benefits toward helping Australia achieve its national cyber security strategy, and potentially provide economic impact from commercialisation of new software technology for the industry partner.
Cohort discovery and activity mining for policy impact prediction. This project aims to develop an intelligent systematic framework to predict policy impacts on Australian patients, by discovering inherent patient cohorts and assessing the impact of the policies on these cohorts. The proposed methods lay the theoretical foundations for building intelligent automated tools for policy assessment. Expected outcomes are data-driven patient group discovery, which could more precisely identify the patient cohorts most likely to benefit from a specific policy; and a model to predict the efficacy of policy options, which could increase the sustainability of the national health system by enabling smarter, more efficient policy decision-making.
Automatic speech-based assessment of mental state via mobile device. This project aims to create the first mobile, device-based automatic assessment of mental state from acoustic speech. Focusing on novel approaches for eliciting speech, for regression-based scoring of mental state and for longitudinal modelling of speech, the project takes speech processing out of the laboratory and into realistic environments. The project is significant because elicitation approaches and longitudinal modelling have been acknowledged by the research community as challenges that are valuable to investigate, and because conventional regression methods are sub-optimal on ordinal mental state scales. It is significant commercially because mobile devices allow individually tailored, frequent and low-cost mental state assessment. Expected outcomes will include commercial-ready technology, trialled on Australians and accessible to everyone with a mobile device, and a concentration of Australian research and development capability in a rapidly growing application area.
Mining large negative correlations for high-dimensional contrasting analysis. Negative correlations are widely embedded in real-life applications, but in-depth research on them has rarely been conducted due to their high level of complexity. This project aims to develop efficient algorithms and frontier theory for finding large negative correlations, to enable smart information use in bioinformatics and to promote Australia's leading role in data mining research.
Online Learning for Large Scale Structured Data in Complex Situations. Online Learning (OL) is the process of predicting answers for a sequence of questions. OL has enjoyed much attention in recent years due to its natural ability to process large-scale non-structured data and adapt to a changing environment. However, OL has three weaknesses: it does not scale for structured data; it often assumes that all of the data are equally important; and it often assumes that all of the data are complete and noise-free. These weaknesses limit its utility, because real data, such as those that must be analysed in social network processing and fraud detection, do not satisfy these restrictions. The aim of this project is to develop theoretical and practical advances in OL that overcome the existing weaknesses.
Coupling Learning in Big Data. Big data features complex coupling relationships within and between diverse entities in various forms and layers. This fundamentally challenges existing learning theories, which usually assume that data is independent and identically distributed (IID). This indicates that such IID tools may either be inapplicable for big data or capture an incomplete or even biased picture of the ground truth in big data. Hence, this project aims to invent breakthrough theories and effective tools for systematically modelling and learning sophisticated couplings embedded in big data applications. The outcomes are expected to enhance Australia's leading role in data science research and lift data intelligence-driven productivity and economic growth in a changing world.
Deep correction of DNA sequencing errors by data mining algorithms. This project aims to investigate the many layers of error correction problems in terabytes of genomic sequence data, and to solve these problems with novel data mining algorithms. High-throughput sequencing platforms have generated massive amounts of useful raw data, but have also introduced widespread errors. The new algorithms are intended to correct errors at deeper layers to further enhance data quality. Expected outcomes include knowledge advancement for the genomic data industry and interdisciplinary collaboration between biotechnology and data mining. This also provides significant benefit for genomic decisions in forensics and personalised medicine, which demand accurate genomic information.
Multiview Complete Space Learning for Sparse Camera Network Research. Data analytics in video surveillance and social computing is a problem because data are represented by multiple heterogeneous features. This project will develop a multiview complete space learning framework to exploit heterogeneous properties to represent images obtained from sparse camera networks. It will integrate multiple features to identify people and understand behaviour, to build a database of activities occurring in a wide area of surveillance. It will expand frontier technologies and safeguard Australia by providing warnings for hazardous (for example, overcrowding, trespassing), criminal, and terrorist situations. Results will be applicable internationally and enhance Australia’s role in machine learning and computer vision communities.
Next-generation techniques for analysing massive data sets. To process enormous amounts of data, leading computing companies are turning to modern computing frameworks, for which little theory of efficient computational techniques has been developed. This project will resolve key theoretical questions and provide fast techniques for poorly understood pattern recognition and bioinformatics problems.
Discovery Early Career Researcher Award - Grant ID: DE140100679
Funder: Australian Research Council
Funding Amount: $395,220.00
Summary:
Real-time query processing over multi-dimensional uncertain data streams. Real-time query processing of multi-dimensional uncertain data streams is fundamental in many applications such as environmental monitoring and location based services. This project aims to develop effective techniques to explore the massive multi-dimensional uncertain data streams in real time. The project will develop, analyse, implement and evaluate novel indexing and query processing techniques to effectively and efficiently support a set of primitive queries including rank-based queries, dominance-based queries and proximity-based queries. The results of this project will be an important complement to the development of data stream systems and will bring considerable social, economic and technological benefits to Australia.