A Novel Approach to Semi-Supervised Statistical Machine Learning. Recent successes in the construction of classifiers for making diagnoses and predictions are due in part to their use of large amounts of data labelled with respect to class of origin. Typically, however, labelled data are scarce while unlabelled data are plentiful. The goal of semi-supervised learning (SSL) is to leverage large amounts of unlabelled data to improve on the performance achievable with only a small labelled dataset, so SSL is of paramount importance to applications where it is expensive or impractical to obtain much labelled data. The project is to develop a novel SSL approach that adopts a missingness mechanism for the missing labels to build a classifier whose accuracy is not only improved but can be greater than if the missing labels were known.
Advanced Mixture Models for the Analysis of Modern-Day Data. Extracting key information from huge data sets is critical to the scientific successes of the future. This project will develop novel mixture models that can be used directly to analyse complex and high-dimensional data sets that may consist of thousands of variables observed on only a limited number of entities. To handle the challenging problems arising in the latter situation, the project develops mixtures of factor models with options for skew distributions that can be used to analyse such data effectively. Key applications include the domains of bioinformatics, biostatistics, business, data mining, economics, finance, image analysis, marketing, and personalised medicine, among many others.
Joint clustering and matching of multivariate samples across objects. The project will provide a novel and very effective approach to the clustering of multivariate samples on objects, say patients, that automatically matches the sample clusters across the objects. A key application is the matching of biologically relevant cell subtypes across patients for use in the study and the clinical diagnosis and prognosis of cancer.
Expanding the role of mixture models in statistical analyses of big data. This project aims to develop theoretical procedures to scale inference and learning algorithms to analyse big data sets. It will develop analytic tools and algorithms to analyse big data sets which classical methods of inference cannot analyse directly due to the data’s complexity or size. This will accelerate the progress of scientific discovery and innovation, leading, for example, to new fields of inquiry; to an increase in understanding from studies on human and social processes and interactions; and to the promotion of economic growth and improved health and quality of life. Such applications should lead to breakthrough discoveries and innovation in science, engineering, medicine, commerce, education and national security.
A new approach to fast matrix factorization for the statistical analysis of high-dimensional data. Some form of dimension reduction is essential in order to extract meaningful information from huge data sets. For this purpose we provide a novel and very fast approach to the factorization of the data matrix. It has wide applicability for improving the quality and validity of research in science and medicine and in most industries in Australia.
Large-Scale Statistical Inference: Multiple Testing. Multiple testing procedures are among the most important statistical tools for the analysis of modern data. This project aims to develop new methods for providing more powerful simultaneous tests while controlling the proportion of false positive conclusions. The methods are to be derived by a novel pooling of information in individual attribute-based contrasts to produce a Weighted Individual attribute-Specific Contrast (WISC) based statistic. They will also exploit contextual information. They are expected to be of direct application to the problem of testing for no differences between two or more classes, as in the detection of differential expression in bioinformatics. Other key applications are expected to include biomedicine, economics, finance, genetics, and neuroscience.
Statistical methodology for events on a network, with application to road safety. This project develops new methods to analyse road traffic accident rates, aiming to identify accident black spots and to develop an evidence base for future road design and road safety management. These methods can be applied to other types of events on a network of roads, railways, rivers, electrical wires, communication networks or airline routes.
New Developments for Bayesian statistical models and computational methods. Bayesian methods of statistical analysis provide a flexible theory for addressing inference in the presence of uncertainty. Consequently, Bayesian methods have enabled scientific discovery in areas characterised as complex systems, where new developments in modelling and computational methods have been crucial. Significant barriers to further success involve challenges in formulating and validating models, dealing with large data sets, and developing efficient computational methods. The principal aim of this project is to develop new Bayesian modelling and computational methodology that addresses these challenges and has broad application.
Complex data, model selection and bootstrap inference. The project will provide new statistical methods and associated software for the analysis and modelling of complex data, as well as quality research training. This project will benefit researchers in statistics and users of statistics who encounter the complex data considered in this project and who need to model and make inferences from these data. Since these kinds of data arise in many areas (such as medicine, genetics and chemistry), Australia and Australian industry will ultimately benefit from the proposed research. The strengthening of international links and the training of highly skilled research scientists in an area of national importance will also benefit Australia.
Innovations in Bayesian likelihood-free inference. Bayesian inference is a statistical method of choice in applied science. This project will develop innovative tools which permit Bayesian inference in problems considered intractable only a few years ago. These methods will expedite advances in multidisciplinary research across a range of applications. With these foundations, this project will accelerate national research efforts into improving frameworks for projecting trends in water availability and management, the impact of climate extremes, telecommunications engineering, HIV and infectious disease modelling and biostatistics. With many sectors unable to recruit appropriately trained statisticians within Australia, this project will train four PhD students in Bayesian statistics.