Reliable and accurate statistical solutions for modern complex data. This project aims to develop novel methods for reliable and accurate statistical modelling with modern, complex correlated and error-prone data. The project expects to make significant strides towards future-proofing statistical data analysis, equipping practitioners with a suite of robust and computationally efficient methods which provide confidence in the stability and reproducibility of results obtained, while offering guarantees on their transferability over a range of populations. This will provide important benefits as they are applied in predicting endangered marine species for fisheries conservation, and in enhancing our national understanding of the relationship between education achievement and financial success.
Discovery Early Career Researcher Award - Grant ID: DE240101190
Funder
Australian Research Council
Funding Amount
$451,000.00
Summary
Innovating and Validating Scalable Monte Carlo Methods. This project aims to develop innovative scalable Monte Carlo methods for statistical analysis in the presence of big data or complex mathematical models. Existing approaches to scalable Monte Carlo are only approximate, and their inaccuracies are difficult to quantify. This can have a detrimental impact on data-based decision making. The expected outcomes of this project are scalable Monte Carlo methods that are more accurate, fast and capable of quantifying inaccuracies. Scientists and decision-makers will benefit from the ability to obtain timely, reliable insights for challenging applications.
Stochastic majorization-minimization algorithms for data science. The changing nature of data acquisition and storage has made the process of drawing inference infeasible with traditional statistical and machine learning methods. Modern data are often acquired incrementally and in real time, and are often available in too large a volume to process on conventional machinery. The project proposes to study the family of stochastic majorisation-minimisation algorithms for computation of inferential quantities in an incremental manner. The proposed stochastic algorithms encompass and extend a wide variety of current algorithmic frameworks for fitting statistical and machine learning models, and can be used to produce feasible and practical algorithms for complex models, both current and future.
Surveillance and sampling to maintain absence of pests and diseases. This project aims to develop empirically validated statistical and mathematical methods for industry and government to deliver more efficient biosecurity surveillance programs. The project endeavours to enhance biosecurity at the border and within Australia, while minimising the costs and burden of testing. Expected project outcomes include effective surveillance and sampling for high-priority threats, accessible software for decision-makers, and generalisable approaches to address rapidly increasing biosecurity risks. Significant benefits include maintaining absence of key pathogens and pests in Australia.
A Novel Approach to Semi-Supervised Statistical Machine Learning. Recent successes in the construction of classifiers for making diagnoses and predictions are due in part to their use of large amounts of data labelled with respect to their class of origin. Typically, however, labelled data are scarce while unlabelled data are plentiful. The goal of semi-supervised learning (SSL) is to leverage large amounts of unlabelled data to improve performance when only small labelled datasets are available, so SSL is of paramount importance to applications where it is expensive or impractical to obtain much labelled data. The project is to develop a novel SSL approach that adopts a missingness mechanism for the missing labels to build a classifier whose accuracy is not only improved but can be greater than if the missing labels were known.
Technology-Driven and Scalable Regression Methodology, Computing and Theory. Regression is a mainstay of data analysis, statistics, machine learning and data science but is in continual need of enhancement in the face of technological change. Scalability and flexibility for the handling of non-linear signals are fundamental to the practical utility of new regression methodology. Several streams of research aimed at confronting data from specific technologies as well as generic types of data are proposed. The project is to be networked with researchers in the United States of America and aims to have Australia-based researchers providing leadership in terms of methodological, theoretical, computational and software development.
Self-Interacting Random Walks. This project aims to study the growth properties of a class of self-interacting processes defined on Euclidean lattices. This project expects to determine whether a shape theorem holds for once-reinforced random walks, and establish conditions for their recurrence/transience. It also expects to obtain new and very precise estimates for the local time of simple random walks. Expected outcomes of this project include solving long-standing open problems in the field of reinforced random walks, and the development of novel methods for their study. This should provide significant benefits not only to the field of mathematics, but also to the myriad of applied disciplines where self-interacting processes are utilised.
Mitigating bias in statistical analyses of data collected over time. This project aims to develop innovative nonparametric distribution and regression curve estimation techniques from data collected over time. These curves are key statistical tools for describing populations, but often, their estimators are inefficient when the data are massive, growing and changing over time, or too restrictive when the data exhibit measurement errors and a fraction of them are equal to zero. The project expects to develop novel, less restrictive and more realistic nonparametric curve estimation methods in these complex settings. Outcomes include new practical statistical methods and software to benefit experts in diverse fields, from nutrition and epidemiology to environmental science and digital platforms, amongst others.
Modern statistical methods for clustering community ecology data. This project will develop statistical methods and software for clustering community ecology data, and use them to analyse systematic survey and citizen science program data collected along the Great Barrier Reef. By doing so, the project will address the dearth of statistical classification techniques for high-dimensional, multi-response data with complex relationships. When the resultant clustering methods are used to construct bioregions and characterise species’ environmental responses, they should significantly enhance evaluations of the impact of human activity and environmental change on coral diversity. Ultimately, these evaluations can underpin future decisions in the conservation and management of the Great Barrier Reef.