Unsupervised learning of finite mixture models in data mining applications. The extraction of useful information from massively large databases is known as data mining. Its broad but vague goal is to find "interesting structure" in the data, which typically leads to breaking the data into clusters. To this end, we consider the fast, efficient, and automatic learning of finite mixture models in huge data sets without any prior knowledge of the structure. This probabilistic approach to the discovery and validation of group structure in data mining applications will considerably enhance knowledge management and decision support in science, industry, and government.
Australian Laureate Fellowships - Grant ID: FL140100012
Funder
Australian Research Council
Funding Amount
$2,830,000.00
Summary
Stress-testing algorithms: generating new test instances to elicit insights. This project aims to develop a new paradigm in algorithm testing, creating novel test instances and tools to elicit insights into algorithm strengths and weaknesses. Such advances are urgently needed to support good research practice in academia, and to avoid disasters when deploying algorithms in practice. Extending our recent work in algorithm testing for combinatorial optimisation, described as 'ground-breaking,' this project aims to tackle the challenges needed to generalise the paradigm to other fields such as machine learning, forecasting, software testing, and other branches of optimisation. An online repository of test instances and tools aims to provide a valuable resource to improve research practice and support new insights into algorithm performance.
Complex data, model selection and bootstrap inference. The project will provide new statistical methods and associated software for the analysis and modelling of complex data, as well as quality research training. This project will benefit researchers in statistics and users of statistics who encounter the complex data considered in this project and who need to model and make inferences from these data. Since these kinds of data arise in many areas (such as medicine, genetics and chemistry), Australia and Australian industry will ultimately benefit from the proposed research. The strengthening of international links and the training of highly skilled research scientists in an area of national importance will also benefit Australia.
Innovations in Bayesian likelihood-free inference. Bayesian inference is a statistical method of choice in applied science. This project will develop innovative tools which permit Bayesian inference in problems considered intractable only a few years ago. These methods will expedite advances in multidisciplinary research across a range of applications. With these foundations, this project will accelerate national research efforts into improving frameworks for projecting trends in water availability and management, the impact of climate extremes, telecommunications engineering, HIV and infectious disease modelling and biostatistics. With many sectors unable to recruit appropriately trained statisticians within Australia, this project will train four PhD students in Bayesian statistics.
Statistical methods for analysing multi-source microarray data and building gene regulatory networks. I will devise a statistical learning technique that does not force a gene to be assigned to exactly one category. This technique reflects the biological reality that a gene can belong to two or more functional categories. Therefore, the new technique will improve a model's ability to identify regulatory genes in different types of cancer; these regulatory genes can be targeted by new anti-cancer drugs, resulting in more effective treatment. I will model gene regulatory networks using microarray data from multiple sources. These networks will be used to identify regulatory cliques, which are groups of genes that are vital for a cellular function. This will improve our understanding of debilitating conditions such as asthma.
Modelling mean and dispersion using fixed and random effects. The aims of the project are to develop methods for joint mean and dispersion modelling using fixed and random effects, in the generalized linear models context and for Gaussian longitudinal data. The significance is the more efficient, precise and appropriate analysis of data arising from many areas of application. The expected outcomes are therefore better methods of analysis, software to carry the analyses out, and potentially important results in applications.
Bootstrap methods for data with multiple errors. This project will provide new methods for data analysis and quality research training. The results will benefit researchers in statistics and users of statistics who encounter data with multiple errors and who need to make inferences from these data. The many areas from which such data arise (including medicine, genetics, chemistry, education and social surveys) mean that Australia and Australian industry will also ultimately benefit from the research. The strengthening of international links and the training of highly skilled researchers will also benefit the Australian community.
Prof Speed is a statistician specializing in bioinformatics and computational biology, applying these skills in support of basic research in molecular and cell biology and genetics.
International Networks in Applied Bayesian Statistics: improving Australia's knowledge through intelligent data analysis and modelling. National benefits of this project are fourfold: (i) new international networks between Australia, Southern Africa, France and USA in the priority area of mathematical sciences; (ii) state-of-the-art Bayesian statistical methods for integrating and analyzing non-standard data and diverse information sources, including expert opinion, in order to solve complex problems in environment, industry, health, defence; (iii) direct contribution to solution of global environmental problems, specifically water quality, threatened species and environmental risk; (iv) superior training of the next generation of the global community of researchers in applied statistics.
I am a statistician specializing in bioinformatics and computational biology, applying my skills in support of basic research in molecular and cell biology and genetics.