Next-generation techniques for analysing massive data sets. To process enormous amounts of data, leading computing companies are turning to modern computing frameworks, for which little theory of efficient computational techniques has been developed. This project will resolve key theoretical questions and provide fast techniques for poorly understood pattern recognition and bioinformatics problems.
Devising tools for big data sets to support computational movement analysis. This project aims to devise practical fundamental algorithms and multi-purpose data structures with performance guarantees for big spatio-temporal data sets. Systematic analysis of trajectory data has been carried out since the 1950s, but recent technological advances have caused the size of the data sets to soar. Existing computational tools were developed for small to mid-size data sets. This project aims to devise practical fundamental algorithms that will enable the development of domain-specific tools for a wide range of applications, including sports, behavioural ecology, transport, and surveillance.
A probabilistic framework for nonlinear dimensionality reduction algorithms. The Twin Measures Framework is a novel platform for analysing existing dimensionality reduction methods and inventing new ones. This research will radically improve image analysis, with beneficial applications from pharmaceutical drug design through to border protection.
Novel data mining techniques for complex network analysis and control. This project will develop novel data mining theories and algorithms to analyse complex networks for safe information publishing and sharing across networks. It will enable smart information use in bioinformatics, social science and business intelligence, help protect against cybercrime and promote Australia's international research profile.
Mining multi-typed and dynamic graphs. Large volumes of data collected nowadays from real-world applications are often represented as graphs. The nodes and the edges of such graphs represent different types of entities and interactions, and they have time information. This project will develop algorithms that efficiently mine such multi-typed and dynamic graphs.
Data retrieval from massive information structures. Information search is an essential tool. But most current services regard the data as unstructured collections of independent documents, free of context. Next-generation search applications, such as over social networks, or corporate websites, or XML data sets, must account for the inherent relationships between data items, and must allow the efficient inclusion of search context. Queries should favour semantically local data, giving results that depend on the perceived state of the querier. This project will develop indexing and search techniques for massive structured data sets. The new search methods will incorporate theoretical advances and will be experimentally validated using industry-standard open-source distributed systems.
Homomorphic cryptography: computing on encrypted data. This project is driven by the groundbreaking applications of a new cryptographic technology that allows analysis of encrypted (scrambled) data without needing to decrypt (unscramble) it first. The results of this project can be used to enable secure remote data storage, electronic auctions and voting, and the protection of medical records.
Australian Laureate Fellowships - Grant ID: FL110100281
Funder: Australian Research Council
Funding Amount: $2,777,066.00
Summary:
Large-scale statistical machine learning. This research program aims to develop the science behind statistical decision problems as varied as web retrieval, genomic data analysis and financial portfolio optimisation. Advances will have a very significant practical impact in the many areas of science and technology that need to make sense of large, complex data streams.
Approximate structures for efficient processing of data streams. This project aims to increase the volume of streamed data that can be handled on a low-powered device with limited memory. In finance, health, and transport, data arrives at enormous rates, and data-driven decisions must be made quickly. Likewise, to keep Australia secure, national agencies monitor and gather vast data sets. Increasingly, devices and monitors that have limited resources are making these decisions and they require computational techniques that run extremely efficiently. The project expects to develop and improve approximate data structures that operate in tight resource bounds. Anticipated outcomes are improved event recognition and dramatic speedup in analysis of streams in areas such as finance, health, transport, and urban data.
Efficient and effective algorithms for searching strings in secondary storage. Pattern searching is fundamental to a wide range of computing applications, including web search and bioinformatics. In this project we will develop compression algorithms and hybrid memory-disk search structures that allow fast pattern matching on sequences of textual and numeric data, including when approximate search is required.