Approximate structures for efficient processing of data streams. This project aims to increase the volume of streamed data that can be handled on a low-powered device with limited memory. In finance, health, and transport, data arrives at enormous rates, and data-driven decisions must be made quickly. Likewise, to keep Australia secure, national agencies monitor and gather vast data sets. Increasingly, devices and monitors that have limited resources are making these decisions, and they require computational techniques that run extremely efficiently. The project expects to develop and improve approximate data structures that operate in tight resource bounds. Anticipated outcomes are improved event recognition and dramatic speedups in the analysis of streams in areas such as finance, health, transport, and urban data.
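The abstract does not name specific structures, but a canonical example of an approximate structure that summarises a stream in tight, fixed memory is the count-min sketch. The following is only an illustrative sketch assuming string items, not the project's own design:

```python
import hashlib

class CountMinSketch:
    """Approximate frequency counts in fixed memory (width * depth counters)."""

    def __init__(self, width=1024, depth=4):
        self.width = width
        self.depth = depth
        self.table = [[0] * width for _ in range(depth)]

    def _hashes(self, item):
        # Derive `depth` differently-salted hash values from one digest.
        for row in range(self.depth):
            h = hashlib.blake2b(item.encode(), salt=row.to_bytes(8, "little")).digest()
            yield row, int.from_bytes(h[:8], "little") % self.width

    def add(self, item, count=1):
        for row, col in self._hashes(item):
            self.table[row][col] += count

    def estimate(self, item):
        # Never underestimates; overestimates only on hash collisions.
        return min(self.table[row][col] for row, col in self._hashes(item))
```

Memory use is fixed by `width * depth`, independent of how many items the stream contains, which is exactly the trade-off (bounded space for bounded error) that approximate stream structures exploit.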
Data retrieval from massive information structures. Information search is an essential tool. But most current services regard the data as unstructured collections of independent documents, free of context. Next-generation search applications, such as those over social networks, corporate websites, or XML data sets, must account for the inherent relationships between data items, and must allow the efficient inclusion of search context. Queries should favour semantically local data, giving results that depend on the perceived state of the querier. This project will develop indexing and search techniques for massive structured data sets. The new search methods will incorporate theoretical advances and will be experimentally validated using industry-standard open-source distributed systems.
Efficient and effective algorithms for searching strings in secondary storage. Pattern searching is fundamental to a wide range of computing applications, including web search and bioinformatics. In this project we will develop compression algorithms and hybrid memory-disk search structures that allow fast pattern matching on sequences of textual and numeric data, including when approximate search is required.
On effectively modelling and efficiently discovering communities from large networks. Finding and maintaining close-knit communities in very large, dynamically changing networks is both valuable and challenging. This project aims to develop new techniques to identify such communities as fast as possible by exploiting the rich semantics and individual relationships within the communities.
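A simple, widely used baseline for community discovery (offered only as illustration, not as this project's technique) is label propagation: every node repeatedly adopts the label most common among its neighbours, so labels pool inside densely connected groups.

```python
import random

def label_propagation(adj, seed=0, max_iters=100):
    """Community labels for an adjacency-list graph {node: [neighbours]}."""
    rng = random.Random(seed)
    labels = {node: node for node in adj}   # start with every node in its own community
    nodes = list(adj)
    for _ in range(max_iters):
        rng.shuffle(nodes)                  # randomised order avoids sweep artifacts
        changed = False
        for node in nodes:
            if not adj[node]:
                continue
            counts = {}
            for nb in adj[node]:
                counts[labels[nb]] = counts.get(labels[nb], 0) + 1
            best = max(counts, key=counts.get)
            # Switch only if a neighbour label strictly beats the current one.
            if best != labels[node] and counts[best] > counts.get(labels[node], 0):
                labels[node] = best
                changed = True
        if not changed:
            break
    return labels
```

It runs in near-linear time per pass, which is why it remains a common yardstick for the much faster and more robust methods large-network projects aim at.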
Approximate proximity for applications in data mining and visualization. Data mining, pattern recognition and visualization of relational information are all important data analysis techniques in which it is essential to determine which data points are in the vicinity of others. The huge size of the data sets involved and the need for real-time interaction preclude the use of conventional methods for the precise computation of the proximity information required. This project will develop efficient algorithms and data structures for gathering high-quality approximations of the full proximity information, and will use these innovations as the basis for new, practical tools for visualization, and clustering in data mining.
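One standard way to approximate proximity at scale (shown here purely as illustration, not as the project's method) is locality-sensitive hashing with random hyperplanes: nearby points tend to fall on the same side of each random hyperplane, so they share a bit signature and land in the same bucket, and only bucket-mates need be compared exactly.

```python
import random

def random_hyperplanes(dim, n_planes, seed=0):
    """Gaussian normal vectors defining n_planes random hyperplanes."""
    rng = random.Random(seed)
    return [[rng.gauss(0, 1) for _ in range(dim)] for _ in range(n_planes)]

def signature(point, planes):
    """Bit signature: which side of each hyperplane the point falls on."""
    return tuple(sum(x * w for x, w in zip(point, plane)) >= 0 for plane in planes)

def bucket_points(points, planes):
    """Group point indices by signature; proximity candidates share a bucket."""
    buckets = {}
    for idx, pt in enumerate(points):
        buckets.setdefault(signature(pt, planes), []).append(idx)
    return buckets
```

The probability that two points share a bit grows with the angle between them, so more planes sharpen the buckets at the cost of more candidate misses; practical systems use several hash tables to recover recall.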
Efficient Algorithms for In-memory Sorting, Searching and Indexing on Modern Multi-core Cache-based and Graphics Processor Architectures. This project clearly belongs to one of the national research priority goals, Smart Information Use. The copy-based techniques and work on sorting and searching will considerably impact the development of in-memory algorithms on cutting-edge computer architectures. Efficient suffix trees and suffix sorting have myriad applications in string processing and will be of high interest to bioinformatics companies. The sortdex project will develop novel algorithms that will be used by enterprise search engine companies to develop applications for libraries and organisations dealing with large databases. Algorithms that use the graphics processor as a co-processor have important applications in the high-growth field of computer graphics and games.
Algorithmic engineering and complexity analysis of protocols for consensus. Opinions, rankings, observations, votes, gene sequences, readings from sensor networks in security systems, outputs of climate models: massive datasets such as these, combined with the ability to share information at unprecedented speeds, make finding the most central representative, the Consensus Problem, extremely complex. This research delivers new insights and new, efficient algorithms.
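To make the Consensus Problem concrete: for equal-length strings under Hamming distance, the most central representative can be computed exactly by a per-position majority vote, since each position contributes independently to the total distance. This easy special case is illustrative only; for rankings and many other distance measures, consensus is computationally hard, which is what motivates the algorithmic work.

```python
from collections import Counter

def hamming_consensus(strings):
    """Consensus string minimising total Hamming distance to the inputs.

    Requires equal-length strings; takes the majority symbol at each position.
    """
    assert strings and all(len(s) == len(strings[0]) for s in strings)
    return "".join(Counter(column).most_common(1)[0][0] for column in zip(*strings))
```

For example, three noisy copies of a sequence recover the underlying original as long as each position is correct in a majority of the copies.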
Algorithms and data structures to support automated analysis of trajectory data. The emergence of a variety of tracking devices, surveillance systems and even electronic transaction and phone networks has resulted in the production of large amounts of positional information for vehicles, people and animals. The aim of the project is to develop tools that support automated analysis of such data sets.
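A basic primitive in automated trajectory analysis (an illustrative example, not necessarily part of this project) is polyline simplification: reducing a dense, noisy positional track to its salient points before further processing. A minimal sketch of the classic Ramer-Douglas-Peucker algorithm:

```python
import math

def perpendicular_distance(p, a, b):
    """Distance from point p to the line through segment endpoints a and b."""
    if a == b:
        return math.dist(p, a)
    (ax, ay), (bx, by), (px, py) = a, b, p
    cross = abs((bx - ax) * (ay - py) - (ax - px) * (by - ay))
    return cross / math.dist(a, b)

def douglas_peucker(points, epsilon):
    """Simplify a track, keeping points farther than epsilon from the chord."""
    if len(points) < 3:
        return list(points)
    # Find the point farthest from the chord joining the endpoints.
    index, dmax = 0, 0.0
    for i in range(1, len(points) - 1):
        d = perpendicular_distance(points[i], points[0], points[-1])
        if d > dmax:
            index, dmax = i, d
    if dmax <= epsilon:
        return [points[0], points[-1]]           # everything is near the chord
    left = douglas_peucker(points[:index + 1], epsilon)
    right = douglas_peucker(points[index:], epsilon)
    return left[:-1] + right                      # drop the duplicated split point
```

Shrinking a GPS track this way preserves turns and stops while discarding jitter, which keeps downstream analyses (map matching, clustering, similarity search) tractable on large fleets of tracked objects.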