Fast effective clustering technologies for highly dynamic massive networks. Clustering is a fundamental data mining and analysis task. In an interconnected evolving world, friendships and information flows are modelled as large dynamic networks. Structural clustering and correlation clustering are important and well-studied approaches for static networks; for evolving networks, where links appear and disappear over time, we lack efficient techniques. Anticipated outcomes are new practical clustering algorithms for dynamic networks – with performance guarantees of efficiency and clustering quality – and prototype software, guiding us to pick a good clustering. Expected benefits include better understanding of spread in evolving social networks, accelerating the software testing cycle, and improved topic detection.
Approximate structures for efficient processing of data streams. This project aims to increase the volume of streamed data that can be handled on a low-powered device with limited memory. In finance, health, and transport, data arrives at enormous rates, and data-driven decisions must be made quickly. Likewise, to keep Australia secure, national agencies monitor and gather vast data sets. Increasingly, devices and monitors that have limited resources are making these decisions, and they require computational techniques that run extremely efficiently. The project expects to develop and improve approximate data structures that operate in tight resource bounds. Anticipated outcomes are improved event recognition and dramatic speedup in analysis of streams in areas such as finance, health, transport, and urban data.
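As an illustrative sketch only (not this project's method), the Count-Min sketch is a classic example of an approximate structure that summarises a stream in a small fixed-size table: it answers frequency queries with bounded overestimation while using memory independent of the stream length. The width and depth values below are arbitrary illustrative choices.

```python
import hashlib

class CountMinSketch:
    """Approximate frequency counter over a stream, in fixed memory."""

    def __init__(self, width=272, depth=5):
        self.width = width    # counters per row: wider means smaller error
        self.depth = depth    # rows: more rows means lower failure probability
        self.table = [[0] * width for _ in range(depth)]

    def _indices(self, item):
        # One independent hash per row, derived by salting blake2b.
        for row in range(self.depth):
            h = hashlib.blake2b(item.encode(), digest_size=8,
                                salt=str(row).encode()).hexdigest()
            yield row, int(h, 16) % self.width

    def add(self, item):
        for row, col in self._indices(item):
            self.table[row][col] += 1

    def estimate(self, item):
        # Each row overestimates (hash collisions only add counts),
        # so the minimum over rows is the tightest upper bound.
        return min(self.table[row][col] for row, col in self._indices(item))
```

Because collisions can only inflate counters, the estimate never undercounts; the streaming literature quantifies the overestimate in terms of width and depth.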
Data retrieval from massive information structures. Information search is an essential tool. But most current services regard the data as unstructured collections of independent documents, free of context. Next-generation search applications, such as over social networks, or corporate websites, or XML data sets, must account for the inherent relationships between data items, and must allow the efficient inclusion of search context. Queries should favour semantically local data, giving results that depend on the perceived state of the querier. This project will develop indexing and search techniques for massive structured data sets. The new search methods will incorporate theoretical advances and will be experimentally validated using industry-standard open-source distributed systems.
Efficient and effective algorithms for searching strings in secondary storage. Pattern searching is fundamental to a wide range of computing applications, including web search and bioinformatics. In this project we will develop compression algorithms and hybrid memory-disk search structures that allow fast pattern matching on sequences of textual and numeric data, including when approximate search is required.
On effectively modelling and efficiently discovering communities from large networks. Finding and maintaining close communities in very large-scale, dynamically changing networks is an interesting and challenging problem. This project aims to develop new techniques to identify such communities as quickly as possible by exploiting the rich semantics and individual relationships within the communities.
Algorithms for Future-Proof Networks. This project will design algorithms to construct, augment and route on geometric graphs in the presence of obstacles. Such graphs have many real-world applications, including transport networks. This project aims to give solutions with hard guarantees on the timeliness of the delivery of the people, goods, or information being transported in these networks. Expected outcomes of this project include efficient and innovative algorithms for realistic geometric graphs, which will both advance knowledge in this field of computer science and make our existing networks more reliable. This should provide significant benefits in the maintenance and utilisation of the communication and transport networks we use every day.
Efficient Algorithms for In-memory Sorting, Searching and Indexing on Modern Multi-core Cache-based and Graphics Processor Architectures. This project clearly belongs to one of the national research priority goals, Smart Information Use. The copy-based techniques and work on sorting and searching will considerably impact the development of in-memory algorithms in cutting-edge computer architectures. Efficient suffix trees and suffix sorting have myriad applications in string processing and will be of high interest to bioinformatics companies. The sortdex project will develop novel algorithms that will be used by enterprise search engine companies to develop applications for libraries and organisations dealing with large databases. Algorithms using the graphics processor as a co-processor have important applications in the high-growth field of computer graphics and games.
XML Views of Relational Databases: Semantics and Update Problems. XML is the standard for representing, publishing and exchanging data over the Internet, and relational databases are the dominant technology for data management. Updating XML views over relational data is fundamental to bringing these two technologies together to serve Internet-based applications. Australia has been a leading country in both developing and applying internet technologies. The theoretical outcomes of this project will contribute to advances in the database and web research communities and establish us as an internationally leading group in this research area. The technological outcomes will help organisations in Australia conduct e-business on the Internet effectively and efficiently.
Attribution of Machine-generated Code for Accountability. Machine-generated (or neural) code is produced by AI tools to speed up software development. However, such code has recently raised serious security and privacy concerns. This project aims to attribute code to the generative model that produced it, for accountability purposes. In the process, a series of new techniques will be developed to differentiate between code generated by different models. The outcomes include analysis of neural code fingerprints, classification of neural code, and theories to verify the correctness of code attribution. These will provide significant benefits, ranging from copyright protection to privacy preservation. The project is timely, as the software community now uses neural code pervasively.
Searching Cohesive Subgraphs in Big Attributed Graph Data. The availability of big attributed graph data brings great opportunities for realising the value of that data. Making sense of such data has many applications, including in health, science, engineering, business, and the environment. A cohesive subgraph, one of the key components that captures the latent properties of a graph, is essential to graph analysis. This project aims to invent effective models of cohesive subgraphs and efficient algorithms for searching and monitoring cohesive subgraphs in big and dynamic attributed graphs, from both structure and attribute perspectives. The methods, techniques, and prototype systems developed in this project can be deployed to facilitate the smart use of big graph data across the nation.
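One classic cohesive-subgraph model, offered here only as an illustrative sketch of the idea (the project targets richer attributed models), is the k-core: the maximal subgraph in which every vertex keeps at least k neighbours inside the subgraph. It can be computed by repeatedly peeling low-degree vertices:

```python
from collections import defaultdict

def k_core(edges, k):
    """Return the vertex set of the k-core of an undirected graph,
    given as a list of (u, v) edges."""
    adj = defaultdict(set)
    for u, v in edges:
        adj[u].add(v)
        adj[v].add(u)
    # Peel vertices with fewer than k remaining neighbours until
    # every surviving vertex has degree >= k.
    changed = True
    while changed:
        changed = False
        for v in list(adj):
            if len(adj[v]) < k:
                for u in adj[v]:
                    if u in adj:
                        adj[u].discard(v)
                del adj[v]
                changed = True
    return set(adj)
```

For example, in a triangle with one pendant vertex attached, the 2-core is exactly the triangle; the pendant vertex is peeled away because it has only one neighbour.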