Approximate structures for efficient processing of data streams. This project aims to increase the volume of streamed data that can be handled on a low-powered device with limited memory. In finance, health, and transport, data arrives at enormous rates, and data-driven decisions must be made quickly. Likewise, to keep Australia secure, national agencies monitor and gather vast data sets. Increasingly, devices and monitors that have limited resources are making these decisions and they require computational techniques that run extremely efficiently. The project expects to develop and improve approximate data structures that operate in tight resource bounds. Anticipated outcomes are improved event recognition and dramatic speedup in analysis of streams in areas such as finance, health, transport, and urban data.
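As an illustration of the kind of approximate structure this line of work concerns (this sketch is not the project's own design; all names and parameters here are illustrative), a count-min sketch estimates item frequencies in a stream using memory far smaller than the stream itself:

```python
import hashlib

class CountMinSketch:
    """Approximate frequency counts for a stream in O(width * depth) memory."""
    def __init__(self, width=1024, depth=4):
        self.width, self.depth = width, depth
        self.table = [[0] * width for _ in range(depth)]

    def _cells(self, item):
        # One independent hash per row, derived by salting blake2b.
        for row in range(self.depth):
            digest = hashlib.blake2b(item.encode(), salt=bytes([row] * 8)).digest()
            yield row, int.from_bytes(digest[:8], "big") % self.width

    def add(self, item, count=1):
        for row, col in self._cells(item):
            self.table[row][col] += count

    def estimate(self, item):
        # Never undercounts; hash collisions can only inflate the estimate.
        return min(self.table[row][col] for row, col in self._cells(item))

sketch = CountMinSketch()
for event in ["buy", "sell", "buy", "hold", "buy"]:
    sketch.add(event)
print(sketch.estimate("buy"))  # >= 3; equals 3 unless hash collisions occur
```

The one-sided error is what makes such structures attractive on constrained devices: memory is fixed in advance, and accuracy degrades gracefully rather than the structure failing.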
Searching Cohesive Subgraphs in Big Attributed Graph Data. The availability of big attributed graph data brings great opportunities for realising the value of such data. Making sense of big attributed graph data finds many applications, including health, science, engineering, business, and the environment. A cohesive subgraph, one of the key components that capture the latent properties in a graph, is essential to graph analysis. This project aims to invent effective models of cohesive subgraphs and efficient algorithms for searching and monitoring cohesive subgraphs in big and dynamic attributed graphs from both structure and attribute perspectives. The methods, techniques, and prototype systems developed in this project can be deployed to facilitate the smart use of big graph data across the nation.
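One classic cohesive-subgraph model, used here purely as an illustration rather than as the project's own model, is the k-core: the maximal subgraph in which every vertex has at least k neighbours. A minimal peeling sketch:

```python
def k_core(adj, k):
    """Return the vertex set of the k-core of an undirected graph.

    adj maps each vertex to its set of neighbours. Vertices whose
    degree falls below k are peeled off repeatedly until none remain.
    """
    core = {v: set(nbrs) for v, nbrs in adj.items()}
    changed = True
    while changed:
        changed = False
        for v in [u for u, nbrs in core.items() if len(nbrs) < k]:
            for u in core[v]:
                if u in core:          # neighbour may already be peeled
                    core[u].discard(v)
            del core[v]
            changed = True
    return set(core)

graph = {
    "a": {"b", "c", "d"}, "b": {"a", "c", "d"},
    "c": {"a", "b", "d"}, "d": {"a", "b", "c", "e"},
    "e": {"d"},
}
print(k_core(graph, 3))  # the 3-core {'a', 'b', 'c', 'd'}; 'e' is peeled off
```

Attributed and dynamic variants, as targeted by the project, additionally constrain vertex attributes and must maintain the result under edge insertions and deletions.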
Modelling and Searching Cohesive Groups over Heterogeneous Graphs. Heterogeneous information networks (HINs) contain richer structural and semantic information represented as different types of objects and links. Searching cohesive groups from HINs finds many applications and also brings challenges at both conceptual and technical levels. This project aims to investigate the effective modelling of cohesive groups that takes both homogeneous and heterogeneous information into account for different applications, and to devise efficient algorithms for searching and monitoring those cohesive groups based on different models. The methods, techniques, and evaluation systems developed in this project can be deployed to facilitate the smart use of heterogeneous information networks across the nation.
Data retrieval from massive information structures. Information search is an essential tool. But most current services regard the data as unstructured collections of independent documents, free of context. Next-generation search applications, such as over social networks, or corporate websites, or XML data sets, must account for the inherent relationships between data items, and must allow the efficient inclusion of search context. Queries should favour semantically local data, giving results that depend on the perceived state of the querier. This project will develop indexing and search techniques for massive structured data sets. The new search methods will incorporate theoretical advances and will be experimentally validated using industry-standard open-source distributed systems.
On Effectively Answering Why and Why-not Questions in Databases. While the performance and functionality of database systems have improved dramatically, research on their usability still lags far behind, resulting in huge technical-support costs for organisations. This project aims to improve the usability of database systems by effectively answering users' why and why-not questions on query results. This project will invent a novel and generalised model for expressing both why and why-not questions, efficient strategies for answering questions over complex queries and databases, and novel solutions to scenarios that involve multiple queries. The project will contribute greatly to fundamental research in query refinement and deliver significant impact on related technology development.
Identifying and Tracking Influential Events in Large Social Networks. This project aims to invent a novel model and techniques for identifying and tracking influential events in large and dynamic social networks in real time. The proposed model would take into account the structure and content of social networks, and the influence of events. The project also plans to develop efficient strategies for identifying and tracking events in large and dynamic social network environments based on the model. In particular, the project plans to investigate flexible social network query methods to make users’ event search easy. Finally, the project plans to build an evaluation system to demonstrate the efficiency of the algorithms and the effectiveness of the model.
Biclique discovery in Big Data. This project aims to design biclique discovery algorithms for Big Data. A biclique is a popular graph model that can capture important cohesive structures in many applications. However, traditional biclique discovery algorithms, which only focus on simple, small-scale, static and deterministic data, are inadequate in the era of Big Data, where data has Variety (various formats), Volume (large quantity), Velocity (dynamic update) and Veracity (uncertainty). This project expects to benefit real applications in both public and private sectors and add value to Australian manufactured products.
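For illustration only (this is not the project's algorithm): in a bipartite graph, any set L of left vertices together with the common neighbourhood of L forms a biclique, which is the building block most enumeration algorithms exploit. The example graph and names below are hypothetical:

```python
def common_neighbours(adj, left_set):
    """Intersect the neighbour sets of the chosen left vertices.

    adj maps each left vertex to its set of right neighbours.
    (left_set, common_neighbours(adj, left_set)) is a biclique:
    every vertex in left_set connects to every vertex returned.
    """
    sets = [adj[v] for v in left_set]
    return set.intersection(*sets) if sets else set()

# Toy bipartite graph: users on the left, products on the right.
purchases = {
    "u1": {"p1", "p2", "p3"},
    "u2": {"p2", "p3"},
    "u3": {"p3", "p4"},
}
print(common_neighbours(purchases, {"u1", "u2"}))  # {'p2', 'p3'}
```

Scaling this idea to the four V's named above means pruning the exponential space of left-vertex subsets and maintaining results under updates and uncertainty.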
On effectively modelling and efficiently discovering communities from large networks. Finding and maintaining close communities from very large scale, dynamically changing networks is interesting and challenging. This project aims to develop new techniques to identify such communities as fast as possible through exploiting the rich semantics and individual relationships within the communities.
Privacy-preserving record linkage on multiple large databases. Record linkage has been recognised as a crucial infrastructure component in many information systems; however, privacy concerns commonly prevent the linking of databases that contain personal information. This project will develop techniques that will enable the linking of multiple large databases without revealing any private information.
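One widely studied building block for this kind of linkage (shown here only as an illustrative sketch, not as the project's method) encodes each name as character bigrams hashed into a Bloom-filter bit set, so parties can compare similarity of encodings without exchanging raw values:

```python
import hashlib

def bloom_encode(value, size=64, num_hashes=3):
    """Encode a string as a Bloom-filter bit set over character bigrams."""
    value = f"_{value.lower()}_"              # pad so edge characters count
    bigrams = {value[i:i + 2] for i in range(len(value) - 1)}
    bits = set()
    for gram in bigrams:
        for seed in range(num_hashes):
            digest = hashlib.sha256(f"{seed}:{gram}".encode()).digest()
            bits.add(int.from_bytes(digest[:4], "big") % size)
    return bits

def dice_similarity(a, b):
    """Dice coefficient of two bit sets; close to 1.0 for similar names."""
    return 2 * len(a & b) / (len(a) + len(b)) if a or b else 1.0

enc1 = bloom_encode("catherine")
enc2 = bloom_encode("katherine")
print(round(dice_similarity(enc1, enc2), 2))  # high despite the spelling difference
```

Because only hashed bit positions are shared, approximate matching survives typos and spelling variants while the raw identifiers stay with their owners (full PPRL schemes add further hardening against frequency attacks).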
Making sense of trajectory data: a database approach. This project investigates new challenges related to providing functionality, flexibility and efficiency for large scale trajectory data management and processing. The expected outcome includes significant technical contributions in novel indexing structures and advanced query processing methods for making better use of rich trajectory data.