Fairness in Natural Language Processing. Natural language processing (NLP) has achieved spectacular commercial successes in recent years, and has been deployed across an ever-increasing breadth of devices and application areas. At the same time, there is stark evidence that naively trained models amplify biases in their training data and perform inconsistently across text relating to different demographic groups. This project aims to systematically quantify the extent of such biases, and to develop models that are both more socially equitable and less prone to exposing private data in their learned representations. In doing so, it will make NLP more accessible to new populations of users and remove socio-technological barriers to NLP uptake.
Automated assessment of data quality in biological knowledge resources. This project aims to develop methods for identifying poor-quality data in biological databases. Research in biomedicine is underpinned by massive databases of biological data. Data quality is largely managed through manual curation, but automated methods to assess quality are critically needed. This project expects to develop a suite of computational tools for assessing biological data quality, utilising an innovative approach based on network analysis of database record connectivity. These tools will enable data quality to be quantified at scale, so that researchers, evidence-based decision-makers in biomedicine, and the analytical or predictive tools that draw on this data can make more reliable inferences and decisions.
Personalised topic modelling and sentiment analysis for enhanced information discovery over document streams. This project will develop personalised systems for discovering, navigating and managing online content for the creative industries, e.g. to help advertising agencies understand market trends, and to enable designers to discover and analyse information relating to new product concepts.
Natural language processing for automated validation of protein databases. The project aims to use natural language processing and information retrieval to reconcile and improve sources of biological information. Biological research has produced vast volumes of information about proteins, captured in structured resources (databases) and unstructured documents. However, the accuracy of much of this information is questionable. The project proposes to develop methods to validate data and reduce the dramatic inconsistencies in protein information resources by leveraging observed correlations and complementarity between them, and specifically through targeted fact extraction from the biomedical literature. These methods will be applied at scale across millions of published articles, to infer and validate functional information.
AI for Legal Problem Diagnosis in the Diverse Language of Australians. The number of Australians with unmet legal needs is estimated at over four million people per year and growing, and free legal assistance is severely under-resourced. A bottleneck for free legal assistance providers is determining what (if any) specific legal needs an individual has. This project proposes to develop AI models to semi-automate that process, with particular focus on fairness across users of all backgrounds, generalisation from small amounts of curated data, and dynamic interaction with the help-seeker. The project will help deliver legal assistance to some of the most vulnerable members of Australian society, and reinforce Australia's position as a world leader in AI for Law.
Talking about place: tapping human knowledge to enrich national spatial data sets. Place descriptions are a common way for people to describe a location, but no current tools are smart enough to understand them. Emergency call centres are risking lives, users of navigation and web services are frustrated, and addressing these problems costs billions of dollars per year. This project takes a novel, interdisciplinary approach to automatically interpreting human place descriptions, and will develop new methods to capture placenames together with their meanings, enabling smarter databases and automatic interpretation procedures. This acquired knowledge will be an important step forward for Australia's data custodians and users, and Australia's location information industry will gain a significant advantage in a highly competitive global market.