Fairness in Natural Language Processing. Natural language processing (NLP) has achieved spectacular commercial successes in recent years, and has been deployed across an ever-increasing breadth of devices and application areas. At the same time, there has been stark evidence to indicate that naively-trained models amplify biases in training data, and perform inconsistently across text relating to different demographic groupings of individuals. This project aims to systematically quantify the extent of such biases, and develop models that are both more socially equitable, as well as less prone to expose private data in the learned representations. In doing so, it will make NLP more accessible to new populations of users, and remove socio-technological barriers to NLP uptake.
Discovery Early Career Researcher Award - Grant ID: DE220100188
Funder
Australian Research Council
Funding Amount
$438,582.00
Summary
Generating Plots with Dialogue Based Executable Semantic Parsing. This project aims to address the limited abilities of dialogue systems by developing new models and data collection techniques. The project expects to address a major gap in Natural Language Processing using a model that generates computer code and updates it in response to user requests. Expected outcomes of this project include a system that interacts with a user in plain English to analyse data, and efficient methods of training the system with minimal expert input. This should provide significant benefits to research and business by broadening the accessibility and efficiency of data analysis, enabling faster and wiser decisions.
Automated assessment of data quality in biological knowledge resources. This project aims to develop methods for identifying poor quality data in biological databases. Research in biomedicine is underpinned by massive databases of biological data. Data quality is largely managed through manual curation, but automated methods to assess quality are critically needed. This project expects to develop a suite of computational tools for assessing biological data quality, utilising an innovative approach based on network analysis of database record connectivity. These tools will enable quantifying data quality at scale. Researchers, evidence-based decision-makers in biomedicine, and the analytical or predictive tools that use this data will make more reliable inferences and decisions.
Discovery Early Career Researcher Award - Grant ID: DE120100508
Funder
Australian Research Council
Funding Amount
$375,000.00
Summary
A framework for building dynamic knowledge bases in the biomedical domain. This project will provide clinicians and researchers with a semantics and time-aware technique, which will help them work together to build and maintain the knowledge required to support a better management and understanding of the mechanisms (for example, gene mutations) that affect diseases in any biomedical domain.
Discovery Early Career Researcher Award - Grant ID: DE120102900
Funder
Australian Research Council
Funding Amount
$375,000.00
Summary
WikiLinks: web-scale linking and fact extraction with Wikipedia. Wikipedia is the most popular web site for finding facts, but articles about local or specialist topics are often missing or unreliable. WikiLinks will use artificial intelligence to link names in text to corresponding Wikipedia articles, allowing us to automatically create and augment Wikipedia content by summarising existing material on the web.
Personalised topic modelling and sentiment analysis for enhanced information discovery over document streams. This project will develop personalised information discovery, navigation and management systems of online content for the creative industries, e.g. to help advertising agencies understand market trends, and enable designers to discover and analyse information relating to new product concepts.
Natural language processing for automated validation of protein databases. The project aims to use natural language processing and information retrieval to reconcile and improve sources of biological information. Biological research has produced vast volumes of information about proteins, captured in structured resources (databases) and unstructured documents. However, the accuracy of much of this information is questionable. The project proposes to develop methods to validate data and reduce the dramatic inconsistencies in protein information resources by leveraging observed correlations and complementarity between them, and specifically through targeted fact extraction from the biomedical literature. These methods will be applied at scale across millions of published articles, to infer and validate functional information.
AI for Legal Problem Diagnosis in the Diverse Language of Australians. The number of Australians with unmet legal needs is estimated to be over four million people per year and growing, and free legal assistance is severely under-resourced. A bottleneck for free legal assistance providers is the determination of what (if any) specific legal needs the individual has, to which end this project proposes to develop AI models to semi-automate the process, with particular focus on fairness across users of all backgrounds, generalisation from small amounts of curated data, and dynamic interaction with the help-seeker. The project will help deliver legal assistance to some of the most vulnerable members of Australian society, and reinforce Australia's position as a world leader in AI for Law.
Talking about place: tapping human knowledge to enrich national spatial data sets. Place descriptions are a common way for people to describe a location, but no current tools are smart enough to understand them. Emergency call centres are risking lives, users of navigation or web services are frustrated and addressing these problems costs billions of dollars per year. This project comes with a novel, interdisciplinary approach to automatically interpret human place descriptions and will develop novel methods to capture placenames with their meaning for smarter databases and automatic interpretation procedures. This acquired knowledge will be an important step forward for Australia's data custodians and users. Australia's location information industry will gain a significant advantage on a highly competitive global market.