Personalised topic modelling and sentiment analysis for enhanced information discovery over document streams. This project will develop personalised information discovery, navigation and management systems of online content for the creative industries, e.g. to help advertising agencies understand market trends, and enable designers to discover and analyse information relating to new product concepts.
Fairness in Natural Language Processing. Natural language processing (NLP) has achieved spectacular commercial successes in recent years, and has been deployed across an ever-increasing breadth of devices and application areas. At the same time, there has been stark evidence to indicate that naively-trained models amplify biases in training data, and perform inconsistently across text relating to different demographic groupings of individuals. This project aims to systematically quantify the extent of such biases, and develop models that are both more socially equitable and less prone to exposing private data in the learned representations. In doing so, it will make NLP more accessible to new populations of users, and remove socio-technological barriers to NLP uptake.
Explaining the outcomes of complex computational models. This project aims to develop new algorithms that automatically generate explanations for the results produced by complex computational models. In recent times, these models have become increasingly accurate, and hence pervasive. However, the reasoning of Deep Neural Networks and Bayesian Networks, and of complex Regression models and Decision Trees, is often unclear, impairing effective decision making by practitioners who use the results of these models or investigate the decisions made by the systems. Practical benefits of clear decision-making reasoning by complex computational models include reduced risk, increased productivity and revenue, appropriate adoption of technologies including improved education for practitioners, and improved outcomes for end users. Significant benefits will be demonstrated through evaluations with practitioners in the areas of healthcare and energy.
Information access through web-scale question-answer pair finding, ranking and matching. This project will aim to take web search to a new level of sophistication in accepting queries in the form of complex natural language questions, and returning a ranked list of natural language answers automatically extracted from a broad range of web user forums.
Automated assessment of data quality in biological knowledge resources. This project aims to develop methods for identifying poor quality data in biological databases. Research in biomedicine is underpinned by massive databases of biological data. Data quality is largely managed through manual curation, but automated methods to assess quality are critically needed. This project expects to develop a suite of computational tools for assessing biological data quality, utilising an innovative approach based on network analysis of database record connectivity. These tools will enable quantifying data quality at scale. Researchers, evidence-based decision-makers in biomedicine, and the analytical or predictive tools that use this data will make more reliable inferences and decisions.
Learning Deep Semantics for Automatic Translation between Human Languages. This project seeks to integrate deep linguistics and deep learning to improve translation quality. The modern world relies increasingly on automatic translation of human languages to deal with billions of documents. Current translation systems struggle with complex texts and often produce misleading or incoherent outputs. Furthermore, they translate sentences independently and ignore their overall document-wide context. This project seeks to address these issues by developing a new approach using semantics – the underlying meaning of the text – to drive translation, both as discrete structures and continuous representations learned via deep learning. This may improve translation quality, thereby improving automatic translation for end-users.
Exploiting Context in Multilingual Understanding and Generation. Automatic translation technologies produce incoherent and incorrect outputs in critical areas, such as health, finance, and law. This is due to translating sentences independently, without regard to the global extra-sentential context and rich linguistic structures inherent in the wider document context. This project aims to exploit global linguistic structures, capitalising on recent advances in deep neural networks, in order to generate coherent and faithful text. Expected outcomes include next-generation computational technologies for language understanding and generation. This should significantly benefit document-based language technologies and increase their applications in a range of cultural, industrial, and health settings.
Adaptive Context-Dependent Machine Translation for Heterogeneous Text. While automatic machine translation technologies are undoubtedly useful to a wide range of users, they often produce incoherent outputs for many types of input, for example, medical, literary, or even conversational text. This project will develop new adaptive machine translation systems to handle many domains and text styles, including heterogeneous mixed-domain inputs. It will develop multi-task machine learning methods for training collections of domain-specific translation systems while leveraging correlations between domains. This approach will reduce the big data requirements of current translation systems, and improve translation quality across a wide range of different language pairs and application domains.
Natural language processing for automated validation of protein databases. The project aims to use natural language processing and information retrieval to reconcile and improve sources of biological information. Biological research has produced vast volumes of information about proteins, captured in structured resources (databases) and unstructured documents. However, the accuracy of much of this information is questionable. The project proposes to develop methods to validate data and reduce the dramatic inconsistencies in protein information resources by leveraging observed correlations and complementarity between them, and specifically through targeted fact extraction from the biomedical literature. These methods will be applied at scale across millions of published articles, to infer and validate functional information.
AI for Legal Problem Diagnosis in the Diverse Language of Australians. The number of Australians with unmet legal needs is estimated to be over four million people per year and growing, and free legal assistance is severely under-resourced. A bottleneck for free legal assistance providers is determining what (if any) specific legal needs the individual has. This project proposes to develop AI models to semi-automate that process, with particular focus on fairness across users of all backgrounds, generalisation from small amounts of curated data, and dynamic interaction with the help-seeker. The project will help deliver legal assistance to some of the most vulnerable members of Australian society, and reinforce Australia's position as a world leader in AI for Law.