Investigation and Development of Parallel Large Scale Record Linkage Techniques. Record linkage aims at matching records of the same entity (like customer or patient) in large (administrative) databases. The outcomes of the proposed research will improve current techniques in terms of efficiency, accuracy and the need for human intervention. Through experimental studies and stochastic modelling the performance of traditional and new methods for data cleaning, standardisation and linkage will be assessed. The effect of the statistical dependency of attribute values will be studied. New methods using clustering for blocking large datasets, and predictive models including interaction terms will be implemented, analysed and evaluated on high-performance computers and office-based PC clusters.
Creating the social genome: Advanced techniques for linking dynamic data. This project aims to develop novel efficient and effective models and techniques that enable record linkage of large dynamic databases while preserving the privacy of sensitive personal data. Social genomes are the digital footprints of our society. They are the basis of population informatics, which is revolutionising how researchers in various domains conduct studies, governments plan services and expenditures, and businesses advertise and interact with their customers. A core requirement of population informatics is the linking of large dynamic databases that contain details about people from diverse sources. The expected outcomes of this project will provide novel solutions to the challenges of population informatics faced by Australian organisations.
Industrial Transformation Training Centres - Grant ID: IC200100022
Funder
Australian Research Council
Funding Amount
$4,883,406.00
Summary
ARC Training Centre for Information Resilience. The proposed centre aims at building workforce capacity in Australian organisations to create, protect and sustain agile data pipelines, capable of detecting and responding to failures and risks across the information value chain in which the data is sourced, shared, transformed, analysed and consumed. Building on strong foundations of responsible data science, the centre will bring together end-users, technology providers, and cutting-edge research, to lift the socio-technical barriers to data driven transformation and develop resilient data pipelines capable of delivering game-changing productivity gains that position Australian organisations at the forefront of technology leadership and value creation from data assets.