ARDC Research Link Australia

ORCID Profile
0000-0001-6713-7667

Current Organisations
CSIRO, Macquarie University, Australian National University, Data61

Publications

Publication

An Overview of Big Data Issues in Privacy-Preserving Record Linkage

Publisher: Springer International Publishing

Date: 2019

DOI: 10.1007/978-3-030-19759-9_8

Publication

P-Signature-Based Blocking to Improve the Scalability of Privacy-Preserving Record Linkage

Publisher: Springer International Publishing

Date: 2020

DOI: 10.1007/978-3-030-66172-4_3

Publication

Privacy risk quantification in education data using Markov model

Publisher: Wiley

Date: 25-04-2022

DOI: 10.1111/BJET.13223

Abstract: With the Big Data revolution, the education sector is being reshaped. The current data‐driven education system provides many opportunities to utilize the enormous amount of collected data about students' activities and performance for personalized education, adapting teaching methods, and decision making. On the other hand, such benefits come at a cost to privacy, for example, the identification of a student's poor performance across multiple courses. While several works have been conducted on quantifying the re‐identification risks of individuals in released datasets, they assume an adversary's prior knowledge about target individuals. Most of them do not utilize all the available information in the datasets, for example, event‐level information that associates multiple records to the same individual and correlation between attributes. In this work, we propose a method using a Markov Model (MM) to quantify re‐identification risks using all available information in the data under a more realistic threat model that assumes different levels of an adversary's knowledge about the target individual, ranging from any one of the attributes to all given attributes. Moreover, we propose a workflow for efficiently calculating MM risk which is highly scalable to a large number of attributes. Experimental results from real education datasets show the efficacy of our model for re‐identification risk.

What is already known about this topic? There have been a number of works conducted on privacy risk quantification in datasets and on the Web. The majority of them make strong assumptions about an adversary's prior knowledge of the target individual(s). Most of them do not utilize all the available information in the datasets, eg, event‐level or duplicate records and correlation between attributes.

What this paper adds? This paper proposes a new re‐identification risk quantification model using Markov models. Our model addresses the shortcomings of existing works, eg, strong assumptions about an adversary's knowledge, unexplainable models, and not utilizing available information in the datasets. Specifically, our proposed model not only focuses on the uniqueness of data points in the datasets (as most of the other existing methods do), but also takes into account uniformity and correlation characteristics of these data points. Re‐identification risk quantification is computationally expensive and is not scalable to large datasets with an increasing number of attributes. This paper introduces a workflow for data custodians to use to efficiently evaluate the worst‐case re‐identification risk in their datasets before releasing them. It presents extensive experimental evaluation results of the proposed model for quantifying re‐identification risks on several real education datasets.

Implications for practice and/or policy? Empirical results on real education datasets validate the significance and efficacy of the proposed model for re‐identification risk quantification compared to existing approaches. Our model can be used by data custodians as a tool to evaluate the worst‐case risk of a dataset. It empowers data custodians to make informed decisions on appropriate actions to mitigate these risks (eg, data perturbation) before sharing or releasing their datasets to third parties. A typical use case would be one where the data custodian is an online course/program provider, which collects data about students' engagement with its courses and would like to share it with third parties for them to run learning analytics that would provide value‐added benefits back to the data custodian. We specifically study privacy risk quantification for education data; however, our model is applicable to any tabular data release.

Publication

Precise and Fast Cryptanalysis for Bloom Filter Based Privacy-Preserving Record Linkage

Publisher: Institute of Electrical and Electronics Engineers (IEEE)

Date: 11-2019

DOI: 10.1109/TKDE.2018.2874004

Publication

Hyper-Parameter Optimization for Privacy-Preserving Record Linkage

Publisher: Springer International Publishing

Date: 2020

DOI: 10.1007/978-3-030-65965-3_18

Publication

Privacy-Preserving Record Linkage for Cardinality Counting

Publisher: ACM

Date: 10-07-2023

DOI: 10.1145/3579856.3590338

Publication

Adversarial Attacks on Mobile Malware Detection

Publisher: IEEE

Date: 02-2019

DOI: 10.1109/AI4MOBILE.2019.8672711

Publication

A Scalable and Efficient Subgroup Blocking Scheme for Multidatabase Record Linkage

Publisher: Springer International Publishing

Date: 2018

DOI: 10.1007/978-3-319-93040-4_2

Publication

A Privacy-Preserving Framework based Blockchain and Deep Learning for Protecting Smart Power Networks

Publisher: Institute of Electrical and Electronics Engineers (IEEE)

Date: 08-2019

DOI: 10.1109/TII.2019.2957140

Publication

P4Mobi: A Probabilistic Privacy-Preserving Framework for Publishing Mobility Datasets

Publisher: Institute of Electrical and Electronics Engineers (IEEE)

Date: 07-2020

DOI: 10.1109/TVT.2020.2994157

Publication

Local Differentially Private Fuzzy Counting in Stream Data Using Probabilistic Data Structures

Publisher: Institute of Electrical and Electronics Engineers (IEEE)

Date: 2022

DOI: 10.1109/TKDE.2022.3198478

Publication

Privacy-Preserving Release of Energy Data

Publisher: CSIRO

Date: 2019

DOI: 10.25919/5D2635B90A8E4

Publication

Privacy-Preserving Schemes for Safeguarding Heterogeneous Data Sources in Cyber-Physical Systems

Publisher: Institute of Electrical and Electronics Engineers (IEEE)

Date: 2021

DOI: 10.1109/ACCESS.2021.3069737

Publication

Feature-Based Adversarial Attacks Against Machine Learnt Mobile Malware Detectors

Publisher: IEEE

Date: 25-11-2020

DOI: 10.1109/ITNAC50341.2020.9315144

Publication

Incremental clustering techniques for multi-party Privacy-Preserving Record Linkage

Publisher: Elsevier BV

Date: 07-2020

DOI: 10.1016/J.DATAK.2020.101809

Publication

An Evaluation Framework for Privacy-Preserving Record Linkage

Publisher: Journal of Privacy and Confidentiality

Date: 06-2014

DOI: 10.29012/JPC.V6I1.636

Abstract: Privacy-preserving record linkage (PPRL) addresses the problem of identifying matching records from different databases that correspond to the same real-world entities using quasi-identifying attributes (in the absence of unique entity identifiers), while preserving privacy of these entities. Privacy is being preserved by not revealing any information that could be used to infer the actual values about the records that are not reconciled to the same entity (non-matches), and any confidential or sensitive information (that is not agreed upon by the data custodians) about the records that were reconciled to the same entity (matches) during or after the linkage process. The PPRL process often involves three main challenges, which are scalability to large databases, high linkage quality in the presence of data quality errors, and sufficient privacy guarantees. While many solutions have been developed for the PPRL problem over the past two decades, an evaluation and comparison framework of PPRL solutions with standard numerical measures defined for all three properties (scalability, linkage quality, and privacy) of PPRL has so far not been presented in the literature. We propose a general framework with normalized measures to practically evaluate and compare PPRL solutions in the face of linkage attack methods that are based on an external global dataset. We conducted experiments of several existing PPRL solutions on real-world databases using our proposed evaluation framework, and the results show that our framework provides an extensive and comparative evaluation of PPRL solutions in terms of the three properties.

Publication

DLforum – A multidisciplinary online discussion forum for data linkage researchers and practitioners

Publisher: Swansea University

Date: 20-02-2018

DOI: 10.23889/IJPDS.V3I1.420

Abstract: Data linkage, the process of identifying records that refer to the same entities across databases, is a crucial component of Population Data Science. Data linkage has a history going back over fifty years with many different methods and techniques being developed in various disciplines including computer science, statistics, and health informatics. Data linkage researchers and practitioners are commonly only familiar with methods and techniques that have been developed or are used in their own discipline, and they often only follow research that is being published at venues in their own discipline. There is currently no single online resource that allows data linkage researchers and practitioners across different disciplines to exchange ideas, post questions, or advertise new publications, software, open positions, or upcoming conferences and workshops. This leads to a communication gap in the multi-disciplinary field of data linkage. We aim to address this gap with the DLforum, a public online discussion forum for data linkage. DLforum contains several discussion areas, including publication announcements, resources (software and data sets), information about upcoming conferences and workshops, job opportunities, and general questions related to data linkage. The forum includes a moderation process where all registered users can post content and reply to posts by other users. We anticipate that the number of users registered and the amount of content posted in the forum will show that such an online forum is of value to data linkage researchers and practitioners from different disciplines to effectively communicate and exchange their knowledge, and thus form an online community of practice. In this paper we describe the methods of developing the DLforum, its structure and content, and our plan on how to evaluate the forum. The DLforum is freely available at: dmm.anu.edu.au/DLforum/

Publication

Sequence Data Matching and Beyond: New Privacy-Preserving Primitives Based on Bloom Filters

Publisher: Institute of Electrical and Electronics Engineers (IEEE)

Date: 2020

DOI: 10.1109/TIFS.2020.2980835

Publication

Fairness-Aware Privacy-Preserving Record Linkage

Publisher: Springer International Publishing

Date: 2020

DOI: 10.1007/978-3-030-66172-4_1

Publication

Privacy Preserving Text Data Encoding and Topic Modelling

Publisher: IEEE

Date: 15-12-2021

DOI: 10.1109/BIGDATA52589.2021.9671552

Publication

Modern Privacy-Preserving Record Linkage Techniques: An Overview

Publisher: Institute of Electrical and Electronics Engineers (IEEE)

Date: 2021

DOI: 10.1109/TIFS.2021.3114026

Publication

Incognito: A method for obfuscating web data

Publisher: ACM Press

Date: 2018

DOI: 10.1145/3178876.3186093

Publication

Fairness and Cost Constrained Privacy-Aware Record Linkage

Publisher: Institute of Electrical and Electronics Engineers (IEEE)

Date: 2022

DOI: 10.1109/TIFS.2022.3191492

Publication

Privacy-Preserving Record Linkage

Publisher: Springer International Publishing

Date: 2018

DOI: 10.1007/978-3-319-63962-8_17-1

Publication

Privacy-Preserving Techniques for Protecting Large-Scale Data of Cyber-Physical Systems

Publisher: IEEE

Date: 12-2021

DOI: 10.1109/MSN50589.2020.00121

Related Organisations

Organisation

Commonwealth Scientific And Industrial Research Organisation

Location: Australia

Organisation

Aeturnum (Pvt) Ltd

Location: Sri Lanka

Organisation

CSIRO

Location: Australia

Organisation

University Of Colombo

Location: Sri Lanka

Organisation

Macquarie University

Location: Australia

Organisation

Australian National University

Location: Australia

Organisation

Data61

Location: Australia

Related Funding Activities

Grant

Advancing Data Integration: Privacy And Semantics For Record Linkage

Start Date: 2015

End Date: 2014

Funder: German Academic Exchange Service London

Grant

Endeavour Postgraduate Research Award

Start Date: 2011

End Date: 2014

Funder: Department of Education, Employment and Workplace Relations, Australian Government

Grant

Privacy-preserving Medical Data Linkage

Start Date: 2016

End Date: 2016

Funder: Google

Dinusha Vatsalan

Researcher