Developing automated data cleansing and validation processes for fisheries catch and effort data

Funding Activity

Website
https://www.frdc.com.au/project/2017-085

Funding Status
Active


Funded Activity Summary

During a recent national Fisheries Statistics Working Group meeting, data managers from all Australian states discussed the likely high prevalence of inaccurate or fraudulent data, both supplied by fishers and introduced through data-entry errors. Current data quality control measures differ substantially between jurisdictions, are largely undocumented, and often rely on manual checks by clerks or analysts that are labour intensive, costly and not routinely executed. Because many of these checks occur during manual data entry of paper-based records, they are likely to become obsolete as reliance on electronic reporting increases and fishers enter data directly through online portals or mobile applications.

There is a need to develop automated data cleansing and diagnostic procedures that can be applied retrospectively to large fisheries databases to detect and flag errors and outliers, and to provide subsets of reliable catch and effort data for stock assessments and other analyses. This project will contribute towards addressing these issues by developing automated processes that routinely assess newly entered fisheries catch and effort data for errors, retrospectively quantify error rates in existing data, and assess the likely influence of those errors on the outputs of stock assessment analyses. The outcomes will help improve the quality and accuracy of catch and effort data used in routine stock assessments, and in turn lead to more sustainable management of wild capture fisheries resources.
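As an illustration only, the kind of automated check described above could be sketched as follows. This is not the project's method: the field names (`catch_kg`, `effort_days`), the sample records, the rule set and the modified z-score threshold of 3.5 are all assumptions chosen for the example. The sketch flags missing values, impossible values, and outlying catch rates, then returns the subset of records with no flags.

```python
from statistics import median

# Hypothetical logbook records; field names and values are illustrative only.
records = [
    {"id": 1, "catch_kg": 120.0, "effort_days": 2.0},
    {"id": 2, "catch_kg": 95.0, "effort_days": 1.5},
    {"id": 3, "catch_kg": None, "effort_days": 1.0},    # missing catch
    {"id": 4, "catch_kg": 110.0, "effort_days": -1.0},  # impossible effort
    {"id": 5, "catch_kg": 9000.0, "effort_days": 2.0},  # implausible catch rate
    {"id": 6, "catch_kg": 130.0, "effort_days": 2.5},
    {"id": 7, "catch_kg": 105.0, "effort_days": 2.0},
]


def flag_records(rows, threshold=3.5):
    """Return {record id: [flags]} from rule checks and a robust outlier test."""
    flags = {r["id"]: [] for r in rows}
    rates = {}
    for r in rows:
        catch, effort = r["catch_kg"], r["effort_days"]
        if catch is None or effort is None:
            flags[r["id"]].append("missing")
        elif catch < 0 or effort <= 0:
            flags[r["id"]].append("anomalous")
        else:
            rates[r["id"]] = catch / effort  # catch per unit effort

    if rates:
        # Modified z-score: distance from the median scaled by the
        # median absolute deviation (MAD), which resists extreme values.
        med = median(rates.values())
        mad = median(abs(v - med) for v in rates.values())
        for rid, rate in rates.items():
            if mad > 0 and 0.6745 * abs(rate - med) / mad > threshold:
                flags[rid].append("outlier")
    return flags


flags = flag_records(records)
reliable_ids = [rid for rid, f in flags.items() if not f]
print(flags)
print("reliable subset:", reliable_ids)
```

A median-based test is used here because a single extreme record inflates the mean and standard deviation enough to hide itself; other detection approaches would be equally valid choices.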


Objectives:
1. Review existing data quality control and cleansing processes applied to fisheries catch and effort databases in all state and Commonwealth jurisdictions.
2. Develop a suite of generic algorithmic and statistical approaches to detect and flag different error types (e.g., anomalous, missing and outlying values) in fisheries catch and effort relational databases.
3. Trial the above approaches with several case-study fisheries datasets to assess the performance of different data cleansing approaches, quantify error rates and types, and assess the sensitivity of catch and effort statistics to these errors and outliers.
4. On the basis of the above findings, recommend a standard national approach for data cleansing and validation of fisheries catch and effort data.
5. Customise and integrate the generic approaches into NSW fisheries database systems to implement automated data cleansing processes.
6. Extend the results of the project to fishers and industry representatives to encourage greater accuracy in fisheries catch and effort data reporting.
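
The sensitivity assessment in objective 3 can be pictured with a minimal sketch that compares a catch-per-unit-effort (CPUE) statistic computed with and without records a cleansing step has rejected. The records, the `flagged` field and the choice of a ratio-of-totals CPUE are assumptions made for illustration, not details taken from the project.

```python
# Hypothetical records; "flagged" marks entries rejected by a cleansing step.
records = [
    {"catch_kg": 120.0, "effort_days": 2.0, "flagged": False},
    {"catch_kg": 90.0, "effort_days": 1.5, "flagged": False},
    {"catch_kg": 9000.0, "effort_days": 2.0, "flagged": True},
    {"catch_kg": 130.0, "effort_days": 2.5, "flagged": False},
]


def cpue(rows):
    """Ratio-of-totals catch per unit effort (kg per day)."""
    total_catch = sum(r["catch_kg"] for r in rows)
    total_effort = sum(r["effort_days"] for r in rows)
    return total_catch / total_effort


raw = cpue(records)
clean = cpue([r for r in records if not r["flagged"]])

# Share of records flagged, and how far the raw statistic departs
# from the cleansed one, expressed relative to the cleansed value.
error_rate = sum(r["flagged"] for r in records) / len(records)
relative_change = (raw - clean) / clean

print(f"error rate: {error_rate:.0%}")
print(f"CPUE raw: {raw:.1f} kg/day, cleansed: {clean:.1f} kg/day")
print(f"relative change: {relative_change:.1f}x the cleansed value")
```

In this made-up example a single erroneous record out of four shifts the CPUE estimate by more than an order of magnitude, which is the kind of effect a sensitivity analysis is meant to quantify.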

Funded Activity Details

Start Date: 22-12-2017

End Date: 30-06-2020

Funding Scheme: Not available

Funding Amount: $397,750.00

Funder: Fisheries Research and Development Corporation

Research Topics

ANZSRC Field of Research (FoR)

There are no FoR codes available for this funding activity

ANZSRC Socio-Economic Objective (SEO)

There are no SEO codes available for this funding activity

Other Keywords

App | Automation | Data | E-commerce & market platforms | RAC NSW | RAC SA | RAC VIC | RAC WA | Supply Chain