ARDC Research Link Australia

Publication

Analysis of erroneous data entries in paper based and electronic data collection

Publisher: Research Square Platform LLC

Date: 12-09-2019

DOI: 10.21203/RS.2.11983/V2

Abstract: Objective: Electronic data collection (EDC) has become a suitable alternative to paper-based data collection (PBDC) in biomedical research, even in resource-poor settings. During a survey in Nepal, data were collected using both systems, and data entry errors were compared between the two methods. Collected data were checked for completeness, values outside of realistic ranges, internal logic, and date variables for reasonable time frames. Variables were grouped into 5 categories, and the number of discordant entries was compared between the two systems, overall and per variable category. Results: Data from 52 variables collected from 358 participants were available. Discrepancies between the two data sets were found in 12.6% of all entries (2,352/18,616). Differences between data points were identified in 18.0% (643/3,580) of continuous variables, 15.8% of time variables (113/716), 13.0% of date variables (140/1,074), 12.0% of text variables (86/716), and 10.9% of categorical variables (1,370/12,530). Overall, 64% (1,499/2,352) of all discrepancies were due to data omissions; 76.6% (1,148/1,499) of missing entries were among categorical data. Omissions in PBDC (n=1,002) were twice as frequent as in EDC (n=497, p < .001). Data omissions, specifically among categorical variables, were identified as the greatest source of error. If designed accordingly, EDC can address this shortfall effectively.

Publication

Analysis of erroneous data entries in paper based and electronic data collection

Publisher: Research Square Platform LLC

Date: 12-09-2019

DOI: 10.21203/RS.2.11983/V3

Abstract: Objective: Electronic data collection (EDC) has become a suitable alternative to paper-based data collection (PBDC) in biomedical research, even in resource-poor settings. During a survey in Nepal, data were collected using both systems, and data entry errors were compared between the two methods. Collected data were checked for completeness, values outside of realistic ranges, internal logic, and date variables for reasonable time frames. Variables were grouped into 5 categories, and the number of discordant entries was compared between the two systems, overall and per variable category. Results: Data from 52 variables collected from 358 participants were available. Discrepancies between the two data sets were found in 12.6% of all entries (2,352/18,616). Differences between data points were identified in 18.0% (643/3,580) of continuous variables, 15.8% of time variables (113/716), 13.0% of date variables (140/1,074), 12.0% of text variables (86/716), and 10.9% of categorical variables (1,370/12,530). Overall, 64% (1,499/2,352) of all discrepancies were due to data omissions; 76.6% (1,148/1,499) of missing entries were among categorical data. Omissions in PBDC (n=1,002) were twice as frequent as in EDC (n=497, p < .001). Data omissions, specifically among categorical variables, were identified as the greatest source of error. If designed accordingly, EDC can address this shortfall effectively.

Publication

Laboratory challenges of Plasmodium species identification in Aceh Province, Indonesia, a malaria elimination setting with newly discovered P. knowlesi

Publisher: Public Library of Science (PLoS)

Date: 30-11-2018

DOI: 10.1371/JOURNAL.PNTD.0006924

Publication

Analysis of erroneous data entries in paper based and electronic data collection

Publisher: Research Square Platform LLC

Date: 11-10-2019

DOI: 10.21203/RS.2.11983/V4

Abstract: Objective: Electronic data collection (EDC) has become a suitable alternative to paper-based data collection (PBDC) in biomedical research, even in resource-poor settings. During a survey in Nepal, data were collected using both systems, and data entry errors were compared between the two methods. Collected data were checked for completeness, values outside of realistic ranges, internal logic, and date variables for reasonable time frames. Variables were grouped into 5 categories, and the number of discordant entries was compared between the two systems, overall and per variable category. Results: Data from 52 variables collected from 358 participants were available. Discrepancies between the two data sets were found in 12.6% of all entries (2,352/18,616). Differences between data points were identified in 18.0% (643/3,580) of continuous variables, 15.8% of time variables (113/716), 13.0% of date variables (140/1,074), 12.0% of text variables (86/716), and 10.9% of categorical variables (1,370/12,530). Overall, 64% (1,499/2,352) of all discrepancies were due to data omissions; 76.6% (1,148/1,499) of missing entries were among categorical data. Omissions in PBDC (n=1,002) were twice as frequent as in EDC (n=497, p < .001). Data omissions, specifically among categorical variables, were identified as the greatest source of error. If designed accordingly, EDC can address this shortfall effectively.

Publication

The antimalarial MMV688533 provides potential for single-dose cures with a high barrier to Plasmodium falciparum parasite resistance

Publisher: American Association for the Advancement of Science (AAAS)

Date: 21-07-2021

DOI: 10.1126/SCITRANSLMED.ABG6013

Abstract: We report an acylguanidine preclinical candidate with pharmacological features compatible with single low-dose malaria cure.

Publication

A molecular barcode and online tool to identify and map imported infection with Plasmodium vivax

Publisher: Cold Spring Harbor Laboratory

Date: 24-09-2019

DOI: 10.1101/776781

Abstract: Imported cases present a considerable challenge to the elimination of malaria. Traditionally, patient travel history has been used to identify imported cases, but the long-latency liver stages confound this approach in Plasmodium vivax. Molecular tools to identify and map imported cases offer a more robust approach that can be combined with drug resistance and other surveillance markers in high-throughput, population-based genotyping frameworks. Using a machine learning approach incorporating hierarchical FST (HFST) and decision tree (DT) analysis applied to 831 P. vivax genomes from 20 countries, we identified a 28-Single Nucleotide Polymorphism (SNP) barcode with high capacity to predict the country of origin. The Matthews correlation coefficient (MCC), which provides a measure of the quality of the classifications, ranging from −1 (total disagreement) to 1 (perfect prediction), exceeded 0.9 in 15 countries in cross-validation evaluations. When combined with an existing 37-SNP P. vivax barcode, the 65-SNP panel exhibits MCC scores exceeding 0.9 in 17 countries with up to 30% missing data. As a secondary objective, several genes were identified with moderate MCC scores (median MCC ranging from 0.54 to 0.68), amenable as markers for rapid testing using low-throughput genotyping approaches. A likelihood-based classifier framework was established that supports analysis of missing data and polyclonal infections. To facilitate investigator-led analyses, the likelihood framework is provided as a web-based, open-access platform (vivaxGEN-geo) to support the analysis and interpretation of data produced either at the 28-SNP core or full 65-SNP barcode. These tools can be used by malaria control programs to identify the main reservoirs of infection so that resources can be focused where they are needed most.

Publication

Mondo: Unifying diseases for the world, by the world

Publisher: Cold Spring Harbor Laboratory

Date: 16-04-2022

DOI: 10.1101/2022.04.13.22273750

Abstract: There are thousands of distinct disease entities and concepts, each of which is known by different and sometimes contradictory names. The lack of a unified system for managing these entities poses a major challenge for both machines and humans that need to harmonize information to better predict causes and treatments for disease. The Mondo Disease Ontology is an open, community-driven ontology that integrates key medical and biomedical terminologies, supporting disease data integration to improve diagnosis, treatment, and translational research. Mondo records the sources of all data and is continually updated, making it suitable for research and clinical applications that require up-to-date disease knowledge.

Publication

Analysis of erroneous data entries in paper based and electronic data collection

Publisher: Research Square Platform LLC

Date: 30-08-2019

DOI: 10.21203/RS.2.11983/V1

Abstract: Objective: Electronic data collection (EDC) has become a suitable alternative to paper-based data collection (PBDC) in biomedical research, even in resource-poor settings. During a survey in Nepal, data were collected using both systems, and data entry errors were compared between the two methods. Collected data were checked for completeness, values outside of realistic ranges, internal logic, and date variables for reasonable time frames. Variables were grouped into 5 categories, and the number of discordant entries was compared between the two systems, overall and per variable category. Results: Data from 52 variables collected from 358 participants were available. Discrepancies between the two data sets were found in 12.6% of all entries (2,352/18,616). Differences between data points were identified in 18.0% (643/3,580) of continuous variables, 15.8% of time variables (113/716), 13.0% of date variables (140/1,074), 12.0% of text variables (86/716), and 10.9% of categorical variables (1,370/12,530). Overall, 64% (1,499/2,352) of all discrepancies were due to data omissions; 76.6% (1,148/1,499) of missing entries were among categorical data. Omissions in PBDC (n=1,002) were twice as frequent as in EDC (n=497, p < .001). Data omissions, specifically among categorical variables, were identified as the greatest source of error. If designed accordingly, EDC can address this shortfall effectively.

Publication

No association between the Plasmodium vivax crt-o MS334 or In9 pvcrt polymorphisms and chloroquine failure in a clinical cohort from Malaysia

Publisher: Cold Spring Harbor Laboratory

Date: 12-2022

DOI: 10.1101/2022.11.30.22282917

Abstract: Increasing reports of resistance to a frontline malaria blood-stage treatment, chloroquine (CQ), raise concerns for the elimination of Plasmodium vivax. The absence of an effective molecular marker of CQ resistance in P. vivax greatly constrains surveillance of this emerging threat. A recent genetic cross between CQ sensitive (CQS) and CQ resistant (CQR) NIH-1993 strains of P. vivax linked a moderate CQR phenotype with two candidate markers in the P. vivax CQ resistance transporter gene (pvcrt-o): MS334 and In9 pvcrt. Longer TGAAGH motifs at MS334 were associated with CQ resistance, as were shorter motifs at the In9 pvcrt locus. In this study, high-grade CQR clinical isolates of P. vivax from Malaysia were used to investigate the association between the MS334 and In9 pvcrt variants and treatment efficacy. Amongst a total of 49 independent monoclonal P. vivax isolates assessed, high-quality MS334 and In9 pvcrt sequences could be derived from 30 (61%) and 23 (47%), respectively. Five MS334 and six In9 pvcrt alleles were observed, with allele frequencies ranging from 2 to 76% and 3 to 71%, respectively. None of the clinical isolates had the same variant as the NIH-1993 CQR strain, and none were associated with CQ treatment failure (all p > .05). Our findings suggest that the pvcrt-o MS334 and In9 pvcrt markers cannot be used universally as markers of CQ treatment efficacy in an area of high-grade CQ resistance. Further studies applying hypothesis-free genome-wide approaches are warranted to identify more effective CQR markers for P. vivax.

Publication

A molecular barcode and web-based data analysis tool to identify imported Plasmodium vivax malaria

Publisher: Springer Science and Business Media LLC

Date: 23-12-2022

DOI: 10.1038/S42003-022-04352-2

Abstract: Traditionally, patient travel history has been used to distinguish imported from autochthonous malaria cases, but the dormant liver stages of Plasmodium vivax confound this approach. Molecular tools offer an alternative method to identify and map imported cases. Using machine learning approaches incorporating hierarchical fixation index and decision tree analyses applied to 799 P. vivax genomes from 21 countries, we identified 33-SNP, 50-SNP and 55-SNP barcodes (GEO33, GEO50 and GEO55), with high capacity to predict the infection’s country of origin. The Matthews correlation coefficient (MCC) for an existing, commonly applied 38-SNP barcode (BR38) exceeded 0.80 in 62% of countries. The GEO panels outperformed BR38, with median MCCs > 0.80 in 90% of countries at GEO33, and 95% at GEO50 and GEO55. An online, open-access, likelihood-based classifier framework was established to support data analysis (vivaxGEN-geo). The SNP selection and classifier methods can be readily amended for other use cases to support malaria control programs.

Jutta Marfurt

Researcher

Publications

Analysis of erroneous data entries in paper based and electronic data collection

Analysis of erroneous data entries in paper based and electronic data collection

Laboratory challenges of Plasmodium species identification in Aceh Province, Indonesia, a malaria elimination setting with newly discovered P. knowlesi

Analysis of erroneous data entries in paper based and electronic data collection

The antimalarial MMV688533 provides potential for single-dose cures with a high barrier to Plasmodium falciparum parasite resistance

A molecular barcode and online tool to identify and map imported infection with Plasmodium vivax

Mondo: Unifying diseases for the world, by the world

Analysis of erroneous data entries in paper based and electronic data collection

No association between the Plasmodium vivax crt-o MS334 or In9 pvcrt polymorphisms and chloroquine failure in a clinical cohort from Malaysia

A molecular barcode and web-based data analysis tool to identify imported Plasmodium vivax malaria

Related Organisations

University Of Basel

Menzies School Of Health Research

Swiss Federal Office Of Public Health

Swiss Tropical And Public Health Institute (Swiss TPH)

University Hospitals Of Basel

University Of London

Related Funding Activities
