ARDC Research Link Australia

Publication

An Information Retrieval Experiment Framework for Domain Specific Applications

Publisher: ACM

Date: 27-06-2018

DOI: 10.1145/3209978.3210167

Publication

Advances in Formal Models of Search and Search Behaviour

Publisher: ACM

Date: 12-09-2016

DOI: 10.1145/2970398.2970440

Publication

Outcome-based Evaluation of Systematic Review Automation

Publisher: ACM

Date: 09-08-2023

DOI: 10.1145/3578337.3605135

Publication

The interactive PRP for diversifying document rankings

Publisher: ACM

Date: 24-07-2011

DOI: 10.1145/2009916.2010132

Publication

Representing EHRs with Temporal Tree and Sequential Pattern Mining for Similarity Computing

Publisher: Springer International Publishing

Date: 2020

DOI: 10.1007/978-3-030-65390-3_18

Publication

Automatic query expansion: A structural linguistic perspective

Publisher: Wiley

Date: 26-02-2014

DOI: 10.1002/ASI.23065

Publication

MeSH Suggester: A Library and System for MeSH Term Suggestion for Systematic Review Boolean Query Construction

Publisher: ACM

Date: 27-02-2023

DOI: 10.1145/3539597.3573025

Publication

Information retrieval as semantic inference: a Graph Inference model applied to medical search

Publisher: Springer Science and Business Media LLC

Date: 20-11-2016

DOI: 10.1007/S10791-015-9268-9

Publication

Term associations in query expansion: a structural linguistic perspective

Publisher: ACM Press

Date: 2013

DOI: 10.1145/2505515.2507852

Publication

Neural Rankers for Effective Screening Prioritisation in Medical Systematic Review Literature Search

Publisher: ACM

Date: 15-12-2022

DOI: 10.1145/3572960.3572980

Publication

Robustness of Neural Rankers to Typos: A Comparative Study

Publisher: ACM

Date: 15-12-2022

DOI: 10.1145/3572960.3572981

Publication

External knowledge and query strategies in active learning: A study in clinical information extraction

Publisher: ACM

Date: 17-10-2015

DOI: 10.1145/2806416.2806550

Publication

Pseudo-Relevance Feedback with Dense Retrievers in Pyserini

Publisher: ACM

Date: 15-12-2022

DOI: 10.1145/3572960.3572982

Publication

Top-k Retrieval Using Facility Location Analysis

Publisher: Springer Berlin Heidelberg

Date: 2012

DOI: 10.1007/978-3-642-28997-2_26

Publication

Federated Online Learning to Rank with Evolution Strategies: A Reproducibility Study

Publisher: Springer International Publishing

Date: 2021

DOI: 10.1007/978-3-030-72240-1_10

Publication

Can ChatGPT Write a Good Boolean Query for Systematic Review Literature Search?

Publisher: ACM

Date: 18-07-2023

DOI: 10.1145/3539618.3591703

Publication

Building and Using Models of Information Seeking, Search and Retrieval

Publisher: ACM

Date: 09-08-2015

DOI: 10.1145/2766462.2767874

Publication

Exploring the Representation Power of SPLADE Models

Publisher: ACM

Date: 09-08-2023

DOI: 10.1145/3578337.3605129

Publication

Ranking Health Web Pages with Relevance and Understandability

Publisher: ACM

Date: 07-07-2016

DOI: 10.1145/2911451.2914741

Publication

Active learning: A step towards automating medical concept extraction

Publisher: Oxford University Press (OUP)

Date: 07-08-2016

DOI: 10.1093/JAMIA/OCV069

Abstract: Objective: This paper presents an automatic, active learning-based system for the extraction of medical concepts from clinical free-text reports. Specifically, (1) the contribution of active learning in reducing the annotation effort and (2) the robustness of the incremental active learning framework across different selection criteria and data sets are determined. Materials and methods: The comparative performance of an active learning framework and a fully supervised approach was investigated to study how active learning reduces the annotation effort while achieving the same effectiveness as a supervised approach. Conditional random fields were used as the supervised method, and least confidence and information density as the 2 selection criteria for the active learning framework. The effect of incremental learning vs standard learning on the robustness of the models within the active learning framework with different selection criteria was also investigated. The following 2 clinical data sets were used for evaluation: the Informatics for Integrating Biology and the Bedside/Veteran Affairs (i2b2/VA) 2010 natural language processing challenge and the Shared Annotated Resources/Conference and Labs of the Evaluation Forum (ShARe/CLEF) 2013 eHealth Evaluation Lab. Results: The annotation effort saved by active learning to achieve the same effectiveness as supervised learning is up to 77%, 57%, and 46% of the total number of sequences, tokens, and concepts, respectively. Compared with the random sampling baseline, the saving is at least doubled. Conclusion: Incremental active learning is a promising approach for building effective and robust medical concept extraction models while significantly reducing the burden of manual annotation.
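As an illustration of the least confidence criterion named in the abstract, the following minimal sketch picks the unlabelled items whose most likely labelling has the lowest model probability. It is not the authors' implementation: the function names, the `pool` dictionary, and the toy probabilities are all hypothetical, and a real system would obtain the probability of the best label sequence from a trained conditional random field.

```python
def least_confidence(probabilities):
    """Uncertainty score: 1 minus the probability of the most likely labelling."""
    return 1.0 - max(probabilities)


def select_batch(pool, batch_size):
    """Return the ids of the batch_size items the model is least confident about.

    pool maps an item id to the model's probabilities over candidate labellings
    (hypothetical values here; a CRF would supply them in practice).
    """
    ranked = sorted(
        pool.items(),
        key=lambda kv: least_confidence(kv[1]),
        reverse=True,  # most uncertain first
    )
    return [item_id for item_id, _ in ranked[:batch_size]]


# Toy pool of unlabelled reports (made-up probabilities).
pool = {
    "report-1": [0.90, 0.07, 0.03],
    "report-2": [0.40, 0.35, 0.25],
    "report-3": [0.60, 0.30, 0.10],
}
selected = select_batch(pool, 2)
print(selected)
```

In an incremental setting, the selected items would be annotated, added to the training set, and the model updated before the next selection round.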

Publication

Document ranking with quantum probabilities

Publisher: Association for Computing Machinery (ACM)

Date: 07-06-2012

DOI: 10.1145/2492189.2492206

Abstract: In this thesis we investigate the use of quantum probability theory for ranking documents. Quantum probability theory is used to estimate the probability of relevance of a document given a user's query. We posit that quantum probability theory can lead to a better estimation of the probability of a document being relevant to a user's query than the common IR approach, i.e. the Probability Ranking Principle (PRP), which is based upon Kolmogorovian probability theory. Following our hypothesis, we formulate an analogy between the document retrieval scenario and a physical scenario, that of the double slit experiment. Through the analogy, we propose a novel ranking approach, the quantum probability ranking principle (qPRP). Key to our proposal is the presence of quantum interference. Mathematically, this is the statistical deviation between empirical observations and expected values predicted by the Kolmogorovian rule of additivity of probabilities of disjoint events in configurations such as that of the double slit experiment. While PRP explicitly assumes that the relevance of a document is independent of that of other documents, we suggest that qPRP implicitly models interdependent document relevance through quantum interference and thus is suited to those document ranking tasks where the independence assumption fails. Throughout the thesis, we also suggest how quantum interference can be estimated for effective document ranking. To validate our proposal and to gain more insights about approaches for document ranking, we (1) analyse PRP, qPRP and other ranking approaches, exposing the assumptions underlying their ranking criteria and formulating the conditions for the optimality of the two ranking principles, (2) empirically compare three ranking principles (i.e. PRP, interactive PRP, and qPRP) and two state-of-the-art ranking strategies in two retrieval scenarios, those of ad-hoc retrieval and diversity retrieval, (3) analytically contrast the ranking criteria of the examined approaches, exposing similarities and differences, (4) study the ranking behaviours of approaches alternative to PRP in terms of the kinematics they impose on relevant documents, i.e. by considering the extent and direction of the movements of relevant documents across the ranking recorded when comparing PRP against its alternatives. Our findings show that the effectiveness of the examined ranking approaches strongly depends upon the evaluation context. In the traditional evaluation context of ad-hoc retrieval, PRP is empirically shown to be better than or comparable to alternative ranking approaches. However, when evaluation contexts that account for interdependent document relevance are examined (i.e. when the relevance of a document is assessed also with respect to other retrieved documents, as is the case in the diversity retrieval scenario), the use of quantum probability theory and thus of qPRP is shown to improve retrieval and ranking effectiveness over the traditional PRP and alternative ranking strategies, such as Maximal Marginal Relevance, Portfolio theory, and Interactive PRP. This work represents a significant step forward regarding the use of quantum theory in information retrieval. It demonstrates that the application of quantum theory to problems within information retrieval can lead to improvements both in modelling power and retrieval effectiveness, allowing the construction of models that capture the complexity of information retrieval situations. Furthermore, the thesis opens up a number of lines of future research. These include investigating estimations and approximations of quantum interference in qPRP, exploiting complex numbers for the representation of documents and queries, and applying the concepts underlying qPRP to tasks other than document ranking. This dissertation was completed at the School of Computing Science, University of Glasgow under the supervision of Dr. Leif Azzopardi and Prof. Keith van Rijsbergen. Prof. Norbert Fuhr, Dr. Iadh Ounis, and Dr. John O'Donnell served as dissertation committee members. For the full dissertation, visit: theses.gla.ac.uk/3463.
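The difference between PRP and an interference-aware ranking can be illustrated with a small greedy ranker. This is a sketch under stated assumptions, not the thesis's implementation: the abstract gives no formula, so the scoring rule used here (relevance probability plus a sum of interference terms with already ranked documents) and the estimate of interference as -2·sqrt(P(d)·P(d'))·sim(d, d') are assumptions, and the names `qprp_rank`, `relevance`, and `sims` are hypothetical.

```python
import math


def qprp_rank(relevance, similarity):
    """Greedy ranking: at each step pick the document that maximises
    P(d) + sum of interference terms with the documents ranked so far.

    Interference is approximated (an assumption of this sketch) as
    -2 * sqrt(P(d) * P(d')) * similarity(d, d').
    """
    remaining = set(relevance)
    ranking = []

    def score(doc):
        interference = sum(
            -2.0 * math.sqrt(relevance[doc] * relevance[prev]) * similarity(doc, prev)
            for prev in ranking
        )
        return relevance[doc] + interference

    while remaining:
        # sorted() makes tie-breaking deterministic (alphabetical order).
        best = max(sorted(remaining), key=score)
        ranking.append(best)
        remaining.remove(best)
    return ranking


# Toy data (made up): "a" and "b" are near-duplicates, "c" covers another aspect.
relevance = {"a": 0.9, "b": 0.8, "c": 0.5}
sims = {
    frozenset(("a", "b")): 0.9,
    frozenset(("a", "c")): 0.1,
    frozenset(("b", "c")): 0.1,
}


def similarity(d1, d2):
    return sims[frozenset((d1, d2))]


prp_order = sorted(relevance, key=relevance.get, reverse=True)
qprp_order = qprp_rank(relevance, similarity)
print(prp_order, qprp_order)
```

With these toy values, sorting by relevance alone places the near-duplicate "b" second, whereas the interference-aware ranker promotes "c" above it, which is the kind of behaviour a diversity retrieval task rewards.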

Publication

Generalizing Translation Models in the Probabilistic Relevance Framework

Publisher: ACM

Date: 24-10-2016

DOI: 10.1145/2983323.2983833

Publication

Estimating interference in the QPRP for subtopic retrieval

Publisher: ACM

Date: 19-07-2010

DOI: 10.1145/1835449.1835593

Publication

Fixed budget pooling strategies based on fusion methods

Publisher: ACM

Date: 03-04-2017

DOI: 10.1145/3019612.3019692

Publication

Integrating the Framing of Clinical Questions via PICO into the Retrieval of Medical Literature for Systematic Reviews

Publisher: ACM

Date: 06-11-2017

DOI: 10.1145/3132847.3133080

Publication

Beyond CO2 Emissions: The Overlooked Impact of Water Consumption of Information Retrieval Models

Publisher: ACM

Date: 09-08-2023

DOI: 10.1145/3578337.3605121

Publication

AgAsk: A Conversational Search Agent for Answering Agricultural Questions

Publisher: ACM

Date: 27-02-2023

DOI: 10.1145/3539597.3573034

Publication

Quantum haystacks revisited

Publisher: Springer International Publishing

Date: 2015

DOI: 10.1007/978-3-319-28940-3

Publication

On the use of Complex Numbers in Quantum Models for Information Retrieval

Publisher: Springer Berlin Heidelberg

Date: 2011

DOI: 10.1007/978-3-642-23318-0_36

Publication

Using Emotion to Diversify Document Rankings

Publisher: Springer Berlin Heidelberg

Date: 2011

DOI: 10.1007/978-3-642-23318-0_34

Publication

A task completion framework to support single-interaction IR research

Publisher: Emerald

Date: 24-01-2018

DOI: 10.1108/JD-09-2017-0128

Abstract: A conceptual model describes important factors within a system and how they relate to one another. Such models are important because they help to identify system changes that can yield the greatest improvement. Within information retrieval (IR), most research is directed towards multi-document retrieval and a multi-interaction IR user scenario. There are few, if any, IR conceptual models supporting minimal or single-interaction IR (siIR) user scenarios; however, the need for siIR systems is growing rapidly. The purpose of this paper is to take the first step towards constructing a task-oriented conceptual model and experimental framework to support siIR research. A first principles approach is employed to develop a task-oriented conceptual model, called bridging information retrieval (BIR). This model is contrasted with the concept of relevance, a central factor within IR research. BIR introduces the central concept of bridging information (BI) as the objective of IR systems. BI is the additional information a user requires to complete a task, beyond their innate knowledge. The relationship between BI and relevance is determined. The theoretical basis of BIR is derived axiomatically; however, the resulting system evaluation model is speculative. The proposed operational framework offers researchers a systematic approach to designing and evaluating siIR systems. This work contributes a novel task-oriented IR conceptual model and evaluation framework, both centred around the concept of BI for siIR. It also contributes a novel search task classification method.

Publication

Dependency-aware Self-training for Entity Alignment

Publisher: ACM

Date: 27-02-2023

DOI: 10.1145/3539597.3570370

Publication

A Query-Basis Approach to Parametrizing Novelty-Biased Cumulative Gain

Publisher: Springer Berlin Heidelberg

Date: 2011

DOI: 10.1007/978-3-642-23318-0_32

Publication

Consumer Health Search on the Web: Study of Web Page Understandability and Its Integration in Ranking Algorithms (Preprint)

Publisher: JMIR Publications Inc.

Date: 07-05-2018

DOI: 10.2196/PREPRINTS.10986

Abstract: Understandability plays a key role in ensuring that people accessing health information are capable of gaining insights that can assist them with their health concerns and choices. The access to unclear or misleading information has been shown to negatively impact the health decisions of the general public. The aim of this study was to investigate methods to estimate the understandability of health Web pages and use these to improve the retrieval of information for people seeking health advice on the Web. Our investigation considered methods to automatically estimate the understandability of health information in Web pages, and it provided a thorough evaluation of these methods using human assessments as well as an analysis of preprocessing factors affecting understandability estimations and associated pitfalls. Furthermore, lessons learned for estimating Web page understandability were applied to the construction of retrieval methods, with specific attention to retrieving information understandable by the general public. We found that machine learning techniques were more suitable to estimate health Web page understandability than traditional readability formulae, which are often used as guidelines and benchmark by health information providers on the Web (larger difference found for Pearson correlation of .602 using gradient boosting regressor compared with .438 using Simple Measure of Gobbledygook Index with the Conference and Labs of the Evaluation Forum eHealth 2015 collection). The findings reported in this paper are important for specialized search services tailored to support the general public in seeking health advice on the Web, as they document and empirically validate state-of-the-art techniques and settings for this domain application.

Publication

Task-oriented search for evidence-based medicine

Publisher: Springer Science and Business Media LLC

Date: 03-2017

DOI: 10.1007/S00799-017-0209-7

Publication

Improving Query Representations for Dense Retrieval with Pseudo Relevance Feedback: A Reproducibility Study

Publisher: Springer International Publishing

Date: 2022

DOI: 10.1007/978-3-030-99736-6_40

Publication

An Analysis of Untargeted Poisoning Attack and Defense Methods for Federated Online Learning to Rank Systems

Publisher: ACM

Date: 09-08-2023

DOI: 10.1145/3578337.3605117

Publication

Balanced Topic Aware Sampling for Effective Dense Retriever: A Reproducibility Study

Publisher: ACM

Date: 18-07-2023

DOI: 10.1145/3539618.3591915

Publication

Generating Better Queries for Systematic Reviews

Publisher: ACM

Date: 27-06-2018

DOI: 10.1145/3209978.3210020

Publication

Crowdsourcing interactions: using crowdsourcing for evaluating interactive information retrieval systems

Publisher: Springer Science and Business Media LLC

Date: 13-07-2012

DOI: 10.1007/S10791-012-9206-Z

Publication

Assessors Agreement: A Case Study Across Assessor Type, Payment Levels, Query Variations and Relevance Dimensions

Publisher: Springer International Publishing

Date: 2016

DOI: 10.1007/978-3-319-44564-9_4

Publication

Combining Word Semantics within Complex Hilbert Space for Information Retrieval

Publisher: Springer Berlin Heidelberg

Date: 2014

DOI: 10.1007/978-3-642-54943-4_14

Publication

Impact of a Search Engine on Clinical Decisions Under Time and System Effectiveness Constraints: Research Protocol (Preprint)

Publisher: JMIR Publications Inc.

Date: 13-11-2018

DOI: 10.2196/PREPRINTS.12803

Abstract: Many clinical questions arise during patient encounters that clinicians are unable to answer. An evidence-based medicine approach expects that clinicians will seek and apply the best available evidence to answer clinical questions. One commonly used source of such evidence is scientific literature, such as that available through MEDLINE and PubMed. Clinicians report that 2 key reasons why they do not use search systems to answer questions are that it takes too much time and that they do not expect to find a definitive answer. So, the question remains about how effectively scientific literature search systems support time-pressured clinicians in making better clinical decisions. The results of this study are important because they can help clinicians and health care organizations to better assess their needs with respect to clinical decision support (CDS) systems and evidence sources. The results and data captured will contribute a significant data collection to inform the design of future CDS systems to better meet the needs of time-pressured, practicing clinicians. The purpose of this study is to understand the impact of using a scientific medical literature search system on clinical decision making. Furthermore, to understand the impact of realistic time pressures on clinicians, we vary the search time available to find clinical answers. Finally, we assess the impact of improvements in search system effectiveness on the same clinical decisions. In this study, 96 practicing clinicians and final year medical students are presented with 16 clinical questions which they must answer without access to any external resource. The same questions are then presented again to the clinicians; however, in this part of the study, the clinicians can use a scientific literature search engine to find evidence to support their answers. The time pressures of practicing clinicians are simulated by limiting answer time to one of 3, 6, or 9 min per question. The correct answer rate is reported both before and after search to assess the impact of the search system and the time constraint. In addition, 2 search systems that use the same user interface, but which vary widely in their search effectiveness, are employed so that the impact of changes in search system effectiveness on clinical decision making can also be assessed. Recruiting began for the study in June 2018. As of April 4, 2019, there were 69 participants enrolled. The study is expected to close by May 30, 2019, with results to be published in July. All data collected in this study will be made available at the University of Queensland’s UQ eSpace public data repository. ERR1-10.2196/12803

Publication

Graph-based concept weighting for medical information retrieval

Publisher: ACM

Date: 05-12-2012

DOI: 10.1145/2407085.2407096

Publication

Fixed-Cost Pooling Strategies

Publisher: Institute of Electrical and Electronics Engineers (IEEE)

Date: 04-2021

DOI: 10.1109/TKDE.2019.2947049

Publication

Diagnose This If You Can

Publisher: Springer International Publishing

Date: 2015

DOI: 10.1007/978-3-319-16354-3_62

Publication

Overview of the ShARe/CLEF eHealth Evaluation Lab 2014

Publisher: Springer International Publishing

Date: 2014

DOI: 10.1007/978-3-319-11382-1_17

Publication

Payoffs and pitfalls in using knowledge-bases for consumer health search

Publisher: Springer Science and Business Media LLC

Date: 08-11-2019

DOI: 10.1007/S10791-018-9344-Z

Publication

Overview of the CLEF eHealth Evaluation Lab 2016

Publisher: Springer International Publishing

Date: 2016

DOI: 10.1007/978-3-319-44564-9_24

Publication

Causality Discovery with Domain Knowledge for Drug-Drug Interactions Discovery

Publisher: Springer International Publishing

Date: 2019

DOI: 10.1007/978-3-030-35231-8_46

Publication

Choices in Knowledge-Base Retrieval for Consumer Health Search

Publisher: Springer International Publishing

Date: 2018

DOI: 10.1007/978-3-319-76941-7_6

Publication

Recursive module extraction using Louvain and PageRank

Publisher: F1000 Research Ltd

Date: 14-08-2018

DOI: 10.12688/F1000RESEARCH.15845.1

Abstract: Biological networks are highly modular and contain a large number of clusters, which are often associated with a specific biological function or disease. Identifying these clusters, or modules, is therefore valuable, but it is not trivial. In this article we propose a recursive method based on the Louvain algorithm for community detection and the PageRank algorithm for authoritativeness weighting in networks. PageRank is used to initialise the weights of nodes in the biological network; the Louvain algorithm with the Newman-Girvan criterion for modularity is then applied to the network to identify modules. Any identified module with more than k nodes is further processed by recursively applying PageRank and Louvain, until no module contains more than k nodes (where k is a parameter of the method, no greater than 100). This method is evaluated on a heterogeneous set of six biological networks from the Disease Module Identification DREAM Challenge. Empirical findings suggest that the method is effective in identifying a large number of significant modules, although with substantial variability across restarts of the method.

Publication

Is the unigram relevance model term independent? Classifying term dependencies in query expansion

Publisher: ACM

Date: 05-12-2012

DOI: 10.1145/2407085.2407102

Publication

Clinical information extraction using small data: An active learning approach based on sequence representations and word embeddings

Publisher: Wiley

Date: 18-09-2017

DOI: 10.1002/ASI.23936

Publication

An Analysis of Ranking Principles and Retrieval Strategies

Publisher: Springer Berlin Heidelberg

Date: 2011

DOI: 10.1007/978-3-642-23318-0_15

Publication

Exploiting inference from semantic annotations for information retrieval: Reflections from medical IR

Publisher: ACM

Date: 07-11-2014

DOI: 10.1145/2663712.2666197

Publication

A comprehensive analysis of parameter settings for novelty-biased cumulative gain

Publisher: ACM

Date: 29-10-2012

DOI: 10.1145/2396761.2398550

Publication

A test collection for evaluating retrieval of studies for inclusion in systematic reviews

Publisher: ACM

Date: 07-08-2017

DOI: 10.1145/3077136.3080707

Publication

Has portfolio theory got any principles?

Publisher: ACM

Date: 19-07-2010

DOI: 10.1145/1835449.1835600

Publication

On the Volatility of Commercial Search Engines and its Impact on Information Retrieval Research

Publisher: ACM

Date: 27-06-2018

DOI: 10.1145/3209978.3210088

Publication

When Two Is Better Than One: A Study of Ranking Paradigms and Their Integrations for Subtopic Retrieval

Publisher: Springer Berlin Heidelberg

Date: 2010

DOI: 10.1007/978-3-642-17187-1_15

Publication

A Formalization of Logical Imaging for Information Retrieval Using Quantum Theory

Publisher: IEEE

Date: 09-2008

DOI: 10.1109/DEXA.2008.69

Publication

Deep Query Likelihood Model for Information Retrieval

Publisher: Springer International Publishing

Date: 2021

DOI: 10.1007/978-3-030-72240-1_49

Publication

Quality Matters: Understanding the Impact of Incomplete Data on Visualization Recommendation

Publisher: Springer International Publishing

Date: 2020

DOI: 10.1007/978-3-030-59003-1_8

Publication

Query Variation Performance Prediction for Systematic Reviews

Publisher: ACM

Date: 27-06-2018

DOI: 10.1145/3209978.3210078

Publication

Efficient Diversification for Recommending Aggregate Data Visualizations

Publisher: Institute of Electrical and Electronics Engineers (IEEE)

Date: 2023

DOI: 10.1109/ACCESS.2023.3283457

Publication

You Can Teach an Old Dog New Tricks: Rank Fusion applied to Coordination Level Matching for Ranking in Systematic Reviews

Publisher: Springer International Publishing

Date: 2020

DOI: 10.1007/978-3-030-45439-5_27

Publication

Extracting Cancer Mortality Statistics from Death Certificates: A Hybrid Machine Learning and Rule-based Approach for Common and Rare Cancers

Publisher: Elsevier BV

Date: 07-2018

DOI: 10.1016/J.ARTMED.2018.04.011

Abstract: Death certificates are an invaluable source of cancer mortality statistics. However, this value can only be realised if accurate, quantitative data can be extracted from certificates, an aim hampered by both the volume and variable quality of certificates written in natural language. This paper proposes an automatic classification system for identifying all cancer related causes of death from death certificates. Detailed features, including terms, n-grams and SNOMED CT concepts, were extracted from a collection of 447,336 death certificates. The features were used as input to two different classification sub-systems: a machine learning sub-system using Support Vector Machines (SVMs) and a rule-based sub-system. A fusion sub-system then combines the results from SVMs and rules into a single final classification. A held-out test set was used to evaluate the effectiveness of the classifiers according to precision, recall and F-measure. The system was highly effective at determining the type of cancers for both common cancers (F-measure of 0.85) and rare cancers (F-measure of 0.7). In general, rules outperformed SVMs; however, the fusion method that combined the two was the most effective. The system proposed in this study provides automatic identification and characterisation of cancers from large collections of free-text death certificates. This allows organisations such as Cancer Registries to monitor and report on cancer mortality in a timely and accurate manner. In addition, the methods and findings are generally applicable beyond cancer classification and to other sources of medical text besides death certificates.

Publication

Counterfactual Online Learning to Rank

Publisher: Springer International Publishing

Date: 2020

DOI: 10.1007/978-3-030-45439-5_28

Publication

An Analysis of the Cost and Benefit of Search Interactions

Publisher: ACM

Date: 12-09-2016

DOI: 10.1145/2970398.2970412

Publication

A Computational Approach for Objectively Derived Systematic Review Search Strategies

Publisher: Springer International Publishing

Date: 2020

DOI: 10.1007/978-3-030-45439-5_26

Publication

Revisiting Sub–topic Retrieval in the ImageCLEF 2009 Photo Retrieval Task

Publisher: Springer Berlin Heidelberg

Date: 2010

DOI: 10.1007/978-3-642-15181-1_15

Publication

Overview of the CLEF eHealth Evaluation Lab 2018

Publisher: Springer International Publishing

Date: 2018

DOI: 10.1007/978-3-319-98932-7_26

Publication

SIGIR 2017 Tutorial on Health Search (HS2017)

Publisher: ACM

Date: 07-08-2017

DOI: 10.1145/3077136.3082061

Publication

Effective User Relevance Feedback for Image Retrieval with Image Signatures

Publisher: ACM

Date: 05-12-2016

DOI: 10.1145/3015022.3015034

Publication

Economic Models of Interaction

Publisher: Oxford University Press

Date: 22-03-2018

DOI: 10.1093/OSO/9780198799603.003.0012

Abstract: This chapter provides a tutorial on how economics can be used to model the interaction between users and systems. Economic theory provides an intuitive and natural way to model Human-Computer Interaction which enables the prediction and explanation of user behaviour. A central tenet of the approach is the utility maximisation paradigm where it is assumed that users seek to maximise their profit/benefit subject to budget and other constraints when interacting with a system. By using such models it is possible to reason about user behaviour and make predictions about how changes to the interface or the user's interactions will affect performance and behaviour. In this chapter, we describe and develop several economic models relating to how users search for information. While the examples are specific to Information Seeking and Retrieval, the techniques employed can be applied more generally to other human-computer interaction scenarios. Therefore, the goal of this chapter is to provide an introduction and overview of how to build economic models of human-computer interaction that generate testable hypotheses regarding user behaviour which can be used to guide design and inform experimentation.

Publication

Seed-Driven Document Ranking for Systematic Reviews: A Reproducibility Study

Publisher: Springer International Publishing

Date: 2022

DOI: 10.1007/978-3-030-99736-6_46

Publication

CLEF 2017 eHealth Evaluation Lab Overview

Publisher: Springer International Publishing

Date: 2017

DOI: 10.1007/978-3-319-65813-1_26

Publication

Using the Quantum Probability Ranking Principle to Rank Interdependent Documents

Publisher: Springer Berlin Heidelberg

Date: 2010

DOI: 10.1007/978-3-642-12275-0_32

Publication

Diagnosis Ranking with Knowledge Graph Convolutional Networks

Publisher: Springer International Publishing

Date: 2021

DOI: 10.1007/978-3-030-72113-8_24

Publication

Combining Word Semantics within Complex Hilbert Space for Information Retrieval

Publisher: Springer Berlin Heidelberg

Date: 2014

DOI: 10.1007/978-3-662-45912-6_14

Publication

University of Glasgow at ImageCLEFPhoto 2009: Optimising Similarity and Diversity in Image Retrieval

Publisher: Springer Berlin Heidelberg

Date: 2010

DOI: 10.1007/978-3-642-15751-6_14

Publication

Exploiting Medical Hierarchies for Concept-based Information Retrieval

Publisher: ACM

Date: 05-12-2012

DOI: 10.1145/2407085.2407100

Publication

CLASSIFICATION OF CANCER-RELATED DEATH CERTIFICATES USING MACHINE LEARNING

Publisher: Scitechnol Biosoft Pvt. Ltd.

Date: 2013

DOI: 10.21767/AMJ.2013.1654

Publication

Approximate Nearest-Neighbour Search with Inverted Signature Slice Lists

Publisher: Springer International Publishing

Date: 2015

DOI: 10.1007/978-3-319-16354-3_16

Publication

Augmenting Passage Representations with Query Generation for Enhanced Cross-Lingual Dense Retrieval

Publisher: ACM

Date: 18-07-2023

DOI: 10.1145/3539618.3591952

Publication

SIGIR 2018 Tutorial on Health Search (HS2018)

Publisher: ACM

Date: 27-06-2018

DOI: 10.1145/3209978.3210188

Publication

Oyster: A Tool for Fine-Grained Ontological Annotations in Free-Text

Publisher: Springer International Publishing

Date: 2016

DOI: 10.1007/978-3-319-28940-3_39

Publication

AUTOMATED CLASSIFICATION OF LIMB FRACTURES FROM FREE-TEXT RADIOLOGY REPORTS USING A CLINICIAN-INFORMED GAZETTEER METHODOLOGY

Publisher: Scitechnol Biosoft Pvt. Ltd.

Date: 2013

DOI: 10.21767/AMJ.2013.1651

Publication

Boosting Titles does not Generally Improve Retrieval Effectiveness

Publisher: ACM

Date: 05-12-2016

DOI: 10.1145/3015022.3015028

Publication

Impact of water-quality conditions in source reservoirs on the optimal operation of a regional multiquality water-distribution system

Publisher: American Society of Civil Engineers (ASCE)

Date: 10-2015

DOI: 10.1061/(ASCE)WR.1943-5452.0000523

Publication

Back to the Roots: Mean-Variance Analysis of Relevance Estimations

Publisher: Springer Berlin Heidelberg

Date: 2011

DOI: 10.1007/978-3-642-20161-5_78

Publication

A Signature Approach to Patent Classification

Publisher: Springer International Publishing

Date: 2015

DOI: 10.1007/978-3-319-28940-3_35

Publication

Fixed-Cost Pooling Strategies Based on IR Evaluation Measures

Publisher: Springer International Publishing

Date: 2017

DOI: 10.1007/978-3-319-56608-5_28

Publication

An Analysis of Theories of Search and Search Behavior

Publisher: No publisher found

Date: 2015

DOI: 10.1145/2808194.2809447

Publication

The Task: Distinguishing Tasks and Sessions in Legal Information Retrieval

Publisher: ACM

Date: 15-12-2022

DOI: 10.1145/3572960.3572983

Publication

Revisiting logical imaging for information retrieval

Publisher: ACM

Date: 19-07-2009

DOI: 10.1145/1571941.1572118

Publication

Pseudo Relevance Feedback with Deep Language Models and Dense Retrievers: Successes and Pitfalls

Publisher: Association for Computing Machinery (ACM)

Date: 10-04-2023

DOI: 10.1145/3570724

Abstract: Pseudo Relevance Feedback (PRF) is known to improve the effectiveness of bag-of-words retrievers. At the same time, deep language models have been shown to outperform traditional bag-of-words rerankers. However, it is unclear how to integrate PRF directly with emergent deep language models. This article addresses this gap by investigating methods for integrating PRF signals with rerankers and dense retrievers based on deep language models. We consider text-based, vector-based and hybrid PRF approaches and investigate different ways of combining and scoring relevance signals. An extensive empirical evaluation was conducted across four different datasets and two task settings (retrieval and ranking). Text-based PRF results show that the use of PRF had a mixed effect on deep rerankers across different datasets. We found that the best effectiveness was achieved when (i) directly concatenating each PRF passage with the query, searching with the new set of queries, and then aggregating the scores, and (ii) using Borda to aggregate scores from PRF runs. Vector-based PRF results show that the use of PRF enhanced the effectiveness of deep rerankers and dense retrievers over several evaluation metrics. We found that higher effectiveness was achieved when (i) the query retains either the majority or the same weight within the PRF mechanism, and (ii) a shallower PRF signal (i.e., a smaller number of top-ranked passages) was employed, rather than a deeper signal. Our vector-based PRF method is computationally efficient; thus, it represents a general PRF method that others can use with deep rerankers and dense retrievers.
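
To make the vector-based finding concrete, here is a minimal sketch of one common way to fold PRF into a dense query: interpolate the query embedding with the centroid of the top-ranked passage embeddings. This is an illustration under assumptions, not necessarily the article's exact formulation. The function name `prf_query_vector` and the parameters `prf_depth` and `query_weight` are hypothetical. The defaults mirror the abstract's two observations: the query keeps at least half of the weight, and the feedback signal is shallow.

```python
from typing import List, Sequence


def prf_query_vector(
    query_vec: Sequence[float],
    passage_vecs: List[Sequence[float]],
    prf_depth: int = 3,
    query_weight: float = 0.5,
) -> List[float]:
    """Interpolate a query embedding with the centroid of top-ranked passages.

    Hypothetical helper: passage_vecs is assumed to be ordered by the
    first-pass ranking, best passage first.
    """
    if not 0.0 <= query_weight <= 1.0:
        raise ValueError("query_weight must be in [0, 1]")

    # Shallow feedback: keep only the first prf_depth passages.
    feedback = passage_vecs[:prf_depth]
    if not feedback:
        return list(query_vec)

    # Centroid (component-wise mean) of the feedback passage embeddings.
    dim = len(query_vec)
    centroid = [sum(p[i] for p in feedback) / len(feedback) for i in range(dim)]

    # Weighted combination; query_weight >= 0.5 keeps the query dominant or equal.
    return [
        query_weight * q + (1.0 - query_weight) * c
        for q, c in zip(query_vec, centroid)
    ]
```

The resulting vector would then be used for a second retrieval pass against the same index. Because the update is simple arithmetic on vectors that are already computed, it adds little cost over the first pass.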

Publication

Active learning reduces annotation time for clinical concept extraction

Publisher: Elsevier BV

Date: 10-2017

DOI: 10.1016/J.IJMEDINF.2017.08.001

Abstract: To investigate: (1) the annotation time savings by various active learning query strategies compared to supervised learning and a random sampling baseline, and (2) the benefits of active learning-assisted pre-annotations in accelerating the manual annotation process compared to de novo annotation. There are 73 and 120 discharge summary reports provided by Beth Israel institute in the train and test sets of the concept extraction task in the i2b2/VA 2010 challenge, respectively. The 73 reports were used in user study experiments for manual annotation. First, all sequences within the 73 reports were manually annotated from scratch. Next, active learning models were built to generate pre-annotations for the sequences selected by a query strategy. The annotation/reviewing time per sequence was recorded. The 120 test reports were used to measure the effectiveness of the active learning models. When annotating from scratch, active learning reduced the annotation time up to 35% and 28% compared to a fully supervised approach and a random sampling baseline, respectively. Reviewing active learning-assisted pre-annotations resulted in 20% further reduction of the annotation time when compared to de novo annotation. The number of concepts that require manual annotation is a good indicator of the annotation time for various active learning approaches as demonstrated by high correlation between time rate and concept annotation rate. Active learning has a key role in reducing the time required to manually annotate domain concepts from clinical free text, either when annotating from scratch or reviewing active learning-assisted pre-annotations.

Publication

The Influence of Pre-processing on the Estimation of Readability of Web Documents

Publisher: ACM

Date: 17-10-2015

DOI: 10.1145/2806416.2806613

Publication

ADCS reaches adulthood: an analysis of the conference and its community over the last eighteen years

Publisher: ACM

Date: 05-12-2013

DOI: 10.1145/2537734.2537741

Publication

Efficient Top-K Retrieval with Signatures

Publisher: ACM

Date: 05-12-2013

DOI: 10.1145/2537734.2537742

Publication

The Quantum Probability Ranking Principle for Information Retrieval

Publisher: Springer Berlin Heidelberg

Date: 2009

DOI: 10.1007/978-3-642-04417-5_21

Publication

Impact of a Search Engine on Clinical Decisions Under Time and System Effectiveness Constraints: Research Protocol

Publisher: JMIR Publications Inc.

Date: 28-05-2019

DOI: 10.2196/12803

Publication

Consumer Health Search on the Web: Study of Web Page Understandability and Its Integration in Ranking Algorithms

Publisher: JMIR Publications Inc.

Date: 30-01-2019

DOI: 10.2196/10986

Publication

An evaluation of corpus-driven measures of medical concept similarity for information retrieval

Publisher: ACM

Date: 29-10-2012

DOI: 10.1145/2396761.2398661

Publication

Reinforcement online learning to rank with unbiased reward shaping

Publisher: Springer Science and Business Media LLC

Date: 04-08-2022

DOI: 10.1007/S10791-022-09413-Y

Abstract: Online learning to rank (OLTR) aims to learn a ranker directly from implicit feedback derived from users’ interactions, such as clicks. Clicks, however, are a biased signal: specifically, top-ranked documents are likely to attract more clicks than documents further down the ranking (position bias). In this paper, we propose a novel learning algorithm for OLTR that uses reinforcement learning to optimize rankers: Reinforcement Online Learning to Rank (ROLTR). In ROLTR, the gradients of the ranker are estimated based on the rewards assigned to clicked and unclicked documents. In order to de-bias the users’ position bias contained in the reward signals, we introduce unbiased reward shaping functions that exploit inverse propensity scoring for clicked and unclicked documents. The fact that our method can also model unclicked documents provides a further advantage in that fewer user interactions are required to effectively train a ranker, thus providing gains in efficiency. Empirical evaluation on standard OLTR datasets shows that ROLTR achieves state-of-the-art performance, and provides significantly better user experience than other OLTR approaches. To facilitate the reproducibility of our experiments, we make all experiment code available at elab/OLTR .
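
The idea behind inverse propensity scoring can be sketched in a few lines. The example below is a generic illustration under assumptions, not the ROLTR reward functions themselves: it assumes a position-based click model in which a document at rank k is examined with probability (1/k)^eta and clicked only if it is examined and relevant. The function names and the `eta` parameter are hypothetical. It covers clicked documents only; the article's treatment of unclicked documents is not reproduced here.

```python
def examination_propensity(rank: int, eta: float = 1.0) -> float:
    """Assumed probability that a user examines the document at a 1-based rank."""
    return (1.0 / rank) ** eta


def ips_click_reward(clicked: bool, rank: int, eta: float = 1.0) -> float:
    """Reward for one document: a click is up-weighted by 1 / propensity."""
    if not clicked:
        return 0.0
    return 1.0 / examination_propensity(rank, eta)


def expected_ips_reward(relevance_prob: float, rank: int, eta: float = 1.0) -> float:
    """Expected reward under the assumed click model.

    P(click) = propensity * relevance_prob, so the propensity cancels and the
    expectation equals relevance_prob regardless of rank.
    """
    click_prob = examination_propensity(rank, eta) * relevance_prob
    return click_prob * ips_click_reward(True, rank, eta)
```

Under this assumed model, a document with relevance probability 0.8 has the same expected reward at rank 1 and at rank 5, which is the sense in which the weighting removes position bias.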

Publication

Discriminative Features Generation for Mortality Prediction in ICU

Publisher: Springer International Publishing

Date: 2020

DOI: 10.1007/978-3-030-65390-3_25

Publication

Understandability Biased Evaluation for Information Retrieval

Publisher: Springer International Publishing

Date: 2016

DOI: 10.1007/978-3-319-30671-1_21

Publication

The Impact of Fixed-Cost Pooling Strategies on Test Collection Bias

Publisher: ACM

Date: 12-09-2016

DOI: 10.1145/2970398.2970429

Publication

Causality Discovery Based on Combined Causes and Multiple Causes in Drug-Drug Interaction

Publisher: Springer Nature Switzerland

Date: 2022

DOI: 10.1007/978-3-031-22064-7_5

Publication

Document Timespan Normalisation and Understanding Temporality for Clinical Records Search

Publisher: ACM

Date: 26-11-2014

DOI: 10.1145/2682862.2682879

Guido Zuccon

Researcher

Publications

An Information Retrieval Experiment Framework for Domain Specific Applications

Advances in Formal Models of Search and Search Behaviour

Outcome-based Evaluation of Systematic Review Automation

The interactive PRP for diversifying document rankings

Representing EHRs with Temporal Tree and Sequential Pattern Mining for Similarity Computing

Automatic query expansion: A structural linguistic perspective

MeSH Suggester: A Library and System for MeSH Term Suggestion for Systematic Review Boolean Query Construction

Information retrieval as semantic inference: a Graph Inference model applied to medical search

Term associations in query expansion: a structural linguistic perspective

Neural Rankers for Effective Screening Prioritisation in Medical Systematic Review Literature Search

Robustness of Neural Rankers to Typos: A Comparative Study

External knowledge and query strategies in active learning: A study in clinical information extraction

Pseudo-Relevance Feedback with Dense Retrievers in Pyserini

Top-k Retrieval Using Facility Location Analysis

Federated Online Learning to Rank with Evolution Strategies: A Reproducibility Study

Can ChatGPT Write a Good Boolean Query for Systematic Review Literature Search?

Building and Using Models of Information Seeking, Search and Retrieval

Exploring the Representation Power of SPLADE Models

Ranking Health Web Pages with Relevance and Understandability

Active learning: A step towards automating medical concept extraction

Document ranking with quantum probabilities

Generalizing Translation Models in the Probabilistic Relevance Framework

Estimating interference in the QPRP for subtopic retrieval

Fixed budget pooling strategies based on fusion methods

Integrating the Framing of Clinical Questions via PICO into the Retrieval of Medical Literature for Systematic Reviews

Beyond CO2 Emissions: The Overlooked Impact of Water Consumption of Information Retrieval Models

AgAsk: A Conversational Search Agent for Answering Agricultural Questions

Quantum haystacks revisited

On the use of Complex Numbers in Quantum Models for Information Retrieval

Using Emotion to Diversify Document Rankings

A task completion framework to support single-interaction IR research

Dependency-aware Self-training for Entity Alignment

A Query-Basis Approach to Parametrizing Novelty-Biased Cumulative Gain

Consumer Health Search on the Web: Study of Web Page Understandability and Its Integration in Ranking Algorithms (Preprint)

Task-oriented search for evidence-based medicine

Improving Query Representations for Dense Retrieval with Pseudo Relevance Feedback: A Reproducibility Study

An Analysis of Untargeted Poisoning Attack and Defense Methods for Federated Online Learning to Rank Systems

Balanced Topic Aware Sampling for Effective Dense Retriever: A Reproducibility Study

Generating Better Queries for Systematic Reviews

Crowdsourcing interactions: using crowdsourcing for evaluating interactive information retrieval systems

Assessors Agreement: A Case Study Across Assessor Type, Payment Levels, Query Variations and Relevance Dimensions

Combining Word Semantics within Complex Hilbert Space for Information Retrieval

Impact of a Search Engine on Clinical Decisions Under Time and System Effectiveness Constraints: Research Protocol (Preprint)

Graph-based concept weighting for medical information retrieval

Fixed-Cost Pooling Strategies

Diagnose This If You Can

Overview of the ShARe/CLEF eHealth Evaluation Lab 2014

Payoffs and pitfalls in using knowledge-bases for consumer health search

Overview of the CLEF eHealth Evaluation Lab 2016

Causality Discovery with Domain Knowledge for Drug-Drug Interactions Discovery

Choices in Knowledge-Base Retrieval for Consumer Health Search

Recursive module extraction using Louvain and PageRank

Is the unigram relevance model term independent? Classifying term dependencies in query expansion

Clinical information extraction using small data: An active learning approach based on sequence representations and word embeddings

An Analysis of Ranking Principles and Retrieval Strategies

Exploiting inference from semantic annotations for information retrieval: Reflections from medical IR

A comprehensive analysis of parameter settings for novelty-biased cumulative gain

A test collection for evaluating retrieval of studies for inclusion in systematic reviews

Has portfolio theory got any principles?

On the Volatility of Commercial Search Engines and its Impact on Information Retrieval Research

When Two Is Better Than One: A Study of Ranking Paradigms and Their Integrations for Subtopic Retrieval

A Formalization of Logical Imaging for Information Retrieval Using Quantum Theory

Deep Query Likelihood Model for Information Retrieval

Quality Matters: Understanding the Impact of Incomplete Data on Visualization Recommendation

Query Variation Performance Prediction for Systematic Reviews

Efficient Diversification for Recommending Aggregate Data Visualizations

You Can Teach an Old Dog New Tricks: Rank Fusion applied to Coordination Level Matching for Ranking in Systematic Reviews

Extracting Cancer Mortality Statistics from Death Certificates: A Hybrid Machine Learning and Rule-based Approach for Common and Rare Cancers

Counterfactual Online Learning to Rank

An Analysis of the Cost and Benefit of Search Interactions

A Computational Approach for Objectively Derived Systematic Review Search Strategies

Revisiting Sub-topic Retrieval in the ImageCLEF 2009 Photo Retrieval Task