ARDC Research Link Australia

ORCID Profile
Orcid icon. 0000-0002-5734-3691

Current Organisations
Monash University Malaysia , University of Malaya

Does something not look right? The information on this page has been harvested from data sources that may not be up to date. We continue to work with information providers to improve coverage and quality. To report an issue, use the Feedback Form.

Publications

Publication

Clustering of Relevant Documents Based on Findability Effort in Information Retrieval

Publisher: IGI Global

Date: 06-01-2023

DOI: 10.4018/IJIRR.315764

Abstract: A user expresses their information need in the form of a query on an information retrieval (IR) system that retrieves a set of articles related to the query. The performance of the retrieval system is measured based on the retrieved content to the query, judged by expert topic assessors who are trained to find this relevant information. However, real users do not always succeed in finding relevant information in the retrieved list due to the amount of time and effort needed. This paper aims 1) to utilize the findability features to determine the amount of effort needed to find information from relevant documents using the machine learning approach and 2) to demonstrate changes in IR systems' performance when the effort is included in the evaluation. This study uses a natural language processing technique and unsupervised clustering approach to group documents by the amount of effort needed. The results show that relevant documents can be clustered using the k-means clustering approach, and the retrieval system performance varies by 23%, on average.

Publication

Evaluating the effectiveness of information retrieval systems using effort-based relevance judgment

Publisher: Emerald

Date: 21-01-2019

DOI: 10.1108/AJIM-04-2018-0086

Abstract: The effort in addition to relevance is a major factor for satisfaction and utility of the document to the actual user. The purpose of this paper is to propose a method in generating relevance judgments that incorporate effort without human judges’ involvement. Then the study determines the variation in system rankings due to low effort relevance judgment in evaluating retrieval systems at different depth of evaluation. Effort-based relevance judgments are generated using a proposed boxplot approach for simple document features, HTML features and readability features. The boxplot approach is a simple yet repeatable approach in classifying documents’ effort while ensuring outlier scores do not skew the grading of the entire set of documents. The retrieval systems evaluation using low effort relevance judgments has a stronger influence on shallow depth of evaluation compared to deeper depth. It is proved that difference in the system rankings is due to low effort documents and not the number of relevant documents. Hence, it is crucial to evaluate retrieval systems at shallow depth using low effort relevance judgments.

Publication

Association Between Logical Reasoning Ability and Quality of Relevance Judgments in Crowdsourcing

Publisher: IEEE

Date: 03-2018

DOI: 10.1109/INFRKM.2018.8464689

Publication

Quality of crowdsourced relevance judgments in association with logical reasoning ability

Publisher: Univ. of Malaya

Date: 28-12-2018

DOI: 10.22452/MJCS.SP2018NO1.6

Publication

Ranking retrieval systems using pseudo relevance judgments

Publisher: Emerald

Date: 16-11-2015

DOI: 10.1108/AJIM-03-2015-0046

Abstract: – In a system-based approach, replicating the web would require large test collections, and judging the relevancy of all documents per topic in creating relevance judgment through human assessors is infeasible. Due to the large amount of documents that requires judgment, there are possible errors introduced by human assessors because of disagreements. The paper aims to discuss these issues. – This study explores exponential variation and document ranking methods that generate a reliable set of relevance judgments (pseudo relevance judgments) to reduce human efforts. These methods overcome problems with large amounts of documents for judgment while avoiding human disagreement errors during the judgment process. This study utilizes two key factors: number of occurrences of each document per topic from all the system runs and document rankings to generate the alternate methods. – The effectiveness of the proposed method is evaluated using the correlation coefficient of ranked systems using mean average precision scores between the original Text REtrieval Conference (TREC) relevance judgments and pseudo relevance judgments. The results suggest that the proposed document ranking method with a pool depth of 100 could be a reliable alternative to reduce human effort and disagreement errors involved in generating TREC-like relevance judgments. – Simple methods proposed in this study show improvement in the correlation coefficient in generating alternate relevance judgment without human assessors while contributing to information retrieval evaluation.

Publication

Exploring Topic Difficulty in Information Retrieval Systems Evaluation

Publisher: IOP Publishing

Date: 12-2019

DOI: 10.1088/1742-6596/1339/1/012019

Abstract: Experimental or relevance assessment cost as well as reliability of an information retrieval (IR) evaluation is highly correlated to the number of topics used. The need of many assessors to produce equivalent large relevance judgments often incurs high cost and time. So, large number of topics in retrieval experiment is not practical and economical. This experiment proposes an approach to identify most effective topics in evaluating IR systems with regards to topic difficulty. The proposed approach is capable of identifying which topics and topic set size are reliable when evaluating system effectiveness. Easy topics appeared to be most suitable for effectively evaluating IR systems.

Publication

Document-based approach to improve the accuracy of pairwise comparison in evaluating information retrieval systems

Publisher: Emerald

Date: 20-07-2015

DOI: 10.1108/AJIM-12-2014-0171

Abstract: – The purpose of this paper is to propose a method to have more accurate results in comparing performance of the paired information retrieval (IR) systems with reference to the current method, which is based on the mean effectiveness scores of the systems across a set of identified topics/queries. – Based on the proposed approach, instead of the classic method of using a set of topic scores, the documents level scores are considered as the evaluation unit. These document scores are the defined document’s weight, which play the role of the mean average precision (MAP) score of the systems as a significance test’s statics. The experiments were conducted using the TREC 9 Web track collection. – The p -values generated through the two types of significance tests, namely the Student’s t -test and Mann-Whitney show that by using the document level scores as an evaluation unit, the difference between IR systems is more significant compared with utilizing topic scores. – Utilizing a suitable test collection is a primary prerequisite for IR systems comparative evaluation. However, in addition to reusable test collections, having an accurate statistical testing is a necessity for these evaluations. The findings of this study will assist IR researchers to evaluate their retrieval systems and algorithms more accurately.

Related Organisations

Organisation

Monash University Malaysia

Location: Malaysia

View Organisation

Organisation

Multimedia University Faculty Of Engineering

Location: Malaysia

View Organisation

Organisation

Universiti Malaya

Location: Malaysia

View Organisation

Organisation

University Of Malaya

Location: Malaysia

View Organisation

Related Funding Activities

No related grants have been discovered for Prabha Rajagopal.

Prabha Rajagopal

Researcher

Related Links

Publications

Clustering of Relevant Documents Based on Findability Effort in Information Retrieval

Evaluating the effectiveness of information retrieval systems using effort-based relevance judgment

Association Between Logical Reasoning Ability and Quality of Relevance Judgments in Crowdsourcing

Quality of crowdsourced relevance judgments in association with logical reasoning ability

Ranking retrieval systems using pseudo relevance judgments

Exploring Topic Difficulty in Information Retrieval Systems Evaluation

Document-based approach to improve the accuracy of pairwise comparison in evaluating information retrieval systems

Related Organisations

Monash University Malaysia

Multimedia University Faculty Of Engineering

Universiti Malaya

University Of Malaya

Related Funding Activities

ARDC NEWSLETTER SIGNUP