ARDC Research Link Australia

Publication

On Combining Boosting with Rule-Induction for Automated Fruit Grading

Publisher: Springer Netherlands

Date: 2014

DOI: 10.1007/978-94-017-9115-1_21

Publication

Stream Processing of Geometric and Central Moments Using High Precision Summed Area Tables

Publisher: Springer Berlin Heidelberg

Date: 2009

DOI: 10.1007/978-3-642-02490-0_133

Publication

A parallelization model for performance characterization of Spark Big Data jobs on Hadoop clusters

Publisher: Springer Science and Business Media LLC

Date: 14-08-2021

DOI: 10.1186/S40537-021-00499-7

Abstract: This article proposes a new parallel performance model for different workloads of Spark Big Data applications running on Hadoop clusters. The proposed model can predict the runtime for generic workloads as a function of the number of executors, without necessarily knowing how the algorithms were implemented. For a certain problem size, it is shown that a model based on serial boundaries for a 2D arrangement of executors can fit the empirical data for various workloads. The empirical data was obtained from a real Hadoop cluster, using Spark and HiBench. The workloads used in this work were WordCount, SVM, Kmeans, PageRank and Graph (Nweight). A particular runtime pattern emerged when adding more executors to run a job. For some workloads, the runtime was longer with more executors added. This phenomenon is predicted by the new model of parallelisation. The resulting equation from the model explains certain performance patterns that fit neither Amdahl’s law predictions nor Gustafson’s equation. The results show that the proposed model achieved the best fit for all workloads and most of the data sizes, using the R-squared metric to measure how accurately it fits the empirical data. The proposed model has advantages over machine learning models due to its simplicity, requiring a smaller number of experiments to fit the data. This is very useful to practitioners in the area of Big Data because they can predict the runtime of specific applications by analysing the logs. In this work, the model is limited to changes in the number of executors for a fixed problem size.
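
The abstract above compares its model against Amdahl's law and Gustafson's equation and scores fits with R-squared. As a rough illustration of those two textbook baselines only (not the article's own model, whose equation is not given here), the following sketch computes both speedups and an R-squared score. The function names, the parallel fraction `p`, and the single-executor runtime `t1` are assumptions made up for the example, not values from the article.

```python
def amdahl_speedup(n, p):
    """Amdahl's law: speedup on n workers when a fraction p of the work is parallelisable."""
    return 1.0 / ((1.0 - p) + p / n)


def gustafson_speedup(n, p):
    """Gustafson's law: scaled speedup on n workers with parallel fraction p."""
    return (1.0 - p) + p * n


def r_squared(observed, predicted):
    """Coefficient of determination between observed and predicted values."""
    mean = sum(observed) / len(observed)
    ss_res = sum((o - q) ** 2 for o, q in zip(observed, predicted))
    ss_tot = sum((o - mean) ** 2 for o in observed)
    return 1.0 - ss_res / ss_tot


# Hypothetical inputs, chosen only to exercise the functions.
executors = [1, 2, 4, 8]
t1 = 100.0  # assumed runtime with one executor (arbitrary units)
p = 0.9     # assumed parallelisable fraction

# Runtime predicted by Amdahl's law for each executor count.
amdahl_runtime = [t1 / amdahl_speedup(n, p) for n in executors]
```

Under Amdahl's law the predicted runtime never increases as executors are added, which is consistent with the abstract's remark that workloads slowing down with more executors cannot be fitted by that baseline.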

Publication

Individualized Prediction of Transition to Psychosis in 1,676 Individuals at Clinical High Risk: Development and Validation of a Multivariable Prediction Model Based on Individual Patient Data Meta-An

Publisher: Frontiers Media SA

Date: 21-05-2019

DOI: 10.3389/FPSYT.2019.00345

Publication

Perspectives on the challenges of generalizability, transparency and ethics in predictive learning analytics

Publisher: Elsevier BV

Date: 12-2021

DOI: 10.1016/J.CAEO.2021.100060

Publication

Stream Processing of Integral Images for Real-Time Object Detection

Publisher: IEEE

Date: 2008

DOI: 10.1109/PDCAT.2008.46

Publication

A Modular Approach to Training Cascades of Boosted Ensembles

Publisher: Springer Berlin Heidelberg

Date: 2010

DOI: 10.1007/978-3-642-14980-1_63

Publication

An Enhanced Parallelisation Model for Performance Prediction of Apache Spark on a Multinode Hadoop Cluster

Publisher: MDPI AG

Date: 05-11-2021

DOI: 10.3390/BDCC5040065

Abstract: Big data frameworks play a vital role in storing, processing, and analysing large datasets. Apache Spark has been established as one of the most popular big data engines for its efficiency and reliability. However, one of the significant problems of the Spark system is performance prediction. Spark has more than 150 configurable parameters, and determining suitable values for so many parameters is a challenging task. In this paper, we propose two distinct parallelisation models for performance prediction. Our insight is that each node in a Hadoop cluster can communicate with identical nodes, and a certain function of the non-parallelisable runtime can be estimated accordingly. Both models use simple equations that allow us to predict the runtime when the size of the job and the number of executables are known. The proposed models were evaluated on five HiBench workloads: Kmeans, PageRank, Graph (NWeight), SVM, and WordCount. Each workload’s empirical data were fitted with one of the two models meeting the accuracy requirements. Finally, the experimental findings show that the model can be a handy and helpful tool for scheduling and planning system deployment.

Publication

A Hybrid Fuzzy-Genetic Colour Classification System with Best Colour Space Selection under Dynamically-Changing Illumination

Publisher: Springer Berlin Heidelberg

Date: 2010

DOI: 10.1007/978-3-642-17534-3_36

Publication

Navel Orange Blemish Identification for Quality Grading System

Publisher: Springer Berlin Heidelberg

Date: 2009

DOI: 10.1007/978-3-642-10684-2_75

Publication

Hybrid Fuzzy Colour Processing and Learning

Publisher: Springer Berlin Heidelberg

Date: 2008

DOI: 10.1007/978-3-540-69162-4_40

Publication

Stream processing for fast and efficient rotated Haar-like features using rotated integral images

Publisher: Inderscience Publishers

Date: 2009

DOI: 10.1504/IJISTA.2009.025105

Publication

Automatic alignment and comparison on images of petri dishes containing cell colonies

Publisher: IEEE

Date: 11-2015

DOI: 10.1109/IVCNZ.2015.7761512

Publication

Multi-Behaviour Robot Control using Genetic Network Programming with Fuzzy Reinforcement Learning

Publisher: Springer International Publishing

Date: 2015

DOI: 10.1007/978-3-319-16841-8_15

Publication

Towards 3D Human Action Recognition Using a Distilled CNN Model

Publisher: IEEE

Date: 07-2018

DOI: 10.1109/SIPROCESS.2018.8600485

Publication

Tuning Fuzzy-Based Hybrid Navigation Systems Using Calibration Maps

Publisher: Springer Berlin Heidelberg

Date: 2013

DOI: 10.1007/978-3-642-37374-9_68

Publication

Fast and Smooth Replanning for Navigation in Partially Unknown Terrain: The Hybrid Fuzzy-D*lite Algorithm

Publisher: Springer International Publishing

Date: 09-07-2201

DOI: 10.1007/978-3-319-31293-4_3

Publication

Incremental Improvement for Sub-optimal Euclidean TSP Paths Generated by Traditional Heuristics

Publisher: IEEE

Date: 16-12-2020

DOI: 10.1109/CSDE50874.2020.9411580

Publication

Toward three-dimensional human action recognition using a convolutional neural network with correctness-vigilant regularizer

Publisher: SPIE-Intl Soc Optical Eng

Date: 07-08-2018

DOI: 10.1117/1.JEI.27.4.043040

Publication

Utilization of voronoi diagrams for circularity algorithms

Publisher: Elsevier BV

Date: 05-1997

DOI: 10.1016/S0141-6359(97)00044-5

Publication

Coarse-to-fine multiclass learning and classification for time-critical domains

Publisher: Elsevier BV

Date: 06-2013

DOI: 10.1016/J.PATREC.2013.01.011

Publication

Automated assessment system for programming courses: a case study for teaching data structures and algorithms

Publisher: Springer Science and Business Media LLC

Date: 15-08-2023

DOI: 10.1007/S11423-023-10277-2

Abstract: An important course in the computer science discipline is ‘Data Structures and Algorithms’ (DSA). The coursework emphasises experiential learning to build students’ programming and algorithmic reasoning abilities. Teachers set up a repertoire of formative programming exercises to engage students with different programmatic scenarios to build their know-what, know-how and know-why competencies. Automated assessment tools can assist teachers in inspecting, marking, and grading programming exercises and also support them in providing students with formative feedback in real time. This article describes the design of a bespoke automarker that was integrated into the DSA coursework and therefore served as an instructional tool. Activity theory provided the pedagogical lens to examine how the automarker-mediated instructional strategy enabled self-reflection and assisted students in their formative learning journey. Learner experiences gathered from 39 students enrolled in the DSA course show that the automarker facilitated practice-based learning to advance students’ know-what, know-why and know-how skills. This study contributes to both curricula and pedagogic practice by showcasing the integration of an automated assessment strategy with programming-related coursework to inform future teaching and assessment practice.

Publication

Adaptive cascade of boosted ensembles for face detection in concept drift

Publisher: Springer Science and Business Media LLC

Date: 17-06-2011

DOI: 10.1007/S00521-011-0663-X

Publication

Stream processing of moment invariants for real-time classifiers

Publisher: IEEE

Date: 04-2005

DOI: 10.1109/ICARA.2000.4804024

Publication

Colour segmentation for multiple low dynamic range images using boosted cascaded classifiers

Publisher: IEEE

Date: 11-2013

DOI: 10.1109/IVCNZ.2013.6727005

Publication

An Investigation of Skeleton-Based Optical Flow-Guided Features for 3D Action Recognition Using a Multi-Stream CNN Model

Publisher: IEEE

Date: 06-2018

DOI: 10.1109/ICIVC.2018.8492894

Publication

Real-Time Fuzzy Logic-based Hybrid Robot Path-Planning Strategies for a Dynamic Environment

Publisher: IGI Global

Date: 2013

DOI: 10.4018/978-1-4666-3942-3.CH006

Abstract: This chapter sets out to explore the intricacies behind developing a hybrid system for real-time autonomous robot navigation, with target pursuit and obstacle avoidance behaviour, in a dynamic environment. Three complete systems are described, namely, a cascade of four fuzzy systems, a hybrid fuzzy A* system, and a hybrid fuzzy A* with a Voronoi diagram. A highly reconfigurable integration architecture is presented, allowing for the harmonious interplay between the different component algorithms, with the option of engaging or disengaging from the system. The utilization of both global and local information about the environment is examined, as well as an additional optimal global path-planning layer. Moreover, the chapter illustrates how a fuzzy system design approach could take advantage of symmetry in the input space, cutting down the number of rules and membership functions without sacrificing control precision. The efficiency of all the algorithms is demonstrated by employing them in a simulation of a real-world system: the robot soccer game. Results indicate that the hybrid system can generate smooth, near-shortest paths, as well as near-shortest-safest paths, when all component algorithms are activated. A systematic approach to calibrating the system is also provided.

Publication

Performance Analysis of Multi-Node Hadoop Cluster Based on Large Data Sets

Publisher: IEEE

Date: 16-12-2020

DOI: 10.1109/CSDE50874.2020.9411587

Publication

Real-time Rotationally Invariant Features for Environmental Feature Detection by Mobile Robots Sensor Networks

Publisher: IEEE

Date: 10-2007

DOI: 10.1109/ROSE.2007.4373961

Publication

Wisdom of Crowds: An Empirical Study of Ensemble-Based Feature Selection Strategies

Publisher: Springer International Publishing

Date: 2015

DOI: 10.1007/978-3-319-26350-2_47

Publication

Characterisation of the Discriminative Properties of the Radial Tchebichef Moments for Hand-written Digits

Publisher: ACM

Date: 19-11-2014

DOI: 10.1145/2683405.2683433

Publication

Adaptive Ensemble Based Learning in Non-stationary Environments with Variable Concept Drift

Publisher: Springer Berlin Heidelberg

Date: 2010

DOI: 10.1007/978-3-642-17537-4_54

Publication

Empirical evaluation of a new structure for AdaBoost

Publisher: ACM

Date: 16-03-2008

DOI: 10.1145/1363686.1364109

Publication

A comprehensive performance analysis of Apache Hadoop and Apache Spark for large scale data sets using HiBench

Publisher: Springer Science and Business Media LLC

Date: 12-2020

DOI: 10.1186/S40537-020-00388-5

Abstract: Big Data analytics for storing, processing, and analyzing large-scale datasets has become an essential tool for the industry. The advent of distributed computing frameworks such as Hadoop and Spark offers efficient solutions to analyze vast amounts of data. Due to the availability of its application programming interface (API) and its performance, Spark has become very popular, even more popular than the MapReduce framework. Both these frameworks have more than 150 parameters, and the combination of these parameters has a massive impact on cluster performance. The default system parameters help system administrators deploy their applications without much effort and measure their specific cluster performance with factory-set parameters. However, an open question remains: can new parameter selection improve cluster performance for large datasets? In this regard, this study investigates the most impactful parameters, under resource utilization, input splits, and shuffle, to compare the performance of Hadoop and Spark, using a cluster implemented in our laboratory. We used a trial-and-error approach for tuning these parameters based on a large number of experiments. To compare the frameworks, we selected two workloads: WordCount and TeraSort. Performance is measured against three criteria: execution time, throughput, and speedup. Our experimental results revealed that the performance of both systems depends heavily on input data size and correct parameter selection. The analysis of the results shows that Spark performs better than Hadoop when data sets are small, achieving up to two times speedup in WordCount workloads and up to 14 times in TeraSort workloads when default parameter values are reconfigured.

Publication

Colour Object Classification Using the Fusion of Visible and Near-Infrared Spectra

Publisher: Springer Berlin Heidelberg

Date: 2010

DOI: 10.1007/978-3-642-15246-7_46

Publication

Real-Time Fuzzy Logic-Based Hybrid Robot Path-Planning Strategies for a Dynamic Environment

Publisher: IGI Global

Date: 2013

DOI: 10.4018/978-1-4666-4607-0.CH076

Abstract: This chapter sets out to explore the intricacies behind developing a hybrid system for real-time autonomous robot navigation, with target pursuit and obstacle avoidance behaviour, in a dynamic environment. Three complete systems are described, namely, a cascade of four fuzzy systems, a hybrid fuzzy A* system, and a hybrid fuzzy A* with a Voronoi diagram. A highly reconfigurable integration architecture is presented, allowing for the harmonious interplay between the different component algorithms, with the option of engaging or disengaging from the system. The utilization of both global and local information about the environment is examined, as well as an additional optimal global path-planning layer. Moreover, the chapter illustrates how a fuzzy system design approach could take advantage of symmetry in the input space, cutting down the number of rules and membership functions without sacrificing control precision. The efficiency of all the algorithms is demonstrated by employing them in a simulation of a real-world system: the robot soccer game. Results indicate that the hybrid system can generate smooth, near-shortest paths, as well as near-shortest-safest paths, when all component algorithms are activated. A systematic approach to calibrating the system is also provided.

Publication

Reducing IO bandwidth for GPU based moment invariant classifier systems

Publisher: IEEE

Date: 05-2009

DOI: 10.1109/IMTC.2009.5168636

Publication

Adaptive Colour Calibration for Object Tracking under Spatially-Varying Illumination Environments

Publisher: Springer Berlin Heidelberg

Date: 2011

DOI: 10.1007/978-3-642-24965-5_56

Publication

Classifier and Feature Based Stereo for Mobile Robot Systems

Publisher: IEEE

Date: 05-2008

DOI: 10.1109/IMTC.2008.4547182

Publication

A Hybrid Fuzzy Q-learning algorithm for robot navigation

Publisher: IEEE

Date: 07-2011

DOI: 10.1109/IJCNN.2011.6033561

Publication

RGB-D and Thermal Sensor Fusion: A Systematic Literature Review

Publisher: Institute of Electrical and Electronics Engineers (IEEE)

Date: 2023

DOI: 10.1109/ACCESS.2023.3301119

Publication

Autonomous Navigation in Partially Known Confounding Maze-Like Terrains Using D*Lite with Poisoned Reverse

Publisher: IEEE

Date: 04-2014

DOI: 10.1109/DISA.2018.8490604

Publication

Accelerated Classifier Training Using the PSL Cascading Structure

Publisher: Springer Berlin Heidelberg

Date: 2009

DOI: 10.1007/978-3-642-02490-0_115

Publication

A New Ensemble-Based Cascaded Framework for Multiclass Training with Simple Weak Learners

Publisher: Springer Berlin Heidelberg

Date: 2011

DOI: 10.1007/978-3-642-23672-3_68

Andre Barczak

Researcher

Related Links

Related Organisations

Massey University - Albany Campus

Massey University - Auckland Campus

Bond University

Universidade Estadual De Campinas
