ORCID Profile
0000-0002-3923-3499
Current Organisation
University of Sydney
The information on this page has been harvested from data sources that may not be up to date.
In Research Link Australia (RLA), "Research Topics" refer to ANZSRC FOR and SEO codes. These topics are either sourced from ANZSRC FOR and SEO codes listed in researchers' related grants or generated by a large language model (LLM) based on their publications.
Pattern Recognition and Data Mining | Computer Hardware | Computational neuroscience (incl. mathematical neuroscience and theoretical neuroscience) | Artificial Intelligence and Image Processing | Neural networks | Microelectronics | Electronics sensors and digital hardware | Distributed and Grid Systems | Processor Architectures
Market-Based Mechanisms | Industry Costs and Structure | Technological and Organisational Innovation | Expanding Knowledge in the Information and Computing Sciences
Publisher: Institute of Electrical and Electronics Engineers (IEEE)
Date: 04-2018
Publisher: Institute of Electrical and Electronics Engineers (IEEE)
Date: 2022
Publisher: IEEE
Date: 12-2010
Publisher: IEEE
Date: 2008
Publisher: Springer International Publishing
Date: 2016
Publisher: The Optical Society
Date: 31-01-2017
DOI: 10.1364/AO.56.001113
Publisher: IEEE
Date: 12-2009
Publisher: IEEE
Date: 2005
Publisher: Institute of Electrical and Electronics Engineers (IEEE)
Date: 09-2007
Publisher: IEEE
Date: 05-2017
Publisher: IEEE
Date: 04-2008
DOI: 10.1109/FCCM.2008.19
Publisher: IEEE
Date: 07-2016
Publisher: Institute of Electrical and Electronics Engineers (IEEE)
Date: 06-2006
DOI: 10.1109/TC.2006.81
Publisher: IEEE
Date: 09-2011
DOI: 10.1109/FPL.2011.62
Publisher: Association for Computing Machinery (ACM)
Date: 22-03-2017
DOI: 10.1145/2996468
Abstract: A summary of contributions made by significant papers from the first 25 years of the Field-Programmable Logic and Applications conference (FPL) is presented. The 27 papers chosen represent those which have most strongly influenced theory and practice in the field.
Publisher: IEEE
Date: 04-2006
DOI: 10.1109/FCCM.2006.71
Publisher: IEEE
Date: 10-2019
Publisher: IEEE
Date: 2006
Publisher: IEEE
Date: 2006
Publisher: IEEE
Date: 12-2010
Publisher: IEEE
Date: 2007
Publisher: OSA
Date: 2018
Publisher: ACM
Date: 25-01-2017
Publisher: IEEE
Date: 2010
Publisher: Institute of Electrical and Electronics Engineers (IEEE)
Date: 11-2022
Publisher: Foundation for Open Access Statistics
Date: 2005
Publisher: Elsevier BV
Date: 04-2016
DOI: 10.1016/J.COMPMEDIMAG.2016.01.001
Abstract: The automatic annotation of medical images is a prerequisite for building comprehensive semantic archives that can be used to enhance evidence-based diagnosis, physician education, and biomedical research. Annotation also has important applications in the automatic generation of structured radiology reports. Much of the prior research work has focused on annotating images with properties such as the modality of the image, or the biological system or body region being imaged. However, many challenges remain for the annotation of high-level semantic content in medical images (e.g., presence of calcification, vessel obstruction, etc.) due to the difficulty in discovering relationships and associations between low-level image features and high-level semantic concepts. This difficulty is further compounded by the lack of labelled training data. In this paper, we present a method for the automatic semantic annotation of medical images that leverages techniques from content-based image retrieval (CBIR). CBIR is a well-established image search technology that uses quantifiable low-level image features to represent the high-level semantic content depicted in those images. Our method extends CBIR techniques to identify or retrieve a collection of labelled images that have similar low-level features and then uses this collection to determine the best high-level semantic annotations. We demonstrate our annotation method using weighted nearest-neighbour retrieval and multi-class classification to show that our approach is viable regardless of the underlying retrieval strategy. We experimentally compared our method with several well-established baseline techniques (classification and regression) and showed that our method achieved the highest accuracy in the annotation of liver computed tomography (CT) images.
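As a toy illustration of the annotation-by-retrieval idea (not the paper's implementation), labels can be assigned by a distance-weighted vote over the most similar labelled images; the feature vectors, labels, and `annotate` helper below are all invented for the example:

```python
import numpy as np

def annotate(query, train_feats, train_labels, k=3):
    """Sketch of CBIR-style annotation: retrieve the k labelled images
    with the most similar low-level features and take a distance-weighted
    vote over their high-level labels."""
    d = np.linalg.norm(train_feats - query, axis=1)   # feature-space distances
    idx = np.argsort(d)[:k]                           # k nearest labelled images
    w = 1.0 / (d[idx] + 1e-9)                         # closer neighbours vote more
    votes = {}
    for i, wi in zip(idx, w):
        votes[train_labels[i]] = votes.get(train_labels[i], 0.0) + wi
    return max(votes, key=votes.get)

feats = np.array([[0.0, 0.0], [0.1, 0.0], [5.0, 5.0], [5.1, 5.0]])
labels = ["normal", "normal", "calcification", "calcification"]
print(annotate(np.array([0.05, 0.02]), feats, labels))
```

The vote makes the approach robust to a single mislabelled neighbour, which is the property the paper exploits to cope with scarce labelled training data.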
Publisher: Springer Berlin Heidelberg
Date: 2003
Publisher: IEEE
Date: 2006
Publisher: Institute of Electrical and Electronics Engineers (IEEE)
Date: 1995
DOI: 10.1109/72.471380
Abstract: The design, implementation, and operation of a low-power multilayer perceptron chip (Kakadu) in the framework of a cardiac arrhythmia classification system is presented in this paper. This classifier, called MATIC, makes timing decisions using a decision tree, and a neural network is used to identify heartbeats with abnormal morphologies. This classifier was designed to be suitable for use in implantable devices and a VLSI (very large scale integration) neural-network chip (Kakadu) was designed so that the computationally expensive neural-network algorithm can be implemented with low power consumption. Kakadu implements a (10,6,4) perceptron and has a typical power consumption of tens of microwatts. When used with the arrhythmia classification system, the chip can operate with an average power consumption of less than 25 nW.
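For readers unfamiliar with the notation, a (10,6,4) perceptron has 10 inputs, 6 hidden units, and 4 outputs. A minimal software forward-pass sketch with placeholder random weights (the Kakadu chip itself is analogue VLSI hardware, and its trained weights are not reproduced here):

```python
import numpy as np

rng = np.random.default_rng(0)

def mlp_forward(x, w1, b1, w2, b2):
    """Forward pass of a (10, 6, 4) multilayer perceptron of the shape
    described for the Kakadu chip; weights are random placeholders,
    not the trained arrhythmia classifier."""
    h = np.tanh(x @ w1 + b1)        # 6 hidden units
    return np.tanh(h @ w2 + b2)     # 4 outputs, e.g. one per beat class

w1, b1 = rng.normal(size=(10, 6)), np.zeros(6)
w2, b2 = rng.normal(size=(6, 4)), np.zeros(4)
y = mlp_forward(rng.normal(size=10), w1, b1, w2, b2)
print(y.shape)   # (4,)
```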
Publisher: IEEE
Date: 2005
Publisher: IEEE
Date: 05-2020
Publisher: IEEE
Date: 08-2010
DOI: 10.1109/FPL.2010.89
Publisher: IEEE
Date: 05-2014
DOI: 10.1109/FCCM.2014.46
Publisher: IEEE
Date: 05-2015
DOI: 10.1109/FCCM.2015.11
Publisher: Imperial College Press
Date: 26-02-2015
Publisher: IEEE
Date: 09-2015
Publisher: Wiley
Date: 09-1992
DOI: 10.1111/J.1540-8159.1992.TB03142.X
Abstract: The use of an additional atrial sensing electrode together with a morphology recognition algorithm provides a significant improvement in classification performance over the current rate based algorithms used in implantable cardioverter defibrillator (ICD) devices. The classification system, called morphology and timing intracardiac classifier (MATIC), follows a classification process similar to that used by cardiologists. Timing between the atrial and ventricular channels is examined using a decision tree and forms the primary criterion for arrhythmia classification. A neural network based morphology classifier is used for cases such as ventricular tachycardia with 1:1 retrograde conduction where timing alone cannot make a reliable decision. MATIC achieves 99.6% correct classification on a database of intracardiac electrogram (ICEG) signals containing 12,483 QRS complexes recorded from 67 patients during electrophysiological studies. Arrhythmias in this database include sinus tachycardia, normal sinus rhythm, normal sinus rhythm with bundle branch block, sinus tachycardia with bundle branch block, atrial fibrillation (AF), various supraventricular tachycardias, ventricular tachycardia, ventricular tachycardia with 1:1 retrograde conduction, and ventricular fibrillation. Within these arrhythmias, there were numerous ventricular ectopic beats, fusion beats, noise, and other artifacts. MATIC addresses the classification problem from start to finish, inputs being raw intracardiac electrogram signals and the outputs being the recommended ICD therapy. Results achieved with MATIC were compared with a classifier used in the Telectronics Guardian ATP 4210, which achieved 75.9% correct classification on the same database. MATIC is simple and efficient, making it suitable for use in a low power implantable device.
Publisher: IEEE
Date: 2008
Publisher: IEEE
Date: 08-2009
Publisher: Walter de Gruyter GmbH
Date: 2000
Publisher: Association for Computing Machinery (ACM)
Date: 22-06-2023
DOI: 10.1145/3567429
Abstract: The spectral correlation density (SCD) is an important tool in cyclostationary signal detection and classification. Even using efficient techniques based on the fast Fourier transform (FFT), real-time implementations are challenging because of the high computational complexity. A key dimension for computational optimization lies in minimizing the wordlength employed. In this article, we analyze the relationship between wordlength and the signal-to-quantization-noise ratio (SQNR) in fixed-point implementations of the SCD function. A canonical SCD estimation algorithm, the FFT accumulation method (FAM) using fixed-point arithmetic, is studied. We derive closed-form expressions for SQNR and compare them at wordlengths ranging from 14 to 26 bits. The differences between the calculated SQNR and bit-exact simulations are less than 1 dB. Furthermore, an HLS-based FPGA design is implemented on a Xilinx Zynq UltraScale+ XCZU28DR-2FFVG1517E RFSoC. Using less than 25% of the logic fabric on the device, it consumes 7.7 W total on-chip power and has a power efficiency of 12.4 GOPS/W, which is an order of magnitude improvement over an Nvidia Tesla K40 graphics processing unit (GPU) implementation. In terms of throughput, it achieves 50 MS/sec, which is a speedup of 1.6 over a recent optimized FPGA implementation.
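The wordlength/SQNR trade-off the abstract studies can be sketched numerically: quantize a full-scale signal to a fixed-point grid and measure the resulting SQNR, which grows by roughly 6 dB per extra bit. This is a generic illustration, not the paper's closed-form analysis of the FAM datapath:

```python
import numpy as np

def sqnr_db(x, bits):
    """Quantize x (assumed to lie in [-1, 1)) to a signed fixed-point
    grid with `bits` total bits and return the signal-to-quantization-
    noise ratio in dB."""
    scale = 2.0 ** (bits - 1)
    xq = np.round(x * scale) / scale     # round-to-nearest quantization
    noise = x - xq
    return 10 * np.log10(np.mean(x ** 2) / np.mean(noise ** 2))

rng = np.random.default_rng(0)
x = rng.uniform(-1, 1, 100_000)
for b in (14, 20, 26):                   # the wordlength range studied
    print(b, round(sqnr_db(x, b), 1))    # roughly 6 dB per extra bit
```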
Publisher: IEEE
Date: 12-2013
Publisher: Institution of Engineering and Technology (IET)
Date: 1999
DOI: 10.1049/EL:19991132
Publisher: Institute of Electrical and Electronics Engineers (IEEE)
Date: 12-2011
Publisher: Institute of Electrical and Electronics Engineers (IEEE)
Date: 10-2002
Publisher: Wiley
Date: 04-01-2017
DOI: 10.1111/COIN.12083
Publisher: IEEE
Date: 10-2006
Publisher: Association for Computing Machinery (ACM)
Date: 21-06-2023
DOI: 10.1145/3568992
Abstract: Machine learning ensembles combine multiple base models to produce a more accurate output. They can be applied to a range of machine learning problems, including anomaly detection. In this article, we investigate how to maximize the composability and scalability of an FPGA-based streaming ensemble anomaly detector (fSEAD). To achieve this, we propose a flexible computing architecture consisting of multiple partially reconfigurable regions, pblocks, which each implement anomaly detectors. Our proof-of-concept design supports three state-of-the-art anomaly detection algorithms: Loda, RS-Hash, and xStream. Each algorithm is scalable, meaning multiple instances can be placed within a pblock to improve performance. Moreover, fSEAD is implemented using High-level synthesis (HLS), meaning further custom anomaly detectors can be supported. Pblocks are interconnected via an AXI-switch, enabling them to be composed in an arbitrary fashion before combining and merging results at runtime to create an ensemble that maximizes the use of FPGA resources and accuracy. Through utilizing reconfigurable Dynamic Function eXchange (DFX), the detector can be modified at runtime to adapt to changing environmental conditions. We compare fSEAD to an equivalent central processing unit (CPU) implementation using four standard datasets, with speedups ranging from 3× to 8×.
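Independently of the FPGA architecture, the score-merging idea behind an anomaly-detection ensemble can be illustrated in software; the min-max normalisation and the two toy detectors below are invented for the example and are not drawn from fSEAD:

```python
import numpy as np

def ensemble_scores(detectors, X):
    """Average the min-max-normalised anomaly scores of several base
    detectors -- a software stand-in for merging per-detector results
    into one ensemble score."""
    combined = []
    for score_fn in detectors:
        s = np.array([score_fn(x) for x in X], dtype=float)
        s = (s - s.min()) / (s.max() - s.min() + 1e-12)  # common [0, 1] scale
        combined.append(s)
    return np.mean(combined, axis=0)

X = np.array([0.1, 0.2, 0.15, 5.0])        # last point is the anomaly
det_mean = lambda x: abs(x - X.mean())     # distance from the sample mean
det_mag = lambda x: abs(x)                 # raw magnitude
scores = ensemble_scores([det_mean, det_mag], X)
print(scores.argmax())                     # index of the outlier, 3
```

Normalising before averaging matters because base detectors produce scores on incompatible scales, which is also why the hardware design merges results only after each pblock has finished scoring.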
Publisher: Institute of Electrical and Electronics Engineers (IEEE)
Date: 10-2017
Publisher: Institute of Electrical and Electronics Engineers (IEEE)
Date: 08-2017
Publisher: Springer Science and Business Media LLC
Date: 08-11-2008
Publisher: Institution of Engineering and Technology (IET)
Date: 2007
Publisher: Springer Berlin Heidelberg
Date: 1999
Publisher: IEEE
Date: 2005
Publisher: IEEE
Date: 2005
Publisher: Springer Berlin Heidelberg
Date: 2004
Publisher: IEEE
Date: 04-2015
Publisher: IEEE
Date: 2008
Publisher: IEEE
Date: 2008
Publisher: IEEE
Date: 12-2013
Publisher: Springer Berlin Heidelberg
Date: 2002
Publisher: Institute of Electrical and Electronics Engineers (IEEE)
Date: 2007
DOI: 10.1109/MPRV.2007.4
Publisher: ACM
Date: 15-02-2018
Publisher: Springer Singapore
Date: 2015
Publisher: IEEE
Date: 09-2011
DOI: 10.1109/FPL.2011.105
Publisher: SAGE Publications
Date: 2012
DOI: 10.1255/JNIRS.975
Abstract: Optical coherence tomography (OCT) is a technique that is able to provide cross section views of tissue layers. This fast and non-invasive method is widely used in clinical applications for the diagnosis and treatment of certain diseases. Although conventional OCT is derived from the theory of interferometric imaging, emerging developments, including spectroscopic OCT and related techniques such as dual-band OCT and Raman spectroscopy–OCT, have resulted in significantly improved clinical capabilities for observing the tissue layers through enhanced tissue definition, image resolution, image contrast and scanning speed. This paper reviews the state-of-the-art developments of OCT. It starts with a general introduction of conventional interferometric OCT imaging methods including the time-domain and frequency-domain techniques. The second section explores the advances introduced from spectroscopy techniques in OCT, especially with spectroscopic OCT, dual-band OCT and Raman spectroscopy combined OCT. The final section discusses the current challenges in the application of approaches based on computer-aided diagnosis (CAD) for retinal imaging, for example automated segmentation of tissue layers and tracking disease progression. This task is currently limited by the quality of the recorded data from OCT systems but will be improved by adopting spectroscopic techniques. Finally, we analyse and discuss the improvements that are expected in retinal CAD from the adoption of newly emerging near infrared spectroscopy OCT at multiple wavelengths.
Publisher: ACM
Date: 15-02-2018
Publisher: IEEE Comput. Soc. Press
Date: 1993
Publisher: Acoustical Society of America (ASA)
Date: 2000
DOI: 10.1121/1.428350
Abstract: A computational model of auditory localization resulting in performance similar to humans is reported. The model incorporates both the monaural and binaural cues available to a human for sound localization. Essential elements used in the simulation of the processes of auditory cue generation and encoding by the nervous system include measured head-related transfer functions (HRTFs), minimum audible field (MAF), and the Patterson–Holdsworth cochlear model. A two-layer feed-forward back-propagation artificial neural network (ANN) was trained to transform the localization cues to a two-dimensional map that gives the direction of the sound source. The model results were compared with (i) the localization performance of the human listener who provided the HRTFs for the model and (ii) the localization performance of a group of 19 other human listeners. The localization accuracy and front–back confusion error rates exhibited by the model were similar to both the single listener and the group results. This suggests that the simulation of the cue generation and extraction processes as well as the model parameters were reasonable approximations to the overall biological processes. The amplitude resolution of the monaural spectral cues was varied and the influence on the model’s performance was determined. The model with 128 cochlear channels required an amplitude resolution of approximately 20 discrete levels for encoding the spectral cue to deliver similar localization performance to the group of human listeners.
Publisher: Institute of Electrical and Electronics Engineers (IEEE)
Date: 02-2001
DOI: 10.1109/92.920833
Publisher: IEEE Comput. Soc
Date: 2000
Publisher: Springer Science and Business Media LLC
Date: 21-03-2016
DOI: 10.1038/NCOMMS10853
Abstract: It is a fundamental challenge in quantum optics to deterministically generate indistinguishable single photons through non-deterministic nonlinear optical processes, due to the intrinsic coupling of single- and multi-photon-generation probabilities in these processes. Actively multiplexing photons generated in many temporal modes can decouple these probabilities, but key issues are to minimize resource requirements to allow scalability, and to ensure indistinguishability of the generated photons. Here we demonstrate the multiplexing of photons from four temporal modes solely using fibre-integrated optics and off-the-shelf electronic components. We show a 100% enhancement to the single-photon output probability without introducing additional multi-photon noise. Photon indistinguishability is confirmed by a fourfold Hong–Ou–Mandel quantum interference with a 91±16% visibility after subtracting multi-photon noise due to high pump power. Our demonstration paves the way for scalable multiplexing of many non-deterministic photon sources to a single near-deterministic source, which will be of benefit to future quantum photonic technologies.
Publisher: IEEE
Date: 2006
Publisher: IEEE Comput. Soc
Date: 2000
Publisher: IEEE
Date: 04-2009
Publisher: Institute of Electrical and Electronics Engineers (IEEE)
Date: 07-2017
Publisher: Institute of Electrical and Electronics Engineers (IEEE)
Date: 2020
Publisher: IEEE
Date: 04-2015
Publisher: IEEE
Date: 12-2016
Publisher: Institute of Electrical and Electronics Engineers (IEEE)
Date: 03-2013
Publisher: Association for Computing Machinery (ACM)
Date: 16-04-2015
DOI: 10.1145/2665073
Abstract: Runtime analysis provides an effective method for measuring the sensitivity of programs to rounding errors. To date, implementations have required significant changes to source code, detracting from their widespread application. In this work, we present an open source system that automates the quantitative analysis of floating point rounding errors through the use of C-based source-to-source compilation and a Monte Carlo arithmetic library. We demonstrate its application to the comparison of algorithms, detection of catastrophic cancellation, and determination of whether single precision floating point provides sufficient accuracy for a given application. Methods for obtaining quantifiable measurements of sensitivity to rounding error are also detailed.
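The core idea the abstract describes can be caricatured in a few lines: evaluate a computation many times under small random relative perturbations and compare the spread of the results. This is a hedged plain-Python sketch, not the paper's C-based source-to-source system; the `noisy` helper and the epsilon value are illustrative choices:

```python
import random
import statistics

random.seed(0)

def noisy(x, eps=2 ** -24):
    """Return x with a random relative perturbation of order eps,
    emulating the random rounding of Monte Carlo arithmetic."""
    return x * (1 + random.uniform(-eps, eps))

def relative_spread(f, trials=2000):
    """Evaluate f many times under random perturbation; a large spread
    relative to the mean flags sensitivity to rounding error."""
    samples = [f(noisy) for _ in range(trials)]
    m = statistics.mean(samples)
    return statistics.stdev(samples) / abs(m)

stable = lambda n: n(1e-8)                      # benign computation
cancel = lambda n: (n(1.0) + n(1e-8)) - n(1.0)  # catastrophic cancellation
print(relative_spread(stable) < relative_spread(cancel))
```

The cancelling computation loses almost all significant digits to the perturbation, so its relative spread is many orders of magnitude larger, which is exactly the signal the runtime analysis looks for.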
Publisher: Institute of Electrical and Electronics Engineers (IEEE)
Date: 10-2005
DOI: 10.1109/MC.2005.337
Publisher: IEEE
Date: 2004
DOI: 10.1109/FCCM.2004.14
Publisher: IEEE Comput. Soc
Date: 2002
Publisher: IOP Publishing
Date: 24-04-2012
DOI: 10.1088/0967-3334/33/5/817
Abstract: Electrode contact impedance is a crucial factor in physiological measurements and can be an accuracy-limiting factor when performing electroencephalography and electrical impedance tomography. In this work, standard flat electrodes and micromachined multipoint spiked electrodes are characterized with a finite-element method electromagnetic solver and the dependence of the contact impedance on geometrical factors is explored. It is found that flat electrodes are sensitive to changes in the outer skin layer properties related to hydration and thickness, while spike electrodes are not. The impedance as a function of the effective contact area, number of spikes and penetration depth has also been studied and characterized.
Publisher: Association for Computing Machinery (ACM)
Date: 10-05-2022
DOI: 10.1145/3503465
Abstract: Recent years have seen an explosion of machine learning applications implemented on Field-Programmable Gate Arrays (FPGAs). FPGA vendors and researchers have responded by updating their fabrics to more efficiently implement machine learning accelerators, including innovations such as enhanced Digital Signal Processing (DSP) blocks and hardened systolic arrays. Evaluating architectural proposals is difficult, however, due to the lack of publicly available benchmark circuits. This paper addresses this problem by presenting an open-source benchmark circuit generator that creates realistic DNN-oriented circuits for use in FPGA architecture studies. Unlike previous generators, which create circuits that are agnostic of the underlying FPGA, our circuits explicitly instantiate embedded blocks, allowing for meaningful comparison of recent architectural proposals without the need for a complete inference computer-aided design (CAD) flow. Our circuits are compatible with the VTR CAD suite, allowing for architecture studies that investigate routing congestion and other low-level architectural implications. In addition to addressing the lack of machine learning benchmark circuits, the architecture exploration flow that we propose allows for a more comprehensive evaluation of FPGA architectures than traditional static benchmark suites. We demonstrate this through three case studies which illustrate how realistic benchmark circuits can be generated to target different heterogeneous FPGAs.
Publisher: IEEE
Date: 03-2008
Publisher: Springer Science and Business Media LLC
Date: 18-06-2003
Publisher: IEEE Comput. Soc
Date: 2002
Publisher: Elsevier BV
Date: 02-2017
Publisher: IEEE
Date: 06-2018
Publisher: Association for Computing Machinery (ACM)
Date: 13-02-2020
DOI: 10.1145/3376924
Abstract: Kernel adaptive filters (KAFs) are non-linear filters which can adapt temporally and have the additional benefit of being computationally efficient through use of the “kernel trick”. In a number of real-world applications, such as channel equalisation, the non-linear mapping provides significant improvements over conventional linear techniques such as the least mean squares (LMS) and recursive least squares (RLS) algorithms. Prior works have focused mainly on the theory and accuracy of KAFs, with little research on their implementations. This article proposes several variants of algorithms based on the kernel normalised least mean squares (KNLMS) algorithm which utilise a delayed model update to minimise dependencies. Subsequently, this work proposes corresponding hardware architectures which utilise this delayed model update to achieve high sample rates and low latency while also providing high modelling accuracy. The resultant delayed KNLMS (DKNLMS) algorithms can achieve clock rates up to 12× higher than the standard KNLMS algorithm, with minimal impact on accuracy and stability. A system implementation achieves 250 GOps/s and a throughput of 187.4 MHz on an Ultra96 board with 1.8× higher throughput than previous state of the art.
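For readers unfamiliar with KNLMS, the underlying (non-delayed) algorithm can be sketched in software. This is a generic textbook-style KNLMS with a Gaussian kernel and coherence-based dictionary growth, not the paper's DKNLMS hardware design; all hyperparameters are illustrative:

```python
import numpy as np

def gauss(x, y, gamma=1.0):
    """Gaussian kernel between two sample vectors."""
    return np.exp(-gamma * np.sum((x - y) ** 2))

class KNLMS:
    """Minimal textbook KNLMS with coherence-based dictionary growth."""
    def __init__(self, eta=0.5, eps=1e-2, mu0=0.9):
        self.eta, self.eps, self.mu0 = eta, eps, mu0
        self.dict, self.alpha = [], []

    def step(self, x, d):
        """One online update; returns the a-priori prediction error."""
        if not self.dict:
            self.dict, self.alpha = [x], [0.0]
        k = np.array([gauss(x, c) for c in self.dict])
        if k.max() < self.mu0:          # novel input: grow the dictionary
            self.dict.append(x)
            self.alpha.append(0.0)
            k = np.append(k, 1.0)
        a = np.array(self.alpha)
        e = d - a @ k                   # prediction error
        self.alpha = list(a + self.eta * e * k / (self.eps + k @ k))
        return e

# learn y = sin(x) online from streaming samples
f = KNLMS()
rng = np.random.default_rng(1)
for x in rng.uniform(0, 3, 500):
    f.step(np.array([x]), np.sin(x))
print(len(f.dict), "dictionary atoms")
```

Note the loop-carried dependency: each `step` reads the weights the previous `step` wrote. That dependency is what limits a pipelined hardware implementation's clock rate, and it is exactly what the paper's delayed model update relaxes.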
Publisher: IEEE
Date: 12-2007
Publisher: IEEE
Date: 09-2017
Publisher: IEEE
Date: 05-2014
Publisher: IEEE
Date: 2006
Publisher: Springer Science and Business Media LLC
Date: 06-1993
DOI: 10.1007/BF01581960
Publisher: Institution of Engineering and Technology (IET)
Date: 1999
DOI: 10.1049/EL:19990998
Publisher: Springer International Publishing
Date: 2017
Publisher: IEEE
Date: 08-2018
Publisher: Institution of Engineering and Technology (IET)
Date: 2012
DOI: 10.1049/EL.2011.3651
Publisher: IEEE
Date: 12-2008
Publisher: IEEE Comput. Soc
Date: 2003
Publisher: Institute of Electrical and Electronics Engineers (IEEE)
Date: 04-2017
Publisher: Elsevier BV
Date: 12-1997
DOI: 10.1016/S0378-5955(97)00161-5
Abstract: Measurement of localization performance will reflect errors that relate to the sensory processing of the cues to sound location and the errors associated with the method by which the subject indicates the perceived location. This study has measured the ability of human subjects to localize a short noise burst presented in the free field with the subject indicating the perceived location by pointing their nose towards the source. Subjects were first trained using a closed loop training paradigm which involved instantaneous feedback as to the accuracy of head pointing which resulted in the reduction of residual localization errors and a rapid acquisition of the task by the subjects. Once trained, 19 subjects localized between 4 and 6 blocks of 76 target locations. The data were pooled and the distribution of errors associated with each target location was examined using spherical methods. Errors in the localization estimates for about one third of the locations were rotationally symmetrical about their mean but the remaining locations were best described by an elliptical distribution (Kent distributed). For about one half of the latter locations the orientations of the directions of the greatest variance of the distributions were not aligned with the azimuth and elevation coordinates used for describing the spatial location of the targets. The accuracy (systematic errors) and the distribution of the errors (variance) in localization for our population of subjects were also examined for each test location. The size of the data set and the methods of analysis provide very reliable measures of important baseline parameters of human auditory localization.
Publisher: IEEE
Date: 04-2010
Publisher: IEEE
Date: 09-2015
Publisher: IEEE
Date: 12-2019
Publisher: ACM
Date: 12-02-2023
Publisher: IEEE
Date: 08-2018
Publisher: Springer Berlin Heidelberg
Date: 2001
Publisher: IEEE Comput. Soc. Press
Date: 1995
Publisher: IEEE
Date: 03-2008
Publisher: The Optical Society
Date: 11-10-2017
DOI: 10.1364/OE.25.026067
Publisher: IEEE
Date: 09-2015
Publisher: Foundation for Open Access Statistics
Date: 2016
Publisher: Institute of Electrical and Electronics Engineers (IEEE)
Date: 02-2020
Publisher: Springer-Verlag
Date: 2005
Publisher: Institute of Electrical and Electronics Engineers (IEEE)
Date: 05-2012
Publisher: IEEE
Date: 2005
Publisher: IEEE
Date: 12-2016
Publisher: IEEE
Date: 04-2019
Publisher: IEEE
Date: 08-2009
Publisher: Institute of Electrical and Electronics Engineers (IEEE)
Date: 12-2009
Publisher: ACM
Date: 22-02-2012
Publisher: IEEE Comput. Soc
Date: 1998
Publisher: ACM
Date: 27-02-2011
Publisher: ACM
Date: 18-02-2007
Publisher: IEEE
Date: 2002
Publisher: Springer Singapore
Date: 2015
Publisher: Springer Singapore
Date: 2015
Publisher: IEEE
Date: 12-2019
Publisher: Springer Singapore
Date: 2015
Publisher: Springer Singapore
Date: 2015
Publisher: IEEE
Date: 08-2015
Publisher: Infopro Digital Services Limited
Date: 2016
Publisher: IEEE
Date: 03-2018
Publisher: IEEE
Date: 04-2013
Publisher: Association for Computing Machinery (ACM)
Date: 15-12-2017
DOI: 10.1145/3106744
Abstract: Kernel adaptive filters (KAFs) are online machine learning algorithms which are amenable to highly efficient streaming implementations. They require only a single pass through the data and can act as universal approximators, i.e. approximate any continuous function with arbitrary accuracy. KAFs are members of a family of kernel methods which apply an implicit non-linear mapping of input data to a high dimensional feature space, permitting learning algorithms to be expressed entirely as inner products. Such an approach avoids explicit projection into the feature space, enabling computational efficiency. In this paper, we propose the first fully pipelined implementation of the kernel normalised least mean squares algorithm for regression. Independent training tasks necessary for hyperparameter optimisation fill pipeline stages, so no stall cycles to resolve dependencies are required. Together with other optimisations to reduce resource utilisation and latency, our core achieves 161 GFLOPS on a Virtex 7 XC7VX485T FPGA for a floating point implementation and 211 GOPS for fixed point. Our PCI Express based floating-point system implementation achieves 80% of the core’s speed, this being a speedup of 10× over an optimised implementation on a desktop processor and 2.66× over a GPU.
Publisher: ACM
Date: 22-02-2009
Publisher: Association for Computing Machinery (ACM)
Date: 24-09-2016
DOI: 10.1145/2950061
Abstract: Kernel methods utilize linear methods in a nonlinear feature space and combine the advantages of both. Online kernel methods, such as kernel recursive least squares (KRLS) and kernel normalized least mean squares (KNLMS), perform nonlinear regression in a recursive manner, with similar computational requirements to linear techniques. In this article, an architecture for a microcoded kernel method accelerator is described, and high-performance implementations of sliding-window KRLS, fixed-budget KRLS, and KNLMS are presented. The architecture utilizes pipelining and vectorization for performance, and microcoding for reusability. The design can be scaled to allow tradeoffs between capacity, performance, and area. The design is compared with a central processing unit (CPU), digital signal processor (DSP), and Altera OpenCL implementations. In different configurations on an Altera Arria 10 device, our SW-KRLS implementation delivers floating-point throughput of approximately 16 GFLOPS, latency of 5.5 μs, and energy consumption of 10⁻⁴ J, these being improvements over a CPU by factors of 12, 17, and 24, respectively.
Publisher: Institute of Electrical and Electronics Engineers (IEEE)
Date: 2020
Publisher: IEEE
Date: 1999
Publisher: Institute of Electrical and Electronics Engineers (IEEE)
Date: 08-2023
Publisher: IEEE
Date: 04-2012
DOI: 10.1109/FCCM.2012.16
Publisher: IEEE
Date: 2006
Publisher: IEEE Comput. Soc
Date: 1998
Publisher: IEEE
Date: 2007
Publisher: IEEE
Date: 2002
Publisher: IEEE
Date: 05-2018
Publisher: IEEE
Date: 2005
Publisher: OSA
Date: 2017
Publisher: ACM
Date: 23-02-2020
Publisher: Institute of Electrical and Electronics Engineers (IEEE)
Date: 2018
Publisher: IEEE
Date: 05-2011
DOI: 10.1109/FCCM.2011.51
Publisher: Institute of Electrical and Electronics Engineers (IEEE)
Date: 12-2012
Publisher: IEEE Comput. Soc
Date: 2003
Publisher: Elsevier BV
Date: 04-1998
DOI: 10.1016/S0165-0270(97)00201-X
Abstract: A systematic analysis of the localization of objects in extra-personal space requires a three-dimensional method of documenting location. In auditory localization studies the location of a sound source is often reduced to a directional vector with constant magnitude with respect to the observer, data being plotted on a unit sphere with the observer at the origin. This is an attractive form of data representation as the relevant spherical statistical and graphical methods are well described. In this paper we collect together a set of spherical plotting and statistical procedures to visualize and summarize these data. We describe methods for visualizing auditory localization data without assuming that the principal components of the data are aligned with the coordinate system. As a means of comparing experimental techniques and having a common set of data for the verification of spherical statistics, the software (implemented in MATLAB) and database described in this paper have been placed in the public domain. Although originally intended for the visualization and summarization of auditory psychophysical data, these routines are sufficiently general to be applied in other situations involving spherical data.
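As a flavour of the spherical methods the abstract refers to (though not the MATLAB routines themselves), localization responses can be mapped to unit vectors and summarized by a mean direction and resultant length; the azimuth/elevation values below are invented:

```python
import numpy as np

def to_unit(az_deg, el_deg):
    """Convert azimuth/elevation (degrees) to unit vectors on the sphere."""
    az, el = np.radians(az_deg), np.radians(el_deg)
    return np.column_stack([np.cos(el) * np.cos(az),
                            np.cos(el) * np.sin(az),
                            np.sin(el)])

def mean_direction(vecs):
    """Spherical mean direction and resultant length R in [0, 1];
    R near 1 indicates a tight cluster of responses."""
    s = vecs.sum(axis=0)
    r = np.linalg.norm(s)
    return s / r, r / len(vecs)

# four responses clustered near azimuth 10 deg, elevation 0 deg
pts = to_unit([10, 12, 8, 11], [0, 2, -1, 1])
mu, R = mean_direction(pts)
print(np.round(mu, 3), round(R, 4))
```

Working on the unit sphere rather than on raw azimuth/elevation angles avoids the distortions near the poles that make planar statistics misleading for localization data.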
Publisher: IEEE
Date: 05-2011
DOI: 10.1109/FCCM.2011.57
Publisher: Optica Publishing Group
Date: 06-08-2015
Publisher: Institute of Electrical and Electronics Engineers (IEEE)
Date: 08-2005
Publisher: World Scientific Pub Co Pte Ltd
Date: 12-1993
DOI: 10.1142/S0129065793000316
Abstract: An analogue neural network VLSI chip designed for low power operation is presented. This chip consists of 84 synapse elements arranged as arrays of size 10 × 6 and 6 × 4 and was fabricated using a standard 1.2 μm double metal single poly CMOS process. The synapses are digitally programmable and static weight storage is provided. The chip has a typical power consumption of tens of microwatts. It has been successfully trained and tested on a range of classification problems including 4-bit parity, character recognition and morphological-based classification of intracardiac electrogram signals.
Publisher: IEEE
Date: 08-2019
Publisher: Institute of Electrical and Electronics Engineers (IEEE)
Date: 02-2003
Publisher: Association for Computing Machinery (ACM)
Date: 18-10-2019
DOI: 10.1145/3359983
Abstract: The computational complexity of neural networks for large-scale or real-time applications necessitates hardware acceleration. Most approaches assume that the network architecture and parameters are unknown at design time, permitting usage in a large number of applications. This article demonstrates, for the case where the neural network architecture and ternary weight values are known a priori, that extremely high throughput implementations of neural network inference can be made by customising the datapath and routing to remove unnecessary computations and data movement. This approach is ideally suited to FPGA implementations as a specialized implementation of a trained network improves efficiency while still retaining generality with the reconfigurability of an FPGA. A VGG-style network with ternary weights and fixed point activations is implemented for the CIFAR10 dataset on Amazon’s AWS F1 instance. This article demonstrates how to remove 90% of the operations in convolutional layers by exploiting sparsity and compile-time optimizations. The implementation in hardware achieves 90.9 ± 0.1% accuracy and 122k frames per second, with a latency of only 29 µs, which is the fastest CNN inference implementation reported so far on an FPGA.
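The operation-removal claim follows directly from weight sparsity: with weights fixed at compile time, every zero ternary weight eliminates a multiply-accumulate from the datapath entirely. A toy count with a made-up weight tensor at roughly 90% sparsity (the actual network's sparsity pattern comes from training, not from random sampling):

```python
import numpy as np

rng = np.random.default_rng(0)
# illustrative ternary weight tensor: values in {-1, 0, +1}, mostly zero
w = rng.choice([-1, 0, 1], size=(64, 64, 3, 3), p=[0.05, 0.9, 0.05])

# a zero weight contributes nothing, so its MAC can be removed at
# compile time; the remaining +/-1 weights need only an add or subtract
total_ops = w.size
kept_ops = np.count_nonzero(w)
print(f"{100 * (1 - kept_ops / total_ops):.0f}% of MACs removed")
```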
Publisher: IEEE
Date: 1999
Publisher: IEEE
Date: 12-2009
Publisher: IEEE
Date: 08-2014
Publisher: American Physiological Society
Date: 10-2017
DOI: 10.1152/JAPPLPHYSIOL.00726.2016
Abstract: The forced oscillation technique (FOT) can provide unique and clinically relevant lung function information while requiring little cooperation from subjects. However, FOT has higher variability than spirometry, possibly because strategies for quality control and reducing artifacts in FOT measurements have yet to be standardized or validated. Many quality control procedures rely on either simple statistical filters or subjective evaluation by a human operator. In this study, we propose an automated artifact removal approach based on the resistance against flow profile, applied to complete breaths. We report results obtained from data recorded from children and adults, with and without asthma. Our proposed method has 76% agreement with a human operator for the adult data set and 79% for the pediatric data set. Furthermore, we assessed the variability of respiratory resistance measured by FOT using within-session variation (wCV) and between-session variation (bCV). In the asthmatic adults test data set, our method was again similar to that of the manual operator for wCV (6.5 vs. 6.9%) and significantly improved bCV (8.2 vs. 8.9%). Our combined automated breath removal approach based on advanced feature extraction offers better or equivalent quality control of FOT measurements compared with an expert operator and computationally more intensive methods, in terms of accuracy and reducing intrasubject variability. NEW & NOTEWORTHY The forced oscillation technique (FOT) is gaining wider acceptance for clinical testing; however, strategies for quality control are still highly variable and require a high level of subjectivity. We propose an automated, complete breath approach for removal of respiratory artifacts from FOT measurements, using feature extraction and an interquartile range filter. Our approach offers better or equivalent performance compared with an expert operator, in terms of accuracy and reducing intrasubject variability.
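The abstract above mentions an interquartile range filter for rejecting artifactual breaths. A minimal sketch of that idea, assuming Tukey's standard k = 1.5 fence (the paper's exact threshold and breath features are not given here, so both the data and the constant are illustrative):

```python
import numpy as np

def iqr_outlier_mask(values, k=1.5):
    """Flag values outside [Q1 - k*IQR, Q3 + k*IQR]. k=1.5 is the usual
    Tukey fence; the published method's threshold may differ."""
    q1, q3 = np.percentile(values, [25, 75])
    iqr = q3 - q1
    return (values < q1 - k * iqr) | (values > q3 + k * iqr)

# Hypothetical per-breath resistance values; one artifactual breath at 9.5.
resistance = np.array([3.1, 3.0, 3.2, 2.9, 9.5, 3.1])
mask = iqr_outlier_mask(resistance)
clean = resistance[~mask]  # breaths retained for reporting
```

In the published method the filter operates on extracted per-breath features rather than raw resistance values, but the rejection rule has this shape.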
Publisher: IEEE
Date: 2017
Publisher: Institution of Engineering and Technology (IET)
Date: 2007
Publisher: Springer Science and Business Media LLC
Date: 15-09-2017
Publisher: Institute of Electrical and Electronics Engineers (IEEE)
Date: 2023
Publisher: Elsevier BV
Date: 10-2021
Publisher: Optica Publishing Group
Date: 16-07-2018
DOI: 10.1364/OL.43.003469
Publisher: Association for Computing Machinery (ACM)
Date: 03-2008
Abstract: We present an architecture for a synthesizable datapath-oriented FPGA core that can be used to provide post-fabrication flexibility to an SoC. Our architecture is optimized for bus-based operations and employs a directional routing architecture, which allows it to be synthesized using standard ASIC design tools and flows. The primary motivation for this architecture is to provide an efficient mechanism to support on-chip debugging. The fabric can also be used to implement other datapath-oriented circuits such as those needed in signal processing and computation-intensive applications. We evaluate our architecture using a set of benchmark circuits and compare it to previous fabrics in terms of area, speed, and power.
Publisher: IEEE
Date: 12-2019
Publisher: IEEE
Date: 09-2012
Publisher: Institute of Electrical and Electronics Engineers (IEEE)
Date: 03-2017
Publisher: IEEE
Date: 12-2015
Publisher: IEEE
Date: 12-2011
Publisher: IEEE
Date: 2006
Publisher: MDPI AG
Date: 26-02-2018
DOI: 10.3390/S18030693
Publisher: IEEE
Date: 05-2018
Publisher: IEEE
Date: 2005
Publisher: IEEE
Date: 08-2014
Publisher: Elsevier BV
Date: 10-2019
Publisher: IEEE
Date: 12-2017
Publisher: American Physical Society (APS)
Date: 19-12-2016
Publisher: Association for Computing Machinery (ACM)
Date: 22-12-2023
DOI: 10.1145/3546181
Abstract: The spectral correlation density (SCD) function is the time-averaged correlation of two spectral components used for analyzing periodic signals with time-varying spectral content. Although the analysis is extremely powerful, it has not been widely adopted in real-time applications due to its high computational complexity. In this article, we present an efficient FPGA implementation of the FFT accumulation method (FAM) for estimating the SCD function and its alpha profile. The implementation uses a linear systolic array with a bi-directional datapath consisting of DSP-based processing elements (PEs) with a dedicated instruction schedule, achieving a PE utilization of 88.2%. The 128-PE implementation achieves a clock frequency in excess of 530 MHz and consumes 151K LUTs, 151K FFs, 264 BRAMs, 4 URAMs, and 1,054 DSPs, which is less than 36% of the logic fabric on a Zynq UltraScale+ XCZU28DR-2FFVG1517E RFSoC device. It has a modest 12.5W power consumption and an energy efficiency of 4,832 MOPS/W, which is 20.6× better than the published state-of-the-art GPU implementation. In terms of throughput, it achieves 15,340 windows/s (15,340 windows/s × 2,048 samples/window = 31.4 MS/s), which is a 4.65× improvement compared to the above-mentioned GPU implementation and 807× compared to an existing hybrid FPGA-GPU implementation.
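The SCD estimate described above can be illustrated in a few lines. This sketch is a naive time-smoothed cyclic periodogram, not the paper's FAM pipeline or its systolic-array schedule: it averages products of spectral components separated by a cyclic frequency α over sliding FFT windows, with window/hop sizes chosen arbitrarily for the example.

```python
import numpy as np

def scd_estimate(x, nwin=64, nhop=32):
    """Naive SCD estimate: for each cyclic-frequency bin a, average
    X_t[f] * conj(X_t[f+a]) over short-time FFT windows t. A didactic
    stand-in for the far more efficient FFT accumulation method (FAM)."""
    windows = [x[i:i + nwin] for i in range(0, len(x) - nwin + 1, nhop)]
    X = np.fft.fft(np.array(windows) * np.hanning(nwin), axis=1)
    scd = np.zeros((nwin, nwin), dtype=complex)
    for a in range(nwin):
        scd[a] = np.mean(X * np.conj(np.roll(X, -a, axis=1)), axis=0)
    return scd

# Noisy tone: the a=0 row reduces to an ordinary averaged periodogram.
rng = np.random.default_rng(0)
x = np.cos(2 * np.pi * 0.1 * np.arange(1024)) + 0.1 * rng.standard_normal(1024)
S = scd_estimate(x)
alpha_profile = np.abs(S).max(axis=1)  # peak magnitude per cyclic frequency
```

The "alpha profile" mentioned in the abstract is the maximum SCD magnitude over spectral frequency for each cyclic frequency, as in the last line.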
Publisher: IEEE
Date: 2006
Publisher: IEEE
Date: 03-2014
Publisher: IEEE
Date: 2005
Publisher: Springer Berlin Heidelberg
Date: 2002
Publisher: IEEE
Date: 2006
Publisher: IEEE
Date: 12-2006
Publisher: IEEE
Date: 2008
Publisher: IEEE
Date: 12-2015
Publisher: Springer International Publishing
Date: 2018
Publisher: IEEE
Date: 08-2006
Publisher: Springer International Publishing
Date: 2018
Publisher: IEEE
Date: 05-2009
Publisher: IEEE
Date: 12-2009
Publisher: IEEE
Date: 12-2008
Publisher: IEEE
Date: 12-2018
Publisher: IEEE
Date: 12-2010
Publisher: IEEE
Date: 2006
Publisher: Elsevier BV
Date: 04-2002
Publisher: Institute of Electrical and Electronics Engineers (IEEE)
Date: 09-2022
Publisher: Institute of Electronics, Information and Communications Engineers (IEICE)
Date: 05-2019
Publisher: Association for Computing Machinery (ACM)
Date: 02-11-2007
Abstract: Rapid generation of high quality Gaussian random numbers is a key capability for simulations across a wide range of disciplines. Advances in computing have brought the power to conduct simulations with very large numbers of random numbers and with it, the challenge of meeting increasingly stringent requirements on the quality of Gaussian random number generators (GRNG). This article describes the algorithms underlying various GRNGs, compares their computational requirements, and examines the quality of the random numbers with emphasis on the behaviour in the tail region of the Gaussian probability density function.
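As a concrete instance of the GRNG algorithms the survey above compares, here is the Box-Muller transform, one of the classical methods in that family. The comment on tail behaviour reflects the survey's emphasis; the sampling loop is an illustrative harness, not code from the article.

```python
import math
import random

def box_muller(u1, u2):
    """Map two independent uniforms on (0,1] to two independent standard
    Gaussian samples. Tail accuracy is bounded by the resolution of u1
    near zero (sqrt(-2*ln(u1)) caps the largest reachable |z|), which is
    exactly the tail-quality concern the survey examines."""
    r = math.sqrt(-2.0 * math.log(u1))
    return r * math.cos(2 * math.pi * u2), r * math.sin(2 * math.pi * u2)

rng = random.Random(42)
samples = []
for _ in range(10000):
    # 1.0 - random() keeps u1 in (0, 1], avoiding log(0).
    z0, z1 = box_muller(1.0 - rng.random(), rng.random())
    samples.extend([z0, z1])
```

Hardware generators typically prefer methods such as Wallace or piecewise-polynomial inversion over Box-Muller's transcendental functions, a trade-off the survey quantifies.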
Publisher: IEEE
Date: 06-2006
Publisher: IEEE
Date: 12-2009
Publisher: IEEE
Date: 2003
Publisher: IEEE
Date: 2003
Publisher: Institute of Electrical and Electronics Engineers (IEEE)
Date: 07-2012
Publisher: IEEE
Date: 08-2007
Publisher: Informa UK Limited
Date: 17-05-2016
Publisher: IEEE
Date: 12-2007
Publisher: IEEE
Date: 12-2007
Publisher: IEEE
Date: 2003
Start Date: 12-2014
End Date: 06-2018
Amount: $200,000.00
Funder: Australian Research Council
Start Date: 01-2012
End Date: 07-2016
Amount: $375,000.00
Funder: Australian Research Council
Start Date: 2023
End Date: 12-2023
Amount: $1,465,519.00
Funder: Australian Research Council