ORCID Profile
0000-0002-2706-5985
Current Organisation
The University of Canberra
In Research Link Australia (RLA), "Research Topics" refer to ANZSRC FOR and SEO codes. These topics are either sourced from ANZSRC FOR and SEO codes listed in researchers' related grants or generated by a large language model (LLM) based on their publications.
Information Systems | Pattern Recognition and Data Mining | Computer-Human Interaction
Expanding Knowledge in the Information and Computing Sciences | Mental Health
Publisher: IEEE
Date: 06-2012
DOI: 10.1109/HSI.2012.16
Publisher: IEEE
Date: 10-2020
Publisher: IEEE
Date: 06-2019
Publisher: IEEE
Date: 10-2021
Publisher: Springer International Publishing
Date: 2021
Publisher: IEEE
Date: 06-2020
Publisher: Association for Computing Machinery (ACM)
Date: 31-01-2022
DOI: 10.1145/3505244
Abstract: Astounding results from Transformer models on natural language tasks have intrigued the vision community to study their application to computer vision problems. Among their salient benefits, Transformers enable modeling long dependencies between input sequence elements and support parallel processing of sequences, in contrast to recurrent networks such as long short-term memory (LSTM). Different from convolutional networks, Transformers require minimal inductive biases for their design and are naturally suited as set-functions. Furthermore, the straightforward design of Transformers allows processing multiple modalities (e.g., images, videos, text, and speech) using similar processing blocks, and demonstrates excellent scalability to very large capacity networks and huge datasets. These strengths have led to exciting progress on a number of vision tasks using Transformer networks. This survey aims to provide a comprehensive overview of Transformer models in the computer vision discipline. We start with an introduction to the fundamental concepts behind the success of Transformers, i.e., self-attention, large-scale pre-training, and bidirectional feature encoding. We then cover extensive applications of Transformers in vision, including popular recognition tasks (e.g., image classification, object detection, action recognition, and segmentation), generative modeling, multi-modal tasks (e.g., visual question answering, visual reasoning, and visual grounding), video processing (e.g., activity recognition, video forecasting), low-level vision (e.g., image super-resolution, image enhancement, and colorization), and three-dimensional analysis (e.g., point cloud classification and segmentation). We compare the respective advantages and limitations of popular techniques both in terms of architectural design and their experimental value. Finally, we provide an analysis of open research directions and possible future work. We hope this effort will ignite further interest in the community to solve current challenges toward the application of Transformer models in computer vision.
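The self-attention mechanism the abstract identifies as central to Transformers can be sketched minimally as scaled dot-product attention. This is an illustrative NumPy sketch, not code from the surveyed work; the weight matrices and dimensions are placeholder assumptions.

```python
import numpy as np

def softmax(x, axis=-1):
    # Numerically stable softmax: subtract the row max before exponentiating.
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def self_attention(X, Wq, Wk, Wv):
    """Scaled dot-product self-attention over a sequence X of shape (n, d).

    Every token attends to every other token, which is what lets
    Transformers model long-range dependencies in a single layer.
    """
    Q, K, V = X @ Wq, X @ Wk, X @ Wv
    d_k = Q.shape[-1]
    scores = (Q @ K.T) / np.sqrt(d_k)   # (n, n) pairwise token affinities
    A = softmax(scores, axis=-1)        # rows are attention distributions
    return A @ V                        # (n, d_k) context-mixed outputs

# Toy usage with random weights (hypothetical shapes for illustration).
rng = np.random.default_rng(0)
n, d = 4, 8
X = rng.standard_normal((n, d))
Wq, Wk, Wv = (rng.standard_normal((d, d)) for _ in range(3))
out = self_attention(X, Wq, Wk, Wv)    # shape (4, 8)
```

Because the attention matrix is computed for all token pairs at once, the whole sequence is processed in parallel, unlike the step-by-step recurrence of an LSTM.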
Publisher: ACM
Date: 03-11-2017
Publisher: ACM
Date: 23-10-2017
Publisher: IEEE
Date: 10-2017
Publisher: Institute of Electrical and Electronics Engineers (IEEE)
Date: 10-2016
Publisher: IEEE
Date: 10-2019
Publisher: ACM
Date: 09-10-2023
Publisher: Elsevier BV
Date: 04-2016
Publisher: Institute of Electrical and Electronics Engineers (IEEE)
Date: 09-2020
Publisher: IEEE
Date: 06-2014
Publisher: Institute of Electrical and Electronics Engineers (IEEE)
Date: 04-2015
Publisher: ACM
Date: 26-10-2023
Publisher: Institute of Electrical and Electronics Engineers (IEEE)
Date: 2020
Publisher: Institute of Electrical and Electronics Engineers (IEEE)
Date: 07-2016
Publisher: IEEE
Date: 03-2018
Publisher: Elsevier BV
Date: 2023
Publisher: Springer International Publishing
Date: 2014
Publisher: IEEE
Date: 2013
Publisher: Institute of Electrical and Electronics Engineers (IEEE)
Date: 09-2021
Publisher: Springer International Publishing
Date: 2020
Publisher: IEEE
Date: 10-2017
Publisher: Springer International Publishing
Date: 2020
Publisher: Elsevier BV
Date: 2016
Publisher: IEEE
Date: 10-2019
Publisher: Elsevier BV
Date: 02-2014
Publisher: Institute of Electrical and Electronics Engineers (IEEE)
Date: 07-2014
Publisher: IEEE
Date: 05-2017
DOI: 10.1109/FG.2017.94
Publisher: IEEE
Date: 06-2021
Publisher: Institute of Electrical and Electronics Engineers (IEEE)
Date: 10-2018
Publisher: Elsevier BV
Date: 02-2019
DOI: 10.1016/J.NEUNET.2018.09.009
Abstract: The big breakthrough on the ImageNet challenge in 2012 was partially due to the 'Dropout' technique used to avoid overfitting. Here, we introduce a new approach called 'Spectral Dropout' to improve the generalization ability of deep neural networks. We cast the proposed approach in the form of regular Convolutional Neural Network (CNN) weight layers using a decorrelation transform with fixed basis functions. Our spectral dropout method prevents overfitting by eliminating weak and 'noisy' Fourier domain coefficients of the neural network activations, leading to remarkably better results than the current regularization methods. Furthermore, the proposed approach is very efficient due to the fixed basis functions used for spectral transformation. In particular, compared to Dropout and Drop-Connect, our method significantly speeds up the network convergence rate during the training process (roughly ×2), with considerably higher neuron pruning rates (an increase of ∼30%). We demonstrate that spectral dropout can also be used in conjunction with other regularization approaches, resulting in additional performance gains.
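The core idea the abstract describes, eliminating weak frequency-domain coefficients of activations, can be sketched roughly as follows. This is a simplified illustration using a plain FFT and a magnitude threshold; the paper itself embeds a fixed decorrelation transform inside CNN weight layers, so the `keep_ratio` parameter and FFT choice here are assumptions for clarity, not the authors' exact formulation.

```python
import numpy as np

def spectral_dropout_sketch(activations, keep_ratio=0.7):
    """Illustrative spectral dropout on a batch of activation vectors.

    1. Transform activations to the frequency domain.
    2. Zero out the weakest (lowest-magnitude) coefficients.
    3. Transform back, yielding a denoised activation.
    """
    coeffs = np.fft.fft(activations, axis=-1)
    mags = np.abs(coeffs)
    # Threshold so that only the strongest `keep_ratio` fraction survives.
    thresh = np.quantile(mags, 1.0 - keep_ratio, axis=-1, keepdims=True)
    pruned = np.where(mags >= thresh, coeffs, 0.0)
    return np.real(np.fft.ifft(pruned, axis=-1))

# Toy usage: prune half the spectral coefficients of random activations.
rng = np.random.default_rng(1)
x = rng.standard_normal((2, 16))
y = spectral_dropout_sketch(x, keep_ratio=0.5)   # same shape as x
```

By Parseval's theorem, zeroing coefficients can only remove energy, so the output is a smoothed version of the input; unlike standard Dropout, which zeroes random units, the coefficients dropped here are selected by magnitude rather than at random.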
Publisher: Institute of Electrical and Electronics Engineers (IEEE)
Date: 2022
Publisher: IEEE
Date: 07-2017
Publisher: IEEE
Date: 10-2021
Publisher: IEEE
Date: 12-2015
Publisher: Institute of Electrical and Electronics Engineers (IEEE)
Date: 08-2018
Publisher: Springer Science and Business Media LLC
Date: 28-02-2017
Publisher: IEEE
Date: 06-2021
Publisher: IEEE
Date: 12-2012
Start Date: 10-2020
End Date: 10-2024
Amount: $425,613.00
Funder: Australian Research Council
Start Date: 10-2019
End Date: 12-2024
Amount: $380,000.00
Funder: Australian Research Council