ORCID Profile
0000-0002-0512-880X
Current Organisation
Zhejiang University
In Research Link Australia (RLA), "Research Topics" refer to ANZSRC FOR and SEO codes. These topics are either sourced from ANZSRC FOR and SEO codes listed in researchers' related grants or generated by a large language model (LLM) based on their publications.
Pattern Recognition and Data Mining | Artificial Intelligence and Image Processing | Database Management | Computer Vision | Artificial Intelligence and Image Processing not elsewhere classified | Information Systems | Neural, Evolutionary and Fuzzy Computation | Multimedia Programming
Information Processing Services (incl. Data Entry and Capture) | Electronic Information Storage and Retrieval Services | Film and Video Services (excl. Animation and Computer Generated Imagery) | Media Services not elsewhere classified | Application Tools and System Utilities | Expanding Knowledge in the Information and Computing Sciences | Health Policy Evaluation
Publisher: Institute of Electrical and Electronics Engineers (IEEE)
Date: 12-2010
Publisher: Institute of Electrical and Electronics Engineers (IEEE)
Date: 08-2012
Publisher: Institute of Electrical and Electronics Engineers (IEEE)
Date: 12-2019
Publisher: Institute of Electrical and Electronics Engineers (IEEE)
Date: 08-2017
Publisher: Association for Computing Machinery (ACM)
Date: 05-07-2020
DOI: 10.1145/3390891
Abstract: In Visual Dialog, an agent has to parse temporal context in the dialog history and spatial context in the image to hold a meaningful dialog with humans. For example, to answer “what is the man on her left wearing?” the agent needs to (1) analyze the temporal context in the dialog history to infer who is being referred to as “her,” (2) parse the image to attend to “her,” and (3) uncover the spatial context to shift the attention to “her left” and check the apparel of the man. In this article, we use a dialog network to memorize the temporal context and an attention processor to parse the spatial context. Since the question and the image are usually very complex, which makes it difficult to ground the question with a single glimpse, the attention processor attends to the image multiple times to better collect visual information. In the Visual Dialog task, the generative decoder (G) is trained under the word-by-word paradigm, which suffers from the lack of sentence-level training. To ameliorate the problem, we propose to reinforce G at the sentence level using the discriminative model (D), which aims to select the right answer from a few candidates. Experimental results on the VisDial dataset demonstrate the effectiveness of our approach.
Publisher: ACM
Date: 04-2014
Publisher: Institute of Electrical and Electronics Engineers (IEEE)
Date: 2022
Publisher: Springer Science and Business Media LLC
Date: 15-06-2018
Publisher: Springer Science and Business Media LLC
Date: 03-02-2007
Publisher: ACM
Date: 26-10-2008
Publisher: Institute of Electrical and Electronics Engineers (IEEE)
Date: 2021
Publisher: Institute of Electrical and Electronics Engineers (IEEE)
Date: 06-2023
Publisher: Institute of Electrical and Electronics Engineers (IEEE)
Date: 02-2014
Publisher: Institute of Electrical and Electronics Engineers (IEEE)
Date: 2023
Publisher: Institute of Electrical and Electronics Engineers (IEEE)
Date: 2021
Publisher: Institute of Electrical and Electronics Engineers (IEEE)
Date: 2023
Publisher: IEEE
Date: 05-2012
Publisher: Institute of Electrical and Electronics Engineers (IEEE)
Date: 02-2008
Publisher: IEEE
Date: 06-2015
Publisher: Institute of Electrical and Electronics Engineers (IEEE)
Date: 2016
Publisher: IEEE
Date: 07-2017
Publisher: Elsevier BV
Date: 02-2020
Publisher: Institute of Electrical and Electronics Engineers (IEEE)
Date: 12-2018
Publisher: IEEE
Date: 06-2013
Publisher: Institute of Electrical and Electronics Engineers (IEEE)
Date: 02-2021
Publisher: Springer Science and Business Media LLC
Date: 20-12-2022
Publisher: Institute of Electrical and Electronics Engineers (IEEE)
Date: 09-2019
Publisher: IEEE
Date: 07-2017
Publisher: ACM
Date: 28-11-2011
Publisher: Institute of Electrical and Electronics Engineers (IEEE)
Date: 10-2023
Publisher: Elsevier BV
Date: 05-2013
Publisher: IEEE
Date: 06-2014
Publisher: Institute of Electrical and Electronics Engineers (IEEE)
Date: 2022
Publisher: Elsevier BV
Date: 03-2011
Publisher: Institute of Electrical and Electronics Engineers (IEEE)
Date: 12-2018
Publisher: Institute of Electrical and Electronics Engineers (IEEE)
Date: 04-2018
Publisher: Institute of Electrical and Electronics Engineers (IEEE)
Date: 2019
Publisher: Institute of Electrical and Electronics Engineers (IEEE)
Date: 03-2020
Publisher: Institute of Electrical and Electronics Engineers (IEEE)
Date: 02-2014
Publisher: Institute of Electrical and Electronics Engineers (IEEE)
Date: 12-2013
Publisher: Institute of Electrical and Electronics Engineers (IEEE)
Date: 12-2022
Publisher: Institute of Electrical and Electronics Engineers (IEEE)
Date: 10-2016
Publisher: Institute of Electrical and Electronics Engineers (IEEE)
Date: 09-2018
Publisher: ACM
Date: 19-10-2009
Publisher: IEEE
Date: 07-2017
Publisher: Elsevier BV
Date: 06-2010
Publisher: Elsevier BV
Date: 2016
Publisher: ACM
Date: 28-11-2011
Publisher: Institute of Electrical and Electronics Engineers (IEEE)
Date: 12-2013
Publisher: Institute of Electrical and Electronics Engineers (IEEE)
Date: 12-2016
Publisher: IEEE
Date: 06-2013
Publisher: Institute of Electrical and Electronics Engineers (IEEE)
Date: 09-2014
Publisher: Association for Computing Machinery (ACM)
Date: 29-10-2018
DOI: 10.1145/3230709
Abstract: Learning from very few samples is a challenge for machine learning tasks, such as text and image classification. Performance on such tasks can be enhanced via transfer of helpful knowledge from related domains, which is referred to as transfer learning. In previous transfer learning works, instance transfer learning algorithms mostly focus on selecting source domain instances similar to the target domain instances for transfer. However, the selected instances usually do not directly contribute to the learning performance in the target domain. Hypothesis transfer learning algorithms focus on model-parameter-level transfer. They treat the source hypotheses as well trained and transfer their knowledge in terms of parameters to learn the target hypothesis. Such algorithms directly optimize the target hypothesis by the observable performance improvements. However, they fail to consider that instances contributing to the source hypotheses may be harmful for the target hypothesis, as analyzed in instance transfer learning. To relieve the aforementioned problems, we propose a novel transfer learning algorithm that follows an analogical strategy. In particular, the proposed algorithm first learns a revised source hypothesis with only the instances contributing to the target hypothesis. Then, it transfers both the revised source hypothesis and the target hypothesis (trained with only a few samples) to learn an analogical hypothesis. We denote our algorithm Analogical Transfer Learning. Extensive experiments on one synthetic dataset and three real-world benchmark datasets demonstrate the superior performance of the proposed algorithm.
Publisher: Association for Computing Machinery (ACM)
Date: 13-12-2017
DOI: 10.1145/3152116
Abstract: Cloud-assisted video streaming has emerged as a new paradigm to optimize multimedia content distribution over the Internet. This article investigates the problem of streaming cloud-assisted real-time video to multiple destinations (e.g., cloud video conferencing, multi-player cloud gaming, etc.) over lossy communication networks. User diversity and network dynamics result in delay differences among multiple destinations. This research proposes the Differentiated cloud-Assisted VIdeo Streaming (DAVIS) framework, which proactively leverages such delay differences in video coding and transmission optimization. First, we analytically formulate the optimization problem of joint coding and transmission to maximize received video quality. Second, we develop a quality optimization framework that integrates video representation selection and FEC (Forward Error Correction) packet interleaving. DAVIS is able to effectively perform differentiated quality optimization for multiple destinations by taking advantage of the delay differences in the cloud-assisted video streaming system. We conduct the performance evaluation through extensive experiments with Amazon EC2 instances and the Exata emulation platform. Evaluation results show that DAVIS outperforms the reference cloud-assisted streaming solutions in video quality and delay performance.
Publisher: Institute of Electrical and Electronics Engineers (IEEE)
Date: 2023
Publisher: ACM
Date: 28-11-2011
Publisher: Institute of Electrical and Electronics Engineers (IEEE)
Date: 06-2014
Publisher: Springer International Publishing
Date: 2018
Publisher: MIT Press - Journals
Date: 04-2017
DOI: 10.1162/NECO_A_00937
Abstract: Robust principal component analysis (PCA) is one of the most important dimension-reduction techniques for handling high-dimensional data with outliers. However, most of the existing robust PCA presupposes that the mean of the data is zero and incorrectly utilizes the average of data as the optimal mean of robust PCA. In fact, this assumption holds only for the squared [Formula: see text]-norm-based traditional PCA. In this letter, we equivalently reformulate the objective of conventional PCA and learn the optimal projection directions by maximizing the sum of projected difference between each pair of instances based on [Formula: see text]-norm. The proposed method is robust to outliers and also invariant to rotation. More important, the reformulated objective not only automatically avoids the calculation of optimal mean and makes the assumption of centered data unnecessary, but also theoretically connects to the minimization of reconstruction error. To solve the proposed nonsmooth problem, we exploit an efficient optimization algorithm to soften the contributions from outliers by reweighting each data point iteratively. We theoretically analyze the convergence and computational complexity of the proposed algorithm. Extensive experimental results on several benchmark data sets illustrate the effectiveness and superiority of the proposed method.
Publisher: Institute of Electrical and Electronics Engineers (IEEE)
Date: 2021
Publisher: IEEE
Date: 04-2013
Publisher: Institute of Electrical and Electronics Engineers (IEEE)
Date: 03-2021
Publisher: Institute of Electrical and Electronics Engineers (IEEE)
Date: 10-2010
Publisher: ACM
Date: 26-10-2023
Publisher: Springer International Publishing
Date: 2015
Publisher: Institute of Electrical and Electronics Engineers (IEEE)
Date: 05-2017
Publisher: ACM
Date: 19-10-2009
Publisher: Institute of Electrical and Electronics Engineers (IEEE)
Date: 10-2019
Publisher: Institute of Electrical and Electronics Engineers (IEEE)
Date: 02-2022
Publisher: Institute of Electrical and Electronics Engineers (IEEE)
Date: 04-2013
Publisher: Institute of Electrical and Electronics Engineers (IEEE)
Date: 2022
Publisher: Association for Computing Machinery (ACM)
Date: 25-02-2023
DOI: 10.1145/3569584
Abstract: It is crucial to sample a small portion of relevant frames for efficient video classification. The existing methods mainly develop hand-designed sampling strategies or learn sequential selection policies. However, there are two challenges to be solved. First, hand-designed sampling strategies are intrinsically non-adaptive to different video backbones. Second, sequential frame selection policies ignore temporal relations among all video frames. The sequential selection process also hinders the application of these video samplers in speed-critical systems. In this article, we propose a differentiable parallel video sampling network (PSN) to tackle the aforementioned challenges. First, we optimize the video sampler with a differentiable surrogate loss, allowing the sampler to be learned dynamically in cooperation with the video classification model. Our sampler considers the feedback from all frames jointly, eliminating the learning difficulties of sequential decision making. The learning process is fully gradient-based, making the sampler efficient to learn. Our video sampler can assess a set of frames swiftly and determine the importance of each frame in parallel. Second, we propose to model the inter-relation among contextual frames, which encourages the sampler to select frames based on a comprehensive inspection of the entire video. We observe that a simple context relation mining instantiation significantly improves the classification performance. The experimental results on three standard video recognition benchmarks demonstrate the efficacy and efficiency of our framework.
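The parallel sampling idea in the abstract above can be sketched in a few lines: score every frame in one matrix product instead of running a sequential policy, then keep the top-k. The scorer `w`, the feature shapes, and the hard top-k at inference time are illustrative assumptions, not the paper's exact PSN; during training the paper replaces the hard selection with a differentiable surrogate loss so gradients reach the sampler.

```python
import numpy as np

def softmax(x):
    # numerically stable softmax over a 1-D score vector
    z = x - x.max()
    e = np.exp(z)
    return e / e.sum()

def parallel_sample(frame_feats, w, k):
    """Score all frames jointly (one pass, no sequential decisions) and
    return the indices of the top-k frames plus the soft importance weights.
    `w` stands in for a learned scorer; names are illustrative."""
    scores = frame_feats @ w              # one score per frame, computed in parallel
    probs = softmax(scores)               # soft importance over all frames
    keep = np.argsort(-scores)[:k]        # hard top-k selection at inference time
    return np.sort(keep), probs
```

The key property the abstract highlights is that all frames are assessed at once, so the sampler's cost is a single matrix product rather than a per-frame decision loop.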
Publisher: Institute of Electrical and Electronics Engineers (IEEE)
Date: 07-2019
Publisher: IEEE
Date: 06-2015
Publisher: Institute of Electrical and Electronics Engineers (IEEE)
Date: 11-2020
Publisher: IEEE
Date: 10-2017
Publisher: Institute of Electrical and Electronics Engineers (IEEE)
Date: 02-2023
Publisher: Institute of Electrical and Electronics Engineers (IEEE)
Date: 12-2013
Publisher: Elsevier BV
Date: 07-2014
Publisher: Institute of Electrical and Electronics Engineers (IEEE)
Date: 12-2016
Publisher: Institute of Electrical and Electronics Engineers (IEEE)
Date: 10-2020
Publisher: Institute of Electrical and Electronics Engineers (IEEE)
Date: 2020
Publisher: Institute of Electrical and Electronics Engineers (IEEE)
Date: 07-2018
Publisher: Elsevier BV
Date: 03-2021
Publisher: IEEE
Date: 12-2013
Publisher: ACM
Date: 28-11-2011
Publisher: Institute of Electrical and Electronics Engineers (IEEE)
Date: 2021
Publisher: Institute of Electrical and Electronics Engineers (IEEE)
Date: 02-2015
Publisher: Institute of Electrical and Electronics Engineers (IEEE)
Date: 02-2020
Publisher: IEEE
Date: 12-2013
Publisher: Springer Science and Business Media LLC
Date: 09-07-2013
Publisher: Institute of Electrical and Electronics Engineers (IEEE)
Date: 2022
Publisher: IEEE
Date: 05-2017
Publisher: Inderscience Publishers
Date: 2010
Publisher: Springer Science and Business Media LLC
Date: 13-07-2017
Publisher: Elsevier BV
Date: 08-2013
Publisher: Institute of Electrical and Electronics Engineers (IEEE)
Date: 2023
Publisher: Institute of Electrical and Electronics Engineers (IEEE)
Date: 2022
Publisher: Elsevier BV
Date: 2013
Publisher: Institute of Electrical and Electronics Engineers (IEEE)
Date: 10-2019
Publisher: Springer Science and Business Media LLC
Date: 2015
Publisher: Association for Computing Machinery (ACM)
Date: 20-07-2016
DOI: 10.1145/2910585
Abstract: Principal component analysis (PCA) has been widely applied to dimensionality reduction and data pre-processing for different applications in engineering, biology, social science, and the like. Classical PCA and its variants seek linear projections of the original variables to obtain low-dimensional feature representations with maximal variance. One limitation is that the results of PCA are difficult to interpret. Besides, classical PCA is vulnerable to certain noisy data. In this paper, we propose a Convex Sparse Principal Component Analysis (CSPCA) algorithm and apply it to feature learning. First, we show that PCA can be formulated as a low-rank regression optimization problem. Based on this discussion, ℓ2,1-norm minimization is incorporated into the objective function to make the regression coefficients sparse and thereby robust to outliers. Also, based on the sparse model used in CSPCA, an optimal weight is assigned to each of the original features, which in turn provides the output with good interpretability. With the output of our CSPCA, we can effectively analyze the importance of each feature under the PCA criteria. Our new objective function is convex, and we propose an iterative algorithm to optimize it. We apply the CSPCA algorithm to feature selection and conduct extensive experiments on seven benchmark datasets. Experimental results demonstrate that the proposed algorithm outperforms state-of-the-art unsupervised feature selection algorithms.
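The low-rank-regression view of PCA with an ℓ2,1 penalty, as described in the abstract above, can be sketched as below. The solver (iteratively reweighted least squares, a standard device for ℓ2,1 minimization), the regularization strength, and all names are illustrative assumptions, not the authors' exact CSPCA algorithm; the point is that row norms of the sparse coefficient matrix yield per-feature importance scores.

```python
import numpy as np

def l21_row_norms(W):
    # per-row l2 norms: each row's contribution to the l2,1 norm,
    # interpreted as the importance of one original feature
    return np.sqrt((W ** 2).sum(axis=1))

def cspca_feature_scores(X, k=2, lam=0.1, n_iter=50):
    """Toy sketch of the CSPCA idea: regress the data onto its top-k
    principal subspace with an l2,1 penalty on the coefficient matrix W,
    then rank features by the row norms of W."""
    X = X - X.mean(axis=0)
    # top-k principal directions define the regression target
    _, _, Vt = np.linalg.svd(X, full_matrices=False)
    Y = X @ Vt[:k].T                       # projected data, shape (n, k)
    d = X.shape[1]
    D = np.eye(d)                          # reweighting matrix for the l2,1 term
    for _ in range(n_iter):
        W = np.linalg.solve(X.T @ X + lam * D, X.T @ Y)
        rn = np.maximum(l21_row_norms(W), 1e-8)
        D = np.diag(1.0 / (2.0 * rn))      # standard IRLS update for l2,1
    return l21_row_norms(W)
```

Features with larger scores matter more under the PCA criterion; a near-zero row means the penalty has effectively pruned that feature.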
Publisher: Springer Science and Business Media LLC
Date: 19-03-2022
Publisher: Institute of Electrical and Electronics Engineers (IEEE)
Date: 07-2022
Publisher: Institute of Electrical and Electronics Engineers (IEEE)
Date: 05-2017
Publisher: Institute of Electrical and Electronics Engineers (IEEE)
Date: 2022
Publisher: Institute of Electrical and Electronics Engineers (IEEE)
Date: 06-2022
Publisher: Institute of Electrical and Electronics Engineers (IEEE)
Date: 07-2017
Publisher: Institute of Electrical and Electronics Engineers (IEEE)
Date: 2022
Publisher: Institute of Electrical and Electronics Engineers (IEEE)
Date: 2021
Publisher: Association for Computing Machinery (ACM)
Date: 13-12-2017
DOI: 10.1145/3159171
Abstract: In this article, we revisit two popular convolutional neural networks in person re-identification (re-ID): verification and identification models. The two models have their respective advantages and limitations due to different loss functions. Here, we shed light on how to combine the two models to learn more discriminative pedestrian descriptors. Specifically, we propose a Siamese network that simultaneously computes the identification loss and verification loss. Given a pair of training images, the network predicts the identities of the two input images and whether they belong to the same identity. Our network learns a discriminative embedding and a similarity measurement at the same time, thus making full use of the re-ID annotations. Our method can be easily applied to different pretrained networks. Albeit simple, the learned embedding improves the state-of-the-art performance on two public person re-ID benchmarks. Further, we show that our architecture can also be applied to image retrieval. The code is available at ayumi/2016_person_re-ID.
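The combined objective described above (two identification losses on the individual images plus one verification loss on the pair) can be sketched numerically; the logit shapes, the unit weighting of the three terms, and the function names are illustrative assumptions, not the paper's network:

```python
import numpy as np

def cross_entropy(logits, target):
    # softmax cross-entropy for a single example (numerically stable)
    z = logits - logits.max()
    return -(z[target] - np.log(np.exp(z).sum()))

def siamese_re_id_loss(id_logits_a, id_logits_b, verif_logits, id_a, id_b):
    """Sketch of the combined Siamese objective: classify each image's
    identity (identification loss) and classify the pair as same/different
    (verification loss). The verification target is derived from the two
    identity labels."""
    same = 1 if id_a == id_b else 0
    return (cross_entropy(id_logits_a, id_a)
            + cross_entropy(id_logits_b, id_b)
            + cross_entropy(verif_logits, same))
```

Training against both terms at once is what lets the embedding and the similarity measure share supervision from the same annotations.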
Publisher: Institute of Electrical and Electronics Engineers (IEEE)
Date: 07-2016
Publisher: Elsevier BV
Date: 10-2012
Publisher: Springer Berlin Heidelberg
Date: 2006
Publisher: ACM
Date: 19-10-2017
Publisher: Institute of Electrical and Electronics Engineers (IEEE)
Date: 09-2022
Publisher: ACM
Date: 21-10-2013
Publisher: Elsevier BV
Date: 08-2010
Publisher: Institute of Electrical and Electronics Engineers (IEEE)
Date: 03-2019
Publisher: Springer Nature Switzerland
Date: 2022
Publisher: Institute of Electrical and Electronics Engineers (IEEE)
Date: 2020
Publisher: Institute of Electrical and Electronics Engineers (IEEE)
Date: 12-2015
Publisher: Springer International Publishing
Date: 2018
Publisher: Institute of Electrical and Electronics Engineers (IEEE)
Date: 11-2022
Publisher: Institute of Electrical and Electronics Engineers (IEEE)
Date: 2023
Publisher: ACM
Date: 22-06-2015
Publisher: Institute of Electrical and Electronics Engineers (IEEE)
Date: 2021
Publisher: Institute of Electrical and Electronics Engineers (IEEE)
Date: 09-2022
Publisher: Institute of Electrical and Electronics Engineers (IEEE)
Date: 08-2014
Publisher: Springer Science and Business Media LLC
Date: 23-06-2015
Publisher: Institute of Electrical and Electronics Engineers (IEEE)
Date: 06-2019
Publisher: Institute of Electrical and Electronics Engineers (IEEE)
Date: 04-2022
Publisher: Institute of Electrical and Electronics Engineers (IEEE)
Date: 11-2011
Publisher: Institute of Electrical and Electronics Engineers (IEEE)
Date: 08-2022
Publisher: Institute of Electrical and Electronics Engineers (IEEE)
Date: 06-2015
Publisher: Institute of Electrical and Electronics Engineers (IEEE)
Date: 05-2018
Publisher: ACM
Date: 05-06-2012
Publisher: ACM
Date: 04-08-2017
Publisher: Institute of Electrical and Electronics Engineers (IEEE)
Date: 2023
Publisher: Springer Berlin Heidelberg
Date: 2005
DOI: 10.1007/11581772_87
Publisher: Institute of Electrical and Electronics Engineers (IEEE)
Date: 02-2023
Publisher: Institute of Electrical and Electronics Engineers (IEEE)
Date: 06-2023
Publisher: IEEE
Date: 12-2013
Publisher: Institute of Electrical and Electronics Engineers (IEEE)
Date: 2023
Publisher: Institute of Electrical and Electronics Engineers (IEEE)
Date: 2021
Publisher: Institute of Electrical and Electronics Engineers (IEEE)
Date: 03-2018
Publisher: Institute of Electrical and Electronics Engineers (IEEE)
Date: 2022
Publisher: Institute of Electrical and Electronics Engineers (IEEE)
Date: 2022
Publisher: Springer International Publishing
Date: 2018
Publisher: Institute of Electrical and Electronics Engineers (IEEE)
Date: 10-2014
Publisher: Institute of Electrical and Electronics Engineers (IEEE)
Date: 09-2014
Publisher: IEEE
Date: 07-2014
Publisher: Institute of Electrical and Electronics Engineers (IEEE)
Date: 2021
Publisher: IEEE
Date: 06-2014
DOI: 10.1109/CVPR.2014.20
Publisher: Association for Computing Machinery (ACM)
Date: 28-02-2021
DOI: 10.1145/3418214
Abstract: Quick response (QR) codes are usually scanned in different environments, so they must be robust to variations in illumination, scale, coverage, and camera angles. Aesthetic QR codes improve the visual quality, but subtle changes in their appearance may cause scanning failure. In this article, a new method to generate scanning-robust aesthetic QR codes is proposed, based on a module-based scanning probability estimation model that can effectively balance the tradeoff between visual quality and scanning robustness. Our method locally adjusts the luminance of each module by estimating the probability of successful sampling. The approach adopts a hierarchical, coarse-to-fine strategy to enhance the visual quality of aesthetic QR codes, sequentially generating the following three codes: a binary aesthetic QR code, a grayscale aesthetic QR code, and the final color aesthetic QR code. Our approach can also be used to create QR codes with different visual styles by adjusting some initialization parameters. User surveys and decoding experiments were used to evaluate our method against state-of-the-art algorithms, indicating that the proposed approach has excellent performance in terms of both visual quality and scanning robustness.
Publisher: ACM
Date: 29-10-2012
Publisher: IEEE
Date: 06-2011
Publisher: ACM
Date: 26-10-2023
Publisher: Institute of Electrical and Electronics Engineers (IEEE)
Date: 09-2020
Publisher: ACM
Date: 22-06-2015
Publisher: Institute of Electrical and Electronics Engineers (IEEE)
Date: 07-2014
DOI: 10.1109/MMUL.2014.43
Publisher: ACM
Date: 26-10-2023
Publisher: Institute of Electrical and Electronics Engineers (IEEE)
Date: 08-2020
Publisher: Institute of Electrical and Electronics Engineers (IEEE)
Date: 2022
Publisher: IEEE
Date: 07-2014
Publisher: Institute of Electrical and Electronics Engineers (IEEE)
Date: 2022
Publisher: Institute of Image Information and Television Engineers
Date: 2016
DOI: 10.3169/MTA.4.227
Publisher: Institute of Electrical and Electronics Engineers (IEEE)
Date: 10-2012
Publisher: Public Library of Science (PLoS)
Date: 25-02-2021
DOI: 10.1371/JOURNAL.PBIO.3001091
Abstract: The recent emergence of Severe Acute Respiratory Syndrome Coronavirus 2 (SARS-CoV-2), the underlying cause of Coronavirus Disease 2019 (COVID-19), has led to a worldwide pandemic causing substantial morbidity, mortality, and economic devastation. In response, many laboratories have redirected attention to SARS-CoV-2, meaning there is an urgent need for tools that can be used in laboratories unaccustomed to working with coronaviruses. Here we report a range of tools for SARS-CoV-2 research. First, we describe a facile single plasmid SARS-CoV-2 reverse genetics system that is simple to genetically manipulate and can be used to rescue infectious virus through transient transfection (without in vitro transcription or additional expression plasmids). The rescue system is accompanied by our panel of SARS-CoV-2 antibodies (against nearly every viral protein), SARS-CoV-2 clinical isolates, and SARS-CoV-2 permissive cell lines, which are all openly available to the scientific community. Using these tools, we demonstrate here that the controversial ORF10 protein is expressed in infected cells. Furthermore, we show that the promising repurposed antiviral activity of apilimod is dependent on TMPRSS2 expression. Altogether, our SARS-CoV-2 toolkit, which can be directly accessed via our website at mrcppu-covid.bio/, constitutes a resource with considerable potential to advance COVID-19 vaccine design, drug testing, and discovery science.
Publisher: Institute of Electrical and Electronics Engineers (IEEE)
Date: 11-2022
Publisher: Institute of Electrical and Electronics Engineers (IEEE)
Date: 2022
Publisher: ACM
Date: 13-10-2015
Publisher: Institute of Electrical and Electronics Engineers (IEEE)
Date: 2023
Publisher: Institute of Electrical and Electronics Engineers (IEEE)
Date: 03-2019
Publisher: Institute of Electrical and Electronics Engineers (IEEE)
Date: 2021
Publisher: Springer International Publishing
Date: 2018
Publisher: Institute of Electrical and Electronics Engineers (IEEE)
Date: 05-2020
Publisher: Institute of Electrical and Electronics Engineers (IEEE)
Date: 2022
Publisher: Institute of Electrical and Electronics Engineers (IEEE)
Date: 04-2018
Publisher: Institute of Electrical and Electronics Engineers (IEEE)
Date: 02-2012
Publisher: Institute of Electrical and Electronics Engineers (IEEE)
Date: 04-2012
Publisher: Institute of Electrical and Electronics Engineers (IEEE)
Date: 2022
Publisher: Institute of Electrical and Electronics Engineers (IEEE)
Date: 08-2013
Publisher: Springer New York
Date: 03-08-2012
Publisher: Institute of Electrical and Electronics Engineers (IEEE)
Date: 07-2014
Publisher: Institute of Electrical and Electronics Engineers (IEEE)
Date: 07-2017
Publisher: ACM
Date: 26-10-2023
Publisher: Institute of Electrical and Electronics Engineers (IEEE)
Date: 09-2014
DOI: 10.1109/TKDE.2013.65
Publisher: Institute of Electrical and Electronics Engineers (IEEE)
Date: 05-2018
Publisher: Institute of Electrical and Electronics Engineers (IEEE)
Date: 2014
Publisher: Springer Science and Business Media LLC
Date: 02-03-2016
Publisher: Institute of Electrical and Electronics Engineers (IEEE)
Date: 11-2013
Publisher: Institute of Electrical and Electronics Engineers (IEEE)
Date: 07-2020
Publisher: Springer Science and Business Media LLC
Date: 16-04-2014
Publisher: Institute of Electrical and Electronics Engineers (IEEE)
Date: 2021
Publisher: ACM
Date: 26-10-2023
Publisher: IEEE
Date: 10-2017
DOI: 10.1109/ICCV.2017.86
Publisher: ACM
Date: 13-10-2015
Publisher: Springer Science and Business Media LLC
Date: 13-11-2015
Publisher: Institute of Electrical and Electronics Engineers (IEEE)
Date: 2021
Publisher: ACM
Date: 03-11-2014
Publisher: Springer International Publishing
Date: 2018
Publisher: Institute of Electrical and Electronics Engineers (IEEE)
Date: 2023
Publisher: ACM
Date: 03-11-2014
Publisher: Springer Science and Business Media LLC
Date: 27-10-2017
Publisher: Springer International Publishing
Date: 2014
Publisher: Elsevier BV
Date: 03-2014
Publisher: Institute of Electrical and Electronics Engineers (IEEE)
Date: 2022
Publisher: Institute of Electrical and Electronics Engineers (IEEE)
Date: 2023
Publisher: Elsevier BV
Date: 2014
Publisher: Springer International Publishing
Date: 2014
Publisher: Institute of Electrical and Electronics Engineers (IEEE)
Date: 05-2015
Publisher: Institute of Electrical and Electronics Engineers (IEEE)
Date: 02-2018
Publisher: Institute of Electrical and Electronics Engineers (IEEE)
Date: 2023
Publisher: Elsevier BV
Date: 2016
Publisher: Springer Science and Business Media LLC
Date: 06-01-2021
Publisher: Springer International Publishing
Date: 2017
Publisher: Institute of Electrical and Electronics Engineers (IEEE)
Date: 03-2012
Publisher: Institute of Electrical and Electronics Engineers (IEEE)
Date: 10-2017
Publisher: Institute of Electrical and Electronics Engineers (IEEE)
Date: 2023
Publisher: Springer International Publishing
Date: 2016
Publisher: Institute of Electrical and Electronics Engineers (IEEE)
Date: 2021
Publisher: Springer International Publishing
Date: 2018
Publisher: Institute of Electrical and Electronics Engineers (IEEE)
Date: 06-2017
Publisher: Springer International Publishing
Date: 2018
Publisher: Institute of Electrical and Electronics Engineers (IEEE)
Date: 2023
Publisher: Institute of Electrical and Electronics Engineers (IEEE)
Date: 2023
Publisher: Institute of Electrical and Electronics Engineers (IEEE)
Date: 08-2023
Publisher: Institute of Electrical and Electronics Engineers (IEEE)
Date: 2022
Publisher: Institute of Electrical and Electronics Engineers (IEEE)
Date: 10-2021
Publisher: Association for Computing Machinery (ACM)
Date: 10-10-2018
DOI: 10.1145/3243316
Abstract: The superiority of deeply learned pedestrian representations has been reported in very recent literature of person re-identification (re-ID). In this article, we consider the more pragmatic issue of learning a deep feature with no or only a few labels. We propose a progressive unsupervised learning (PUL) method to transfer pretrained deep representations to unseen domains. Our method is easy to implement and can be viewed as an effective baseline for unsupervised re-ID feature learning. Specifically, PUL iterates between (1) pedestrian clustering and (2) fine-tuning of the convolutional neural network (CNN) to improve the initialization model trained on the irrelevant labeled dataset. Since the clustering results can be very noisy, we add a selection operation between the clustering and fine-tuning. At the beginning, when the model is weak, the CNN is fine-tuned on a small number of reliable examples located near cluster centroids in the feature space. As the model becomes stronger, in subsequent iterations, more images are adaptively selected as CNN training samples. Progressively, pedestrian clustering and the CNN model are improved simultaneously until algorithm convergence. This process is naturally formulated as self-paced learning. We then point out promising directions that may lead to further improvement. Extensive experiments on three large-scale re-ID datasets demonstrate that PUL outputs discriminative features that improve the re-ID accuracy. Our code has been released at ehefan/Unsupervised-Person-Re-identification-Clustering-and-Fine-tuning.
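The clustering-plus-selection step that PUL iterates can be sketched as follows; the minimal k-means routine and the fixed distance threshold are illustrative stand-ins for the paper's selection criterion (in the full method, the CNN is fine-tuned on the selected samples and the loop repeats with a progressively larger selection):

```python
import numpy as np

def kmeans(X, k, n_iter=20, seed=0):
    # minimal k-means for the pedestrian clustering step
    rng = np.random.default_rng(seed)
    centers = X[rng.choice(len(X), k, replace=False)]
    for _ in range(n_iter):
        d = np.linalg.norm(X[:, None] - centers[None], axis=2)
        labels = d.argmin(axis=1)
        for j in range(k):
            if (labels == j).any():
                centers[j] = X[labels == j].mean(axis=0)
    return labels, centers

def select_reliable(X, labels, centers, radius):
    """PUL-style selection: keep only samples close to their assigned
    cluster centroid, since cluster labels near the centroid are more
    trustworthy. `radius` is an illustrative fixed threshold."""
    d = np.linalg.norm(X - centers[labels], axis=1)
    return d < radius
```

Growing `radius` (or an equivalent per-iteration criterion) over iterations is what gives the method its self-paced character: easy, reliable samples first, harder ones as the model improves.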
Publisher: IEEE
Date: 06-2012
Publisher: Institute of Electrical and Electronics Engineers (IEEE)
Date: 08-2017
Publisher: Institute of Electrical and Electronics Engineers (IEEE)
Date: 04-2013
Publisher: Institute of Electrical and Electronics Engineers (IEEE)
Date: 2023
Publisher: Institute of Electrical and Electronics Engineers (IEEE)
Date: 06-2021
Publisher: Association for Computing Machinery (ACM)
Date: 22-05-2020
DOI: 10.1145/3383184
Abstract: Matching images and sentences demands a fine understanding of both modalities. In this article, we propose a new system to discriminatively embed the image and text into a shared visual-textual space. In this field, most existing works apply the ranking loss to pull the positive image/text pairs close and push the negative pairs apart from each other. However, directly deploying the ranking loss on heterogeneous features (i.e., text and image features) is less effective, because it is hard to find appropriate triplets at the beginning. So naively using the ranking loss may prevent the network from learning the inter-modal relationship. To address this problem, we propose the instance loss, which explicitly considers the intra-modal data distribution. It is based on an unsupervised assumption that each image/text group can be viewed as a class, so the network can learn fine granularity from every image/text group. The experiment shows that the instance loss offers better weight initialization for the ranking loss, so that more discriminative embeddings can be learned. Besides, existing works usually apply off-the-shelf features, i.e., word2vec and fixed visual features. As a minor contribution, this article constructs an end-to-end dual-path convolutional network to learn the image and text representations. End-to-end learning allows the system to directly learn from the data and fully utilize the supervision. On two generic retrieval datasets (Flickr30k and MSCOCO), experiments demonstrate that our method yields competitive accuracy compared to state-of-the-art methods. Moreover, in language-based person retrieval, we improve the state of the art by a large margin. The code has been made publicly available.
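The two losses discussed above can be sketched with plain arrays: the instance loss treats each image/text group as its own class under softmax cross-entropy, while the ranking loss pushes matched pairs above in-batch negatives by a margin. The shapes, the shared classifier `W`, and the margin value are illustrative assumptions, not the paper's exact formulation:

```python
import numpy as np

def instance_loss(embeddings, W):
    """Instance-loss sketch: sample i should be classified as class i,
    i.e., each image/text group is viewed as its own class. W is a
    (dim x n_classes) classifier shared across modalities."""
    logits = embeddings @ W
    logits = logits - logits.max(axis=1, keepdims=True)   # numerical stability
    log_probs = logits - np.log(np.exp(logits).sum(axis=1, keepdims=True))
    n = len(embeddings)
    return -log_probs[np.arange(n), np.arange(n)].mean()

def ranking_loss(img, txt, margin=0.2):
    # bidirectional triplet-style ranking loss over in-batch negatives;
    # sim[i, j] is the similarity of image i and sentence j
    sim = img @ txt.T
    pos = np.diag(sim)
    cost_i2t = np.maximum(0, margin + sim - pos[:, None])  # image -> text
    cost_t2i = np.maximum(0, margin + sim - pos[None, :])  # text -> image
    np.fill_diagonal(cost_i2t, 0)
    np.fill_diagonal(cost_t2i, 0)
    return cost_i2t.mean() + cost_t2i.mean()
```

In the paper's training regime the instance loss would be applied first (or jointly) to give the ranking loss a usable initialization in the shared space.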
Publisher: ACM
Date: 04-2014
Publisher: Institute of Electrical and Electronics Engineers (IEEE)
Date: 12-2022
Publisher: Institute of Electrical and Electronics Engineers (IEEE)
Date: 09-2023
Publisher: Institute of Electrical and Electronics Engineers (IEEE)
Date: 04-2008
Publisher: ACM
Date: 29-10-2012
Publisher: ACM
Date: 03-07-2014
Publisher: Springer International Publishing
Date: 2018
Publisher: Institute of Electrical and Electronics Engineers (IEEE)
Date: 2023
Publisher: ACM
Date: 22-06-2013
Publisher: IEEE
Date: 06-2012
Publisher: Institute of Electrical and Electronics Engineers (IEEE)
Date: 2019
Publisher: Institute of Electrical and Electronics Engineers (IEEE)
Date: 12-2012
Publisher: Institute of Electrical and Electronics Engineers (IEEE)
Date: 10-2022
Publisher: Institute of Electrical and Electronics Engineers (IEEE)
Date: 10-2016
Publisher: No publisher found
Date: 2014
Publisher: ACM
Date: 22-10-2013
Publisher: Association for Computing Machinery (ACM)
Date: 07-02-2019
DOI: 10.1145/3300939
Abstract: Performing direct matching among different modalities (like image and text) can benefit many tasks in computer vision, multimedia, information retrieval, and information fusion. Most existing works focus on class-level image-text matching, called cross-modal retrieval, which attempts to propose a uniform model for matching images with all types of texts, for example, tags, sentences, and articles (long texts). Although cross-modal retrieval alleviates the heterogeneous gap between visual and textual information, it can provide only a rough correspondence between the two modalities. In this article, we propose a more precise image-text embedding method, image-sentence matching, which can provide heterogeneous matching at the instance level. The key issue for image-text embedding is how to make the distributions of the two modalities consistent in the embedding space. To address this problem, some previous works on the cross-modal retrieval task have attempted to pull their distributions close by employing adversarial learning. However, the effectiveness of adversarial learning on image-sentence matching has not been proved, and there is still no effective method. Inspired by previous works, we propose to learn a modality-invariant image-text embedding for image-sentence matching by involving adversarial learning. On top of the triplet-loss-based baseline, we design a modality classification network with an adversarial loss, which classifies an embedding into either the image or text modality. In addition, the multi-stage training procedure is carefully designed so that the proposed network not only imposes the image-text similarity constraints by ground-truth labels, but also enforces the image and text embedding distributions to be similar by adversarial learning. Experiments on two public datasets (Flickr30k and MSCOCO) demonstrate that our method yields stable accuracy improvement over the baseline model and that our results compare favorably to the state-of-the-art methods.
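The adversarial component described in this abstract can be sketched as follows: a modality classifier is trained to separate image embeddings from text embeddings, while the embedding network receives the negated loss so the two distributions become indistinguishable. This is a minimal NumPy sketch with a hypothetical linear classifier, not the paper's network:

```python
import numpy as np

def modality_adversarial_loss(img_emb, txt_emb, w, b):
    """Adversarial modality loss sketch.

    A binary classifier (here just a linear layer w, b) tries to tell
    image embeddings (label 1) from text embeddings (label 0); the
    embedding network is trained on the negated loss, i.e. it tries to
    make the modalities indistinguishable.
    """
    emb = np.vstack([img_emb, txt_emb])                         # (2N, D)
    labels = np.concatenate([np.ones(len(img_emb)), np.zeros(len(txt_emb))])
    p = 1.0 / (1.0 + np.exp(-(emb @ w + b)))                    # sigmoid modality score
    eps = 1e-12                                                 # avoid log(0)
    bce = -np.mean(labels * np.log(p + eps) + (1 - labels) * np.log(1 - p + eps))
    # the classifier minimises bce; the embedder minimises -bce
    return bce, -bce

rng = np.random.default_rng(1)
img = rng.normal(loc=0.5, size=(16, 8))   # toy "image" embeddings
txt = rng.normal(loc=0.0, size=(16, 8))   # toy "text" embeddings
w, b = rng.normal(size=8) * 0.1, 0.0
bce, adv = modality_adversarial_loss(img, txt, w, b)
```

In practice this adversarial term is combined with the triplet loss, which is what the multi-stage training schedule in the abstract balances.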
Publisher: IEEE
Date: 07-2017
Publisher: ACM
Date: 22-10-2013
Publisher: Association for Computing Machinery (ACM)
Date: 05-2013
Abstract: Recent years have witnessed a great explosion of user-generated videos on the Web. In order to achieve effective and efficient video search, it is critical for modern video search engines to associate videos with semantic keywords automatically. Most of the existing video tagging methods can hardly achieve reliable performance due to the deficiency of training data. It is noticed that abundant well-tagged data are available in other relevant types of media (e.g., images). In this article, we propose a novel video tagging framework, termed Cross-Media Tag Transfer (CMTT), which utilizes the abundance of well-tagged images to facilitate video tagging. Specifically, we build a “cross-media tunnel” to transfer knowledge from images to videos. To this end, an optimal kernel space, in which the distribution distance between images and videos is minimized, is found to tackle the domain-shift problem. A novel cross-media video tagging model is proposed to infer tags by exploring the intrinsic local structures of both labeled and unlabeled data, and to learn reliable video classifiers. An efficient algorithm is designed to optimize the proposed model in an iterative and alternating way. Extensive experiments illustrate the superiority of our proposal compared to the state-of-the-art algorithms.
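Minimizing a distribution distance between image and video features in a kernel space, as the cross-media tunnel above does, can be illustrated with the standard Maximum Mean Discrepancy (MMD). This is a generic NumPy sketch of that distance, not the paper's exact objective:

```python
import numpy as np

def rbf_mmd2(X, Y, gamma=1.0):
    """Squared Maximum Mean Discrepancy with an RBF kernel.

    A standard way to measure the distribution distance between image
    features X and video features Y in a kernel space; a cross-media
    method would choose the kernel/space so this distance is small,
    easing the domain shift between the two media types.
    """
    def k(A, B):
        # pairwise squared Euclidean distances, then RBF kernel
        sq = (A**2).sum(1)[:, None] + (B**2).sum(1)[None, :] - 2 * A @ B.T
        return np.exp(-gamma * sq)
    return k(X, X).mean() + k(Y, Y).mean() - 2 * k(X, Y).mean()

rng = np.random.default_rng(0)
X = rng.normal(size=(32, 5))              # toy image features
Y = rng.normal(loc=2.0, size=(32, 5))     # toy video features, shifted domain
same = rbf_mmd2(X, X)                     # identical samples: distance is 0
diff = rbf_mmd2(X, Y)                     # shifted domain: positive distance
```

Identical feature sets give a distance of zero, while the domain-shifted video features give a clearly positive distance, which is the quantity a cross-media tunnel would drive down.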
Publisher: Institute of Electrical and Electronics Engineers (IEEE)
Date: 07-2016
DOI: 10.1109/MMUL.2016.42
Publisher: Institute of Electrical and Electronics Engineers (IEEE)
Date: 03-2020
Start Date: 05-2013
End Date: 12-2016
Amount: $375,000.00
Funder: Australian Research Council
Start Date: 05-2020
End Date: 03-2025
Amount: $486,000.00
Funder: Australian Research Council
Start Date: 07-2016
End Date: 07-2020
Amount: $520,000.00
Funder: Australian Research Council
Start Date: 2018
End Date: 12-2021
Amount: $392,884.00
Funder: Australian Research Council
Start Date: 2015
End Date: 12-2018
Amount: $494,300.00
Funder: Australian Research Council