Showing 1–50 of 65 results for author: Kot, A C

  1. arXiv:2406.17349  [pdf, other]

    cs.CR cs.CV

    Semantic Deep Hiding for Robust Unlearnable Examples

    Authors: Ruohan Meng, Chenyu Yi, Yi Yu, Siyuan Yang, Bingquan Shen, Alex C. Kot

    Abstract: Ensuring data privacy and protection has become paramount in the era of deep learning. Unlearnable examples are proposed to mislead the deep learning models and prevent data from unauthorized exploration by adding small perturbations to data. However, such perturbations (e.g., noise, texture, color change) predominantly impact low-level features, making them vulnerable to common countermeasures. I…

    Submitted 25 June, 2024; originally announced June 2024.

    Comments: Accepted by TIFS 2024

  2. arXiv:2406.09121  [pdf, other]

    cs.CV

    MMRel: A Relation Understanding Dataset and Benchmark in the MLLM Era

    Authors: Jiahao Nie, Gongjie Zhang, Wenbin An, Yap-Peng Tan, Alex C. Kot, Shijian Lu

    Abstract: Despite the recent advancements in Multi-modal Large Language Models (MLLMs), understanding inter-object relations, i.e., interactions or associations between distinct objects, remains a major challenge for such models. This issue significantly hinders their advanced reasoning capabilities and is primarily due to the lack of large-scale, high-quality, and diverse multi-modal data essential for tra…

    Submitted 13 June, 2024; originally announced June 2024.

  3. arXiv:2405.20721  [pdf, other]

    cs.CV cs.AI

    ContextGS: Compact 3D Gaussian Splatting with Anchor Level Context Model

    Authors: Yufei Wang, Zhihao Li, Lanqing Guo, Wenhan Yang, Alex C. Kot, Bihan Wen

    Abstract: Recently, 3D Gaussian Splatting (3DGS) has become a promising framework for novel view synthesis, offering fast rendering speeds and high fidelity. However, the large number of Gaussians and their associated attributes require effective compression techniques. Existing methods primarily compress neural Gaussians individually and independently, i.e., coding all the neural Gaussians at the same time…

    Submitted 31 May, 2024; originally announced May 2024.

  4. arXiv:2405.11852  [pdf, other]

    cs.CV

    Evolving Storytelling: Benchmarks and Methods for New Character Customization with Diffusion Models

    Authors: Xiyu Wang, Yufei Wang, Satoshi Tsutsui, Weisi Lin, Bihan Wen, Alex C. Kot

    Abstract: Diffusion-based models for story visualization have shown promise in generating content-coherent images for storytelling tasks. However, how to effectively integrate new characters into existing narratives while maintaining character consistency remains an open problem, particularly with limited data. Two major limitations hinder the progress: (1) the absence of a suitable benchmark due to potenti…

    Submitted 20 May, 2024; originally announced May 2024.

  5. arXiv:2405.09487  [pdf, other]

    cs.CV

    Color Space Learning for Cross-Color Person Re-Identification

    Authors: Jiahao Nie, Shan Lin, Alex C. Kot

    Abstract: The primary color profile of the same identity is assumed to remain consistent in typical Person Re-identification (Person ReID) tasks. However, this assumption may be invalid in real-world situations and images hold variant color profiles, because of cross-modality cameras or identity with different clothing. To address this issue, we propose Color Space Learning (CSL) for those Cross-Color Perso…

    Submitted 15 May, 2024; originally announced May 2024.

    Comments: Accepted by ICME 2024 (Oral)

  6. arXiv:2405.06995  [pdf, other]

    cs.SD cs.CV cs.MM eess.AS

    Benchmarking Cross-Domain Audio-Visual Deception Detection

    Authors: Xiaobao Guo, Zitong Yu, Nithish Muthuchamy Selvaraj, Bingquan Shen, Adams Wai-Kin Kong, Alex C. Kot

    Abstract: Automated deception detection is crucial for assisting humans in accurately assessing truthfulness and identifying deceptive behavior. Conventional contact-based techniques, like polygraph devices, rely on physiological signals to determine the authenticity of an individual's statements. Nevertheless, recent developments in automated deception detection have demonstrated that multimodal features d…

    Submitted 11 May, 2024; originally announced May 2024.

    Comments: 10 pages

  7. arXiv:2405.01460  [pdf, other]

    cs.CR cs.AI cs.CV cs.LG

    Purify Unlearnable Examples via Rate-Constrained Variational Autoencoders

    Authors: Yi Yu, Yufei Wang, Song Xia, Wenhan Yang, Shijian Lu, Yap-Peng Tan, Alex C. Kot

    Abstract: Unlearnable examples (UEs) seek to maximize testing error by making subtle modifications to training examples that are correctly labeled. Defenses against these poisoning attacks can be categorized based on whether specific interventions are adopted during training. The first approach is training-time defense, such as adversarial training, which can mitigate poisoning effects but is computationall…

    Submitted 6 May, 2024; v1 submitted 2 May, 2024; originally announced May 2024.

    Comments: Accepted by ICML 2024

  8. arXiv:2404.13576  [pdf, other]

    cs.CV cs.LG

    I2CANSAY:Inter-Class Analogical Augmentation and Intra-Class Significance Analysis for Non-Exemplar Online Task-Free Continual Learning

    Authors: Songlin Dong, Yingjie Chen, Yuhang He, Yuhan Jin, Alex C. Kot, Yihong Gong

    Abstract: Online task-free continual learning (OTFCL) is a more challenging variant of continual learning which emphasizes the gradual shift of task boundaries and learns in an online mode. Existing methods rely on a memory buffer composed of old samples to prevent forgetting. However, the use of memory buffers not only raises privacy concerns but also hinders the efficient learning of new samples. To addres…

    Submitted 21 April, 2024; originally announced April 2024.

  9. arXiv:2404.08452  [pdf, other]

    cs.CV

    MoE-FFD: Mixture of Experts for Generalized and Parameter-Efficient Face Forgery Detection

    Authors: Chenqi Kong, Anwei Luo, Peijun Bao, Yi Yu, Haoliang Li, Zengwei Zheng, Shiqi Wang, Alex C. Kot

    Abstract: Deepfakes have recently raised significant trust issues and security concerns among the public. Compared to CNN face forgery detectors, ViT-based methods take advantage of the expressivity of transformers, achieving superior detection performance. However, these approaches still exhibit the following limitations: (1) Fully fine-tuning ViT-based models from ImageNet weights demands substantial comp…

    Submitted 7 June, 2024; v1 submitted 12 April, 2024; originally announced April 2024.

  10. arXiv:2401.08407  [pdf, other]

    cs.CV

    Cross-Domain Few-Shot Segmentation via Iterative Support-Query Correspondence Mining

    Authors: Jiahao Nie, Yun Xing, Gongjie Zhang, Pei Yan, Aoran Xiao, Yap-Peng Tan, Alex C. Kot, Shijian Lu

    Abstract: Cross-Domain Few-Shot Segmentation (CD-FSS) poses the challenge of segmenting novel categories from a distinct domain using only limited exemplars. In this paper, we undertake a comprehensive study of CD-FSS and uncover two crucial insights: (i) the necessity of a fine-tuning stage to effectively transfer the learned meta-knowledge across domains, and (ii) the overfitting risk during the naïve fin…

    Submitted 13 March, 2024; v1 submitted 16 January, 2024; originally announced January 2024.

    Comments: Accepted by CVPR 2024

  11. arXiv:2312.15490   

    cs.IR cs.AI

    Diffusion-EXR: Controllable Review Generation for Explainable Recommendation via Diffusion Models

    Authors: Ling Li, Shaohua Li, Winda Marantika, Alex C. Kot, Huijing Zhan

    Abstract: Denoising Diffusion Probabilistic Model (DDPM) has shown great competence in image and audio generation tasks. However, there exist few attempts to employ DDPM in text generation, especially review generation under recommendation systems. Fueled by the predicted reviews explainability that justifies recommendations could assist users better understand the recommended items and increase the tra…

    Submitted 18 June, 2024; v1 submitted 24 December, 2023; originally announced December 2023.

    Comments: I request to withdraw my paper due to the discovery of significant errors in terms of experimental results in the manuscript that affect the validity of the paper. These errors are necessary to correct, and the current version should not be used or cited in its present form

  12. arXiv:2311.14760  [pdf, other]

    cs.CV

    SinSR: Diffusion-Based Image Super-Resolution in a Single Step

    Authors: Yufei Wang, Wenhan Yang, Xinyuan Chen, Yaohui Wang, Lanqing Guo, Lap-Pui Chau, Ziwei Liu, Yu Qiao, Alex C. Kot, Bihan Wen

    Abstract: While super-resolution (SR) methods based on diffusion models exhibit promising results, their practical application is hindered by the substantial number of required inference steps. Recent methods utilize degraded images in the initial state, thereby shortening the Markov chain. Nevertheless, these solutions either rely on a precise formulation of the degradation process or still necessitate a r…

    Submitted 23 November, 2023; originally announced November 2023.

  13. arXiv:2310.00234  [pdf, other]

    cs.CR cs.CV eess.IV

    Pixel-Inconsistency Modeling for Image Manipulation Localization

    Authors: Chenqi Kong, Anwei Luo, Shiqi Wang, Haoliang Li, Anderson Rocha, Alex C. Kot

    Abstract: Digital image forensics plays a crucial role in image authentication and manipulation localization. Despite the progress powered by deep neural networks, existing forgery localization methodologies exhibit limitations when deployed to unseen datasets and perturbed images (i.e., lack of generalization and robustness to real-world applications). To circumvent these problems and aid image integrity,…

    Submitted 29 September, 2023; originally announced October 2023.

  14. arXiv:2309.11092  [pdf, other]

    cs.CV cs.MM

    Forgery-aware Adaptive Vision Transformer for Face Forgery Detection

    Authors: Anwei Luo, Rizhao Cai, Chenqi Kong, Xiangui Kang, Jiwu Huang, Alex C. Kot

    Abstract: With the advancement in face manipulation technologies, the importance of face forgery detection in protecting authentication integrity becomes increasingly evident. Previous Vision Transformer (ViT)-based detectors have demonstrated subpar performance in cross-database evaluations, primarily because fully fine-tuning with limited Deepfake data often leads to forgetting pre-trained knowledge and o…

    Submitted 20 September, 2023; originally announced September 2023.

  15. arXiv:2307.07710  [pdf, other]

    cs.CV eess.IV

    ExposureDiffusion: Learning to Expose for Low-light Image Enhancement

    Authors: Yufei Wang, Yi Yu, Wenhan Yang, Lanqing Guo, Lap-Pui Chau, Alex C. Kot, Bihan Wen

    Abstract: Previous raw image-based low-light image enhancement methods predominantly relied on feed-forward neural networks to learn deterministic mappings from low-light to normally-exposed images. However, they failed to capture critical distribution information, leading to visually undesirable results. This work addresses the issue by seamlessly integrating a diffusion model with a physics-based exposure…

    Submitted 15 August, 2023; v1 submitted 15 July, 2023; originally announced July 2023.

    Comments: accepted by ICCV2023

  16. arXiv:2307.07286  [pdf, other]

    cs.CV cs.AI

    One-Shot Action Recognition via Multi-Scale Spatial-Temporal Skeleton Matching

    Authors: Siyuan Yang, Jun Liu, Shijian Lu, Er Meng Hwa, Alex C. Kot

    Abstract: One-shot skeleton action recognition, which aims to learn a skeleton action recognition model with a single training sample, has attracted increasing interest due to the challenge of collecting and annotating large-scale skeleton action data. However, most existing studies match skeleton sequences by comparing their feature vectors directly which neglects spatial structures and temporal orders of…

    Submitted 6 February, 2024; v1 submitted 14 July, 2023; originally announced July 2023.

    Comments: 8 pages, 4 figures, 6 tables. Accepted by IEEE Transactions on Pattern Analysis and Machine Intelligence

  17. arXiv:2307.04122  [pdf, other]

    cs.CV eess.IV

    Enhancing Low-Light Images Using Infrared-Encoded Images

    Authors: Shulin Tian, Yufei Wang, Renjie Wan, Wenhan Yang, Alex C. Kot, Bihan Wen

    Abstract: Low-light image enhancement task is essential yet challenging as it is ill-posed intrinsically. Previous arts mainly focus on the low-light images captured in the visible spectrum using pixel-wise loss, which limits the capacity of recovering the brightness, contrast, and texture details due to the small number of incoming photons. In this work, we propose a novel approach to increase the visibility…

    Submitted 9 July, 2023; originally announced July 2023.

    Comments: The first two authors contribute equally. The work is accepted by ICIP 2023

  18. arXiv:2306.12058  [pdf, other]

    cs.CV eess.IV

    Beyond Learned Metadata-based Raw Image Reconstruction

    Authors: Yufei Wang, Yi Yu, Wenhan Yang, Lanqing Guo, Lap-Pui Chau, Alex C. Kot, Bihan Wen

    Abstract: While raw images have distinct advantages over sRGB images, e.g., linearity and fine-grained quantization levels, they are not widely adopted by general users due to their substantial storage requirements. Very recent studies propose to compress raw images by designing sampling masks within the pixel space of the raw image. However, these approaches often leave space for pursuing more effective im…

    Submitted 21 June, 2023; originally announced June 2023.

  19. arXiv:2304.12489  [pdf, other]

    cs.CV cs.CR

    Beyond the Prior Forgery Knowledge: Mining Critical Clues for General Face Forgery Detection

    Authors: Anwei Luo, Chenqi Kong, Jiwu Huang, Yongjian Hu, Xiangui Kang, Alex C. Kot

    Abstract: Face forgery detection is essential in combating malicious digital face attacks. Previous methods mainly rely on prior expert knowledge to capture specific forgery clues, such as noise patterns, blending boundaries, and frequency artifacts. However, these methods tend to get trapped in local optima, resulting in limited robustness and generalization capability. To address these issues, we propose…

    Submitted 24 April, 2023; originally announced April 2023.

  20. arXiv:2304.08799  [pdf, other]

    cs.CV cs.AI

    Self-Supervised 3D Action Representation Learning with Skeleton Cloud Colorization

    Authors: Siyuan Yang, Jun Liu, Shijian Lu, Er Meng Hwa, Yongjian Hu, Alex C. Kot

    Abstract: 3D Skeleton-based human action recognition has attracted increasing attention in recent years. Most of the existing work focuses on supervised learning which requires a large number of labeled action sequences that are often expensive and time-consuming to annotate. In this paper, we address self-supervised 3D action representation learning for skeleton-based action recognition. We investigate sel…

    Submitted 16 October, 2023; v1 submitted 18 April, 2023; originally announced April 2023.

    Comments: Accepted by TPAMI. This work is an extension of our ICCV 2021 paper [arXiv:2108.01959] https://openaccess.thecvf.com/content/ICCV2021/html/Yang_Skeleton_Cloud_Colorization_for_Unsupervised_3D_Action_Representation_Learning_ICCV_2021_paper.html

  21. arXiv:2303.10452  [pdf, other]

    cs.CV

    Confidence Attention and Generalization Enhanced Distillation for Continuous Video Domain Adaptation

    Authors: Xiyu Wang, Yuecong Xu, Jianfei Yang, Bihan Wen, Alex C. Kot

    Abstract: Continuous Video Domain Adaptation (CVDA) is a scenario where a source model is required to adapt to a series of individually available changing target domains continuously without source data or target supervision. It has wide applications, such as robotic vision and autonomous driving. The main underlying challenge of CVDA is to learn helpful information only from the unsupervised target data wh…

    Submitted 29 August, 2023; v1 submitted 18 March, 2023; originally announced March 2023.

    Comments: 16 pages, 9 tables, 10 figures

  22. arXiv:2303.02057  [pdf, other]

    eess.IV cs.CV

    Unsupervised Deep Digital Staining For Microscopic Cell Images Via Knowledge Distillation

    Authors: Ziwang Xu, Lanqing Guo, Shuyan Zhang, Alex C. Kot, Bihan Wen

    Abstract: Staining is critical to cell imaging and medical diagnosis, which is expensive, time-consuming, labor-intensive, and causes irreversible changes to cell tissues. Recent advances in deep learning enabled digital staining via supervised model training. However, it is difficult to obtain large-scale stained/unstained cell image pairs in practice, which need to be perfectly aligned with the supervisio…

    Submitted 3 March, 2023; originally announced March 2023.

    Comments: Accepted by ICASSP 2023

  23. arXiv:2302.14677  [pdf, other]

    cs.CV cs.CR eess.IV

    Backdoor Attacks Against Deep Image Compression via Adaptive Frequency Trigger

    Authors: Yi Yu, Yufei Wang, Wenhan Yang, Shijian Lu, Yap-peng Tan, Alex C. Kot

    Abstract: Recent deep-learning-based compression methods have achieved superior performance compared with traditional approaches. However, deep learning models have proven to be vulnerable to backdoor attacks, where some specific trigger patterns added to the input can lead to malicious behavior of the models. In this paper, we present a novel backdoor attack with multiple triggers against learned image com…

    Submitted 28 February, 2023; originally announced February 2023.

    Comments: Accepted by CVPR 2023

    ACM Class: I.4

  24. arXiv:2302.14309  [pdf, other]

    cs.CV

    Temporal Coherent Test-Time Optimization for Robust Video Classification

    Authors: Chenyu Yi, Siyuan Yang, Yufei Wang, Haoliang Li, Yap-Peng Tan, Alex C. Kot

    Abstract: Deep neural networks are likely to fail when the test data is corrupted in real-world deployment (e.g., blur, weather, etc.). Test-time optimization is an effective way that adapts models to generalize to corrupted data during testing, which has been shown in the image domain. However, the techniques for improving video classification corruption robustness remain few. In this work, we propose a Te…

    Submitted 27 February, 2023; originally announced February 2023.

  25. Evaluating the Efficacy of Skincare Product: A Realistic Short-Term Facial Pore Simulation

    Authors: Ling Li, Bandara Dissanayake, Tatsuya Omotezako, Yunjie Zhong, Qing Zhang, Rizhao Cai, Qian Zheng, Dennis Sng, Weisi Lin, Yufei Wang, Alex C Kot

    Abstract: Simulating the effects of skincare products on face is a potential new way to communicate the efficacy of skincare products in skin diagnostics and product recommendations. Furthermore, such simulations enable one to anticipate his/her skin conditions and better manage skin health. However, there is a lack of effective simulations today. In this paper, we propose the first simulation model to reve…

    Submitted 23 February, 2023; originally announced February 2023.

    Comments: 6 pages, 7 figures

  26. arXiv:2302.05936  [pdf, other]

    cs.CV

    Generalized Few-Shot Continual Learning with Contrastive Mixture of Adapters

    Authors: Yawen Cui, Zitong Yu, Rizhao Cai, Xun Wang, Alex C. Kot, Li Liu

    Abstract: The goal of Few-Shot Continual Learning (FSCL) is to incrementally learn novel tasks with limited labeled samples and preserve previous capabilities simultaneously, while current FSCL methods are all for the class-incremental purpose. Moreover, the evaluation of FSCL solutions is only the cumulative performance of all encountered tasks, but there is no work on exploring the domain generalization a…

    Submitted 12 February, 2023; originally announced February 2023.

    Comments: Submitted to International Journal of Computer Vision (IJCV)

  27. arXiv:2302.05746  [pdf, other]

    cs.CV eess.IV

    Removing Image Artifacts From Scratched Lens Protectors

    Authors: Yufei Wang, Renjie Wan, Wenhan Yang, Bihan Wen, Lap-Pui Chau, Alex C. Kot

    Abstract: A protector is placed in front of the camera lens for mobile devices to avoid damage, while the protector itself can be easily scratched accidentally, especially for plastic ones. The artifacts appear in a wide variety of patterns, making it difficult to see through them clearly. Removing image artifacts from the scratched lens protector is inherently challenging due to the occasional flare artifa…

    Submitted 14 February, 2023; v1 submitted 11 February, 2023; originally announced February 2023.

    Comments: Accepted by ISCAS 2023

  28. arXiv:2209.01935  [pdf, other]

    cs.CV cs.MM

    Forensicability Assessment of Questioned Images in Recapturing Detection

    Authors: Changsheng Chen, Lin Zhao, Rizhao Cai, Zitong Yu, Jiwu Huang, Alex C. Kot

    Abstract: Recapture detection of face and document images is an important forensic task. With deep learning, the performances of face anti-spoofing (FAS) and recaptured document detection have been improved significantly. However, the performances are not yet satisfactory on samples with weak forensic cues. The amount of forensic cues can be quantified to allow a reliable forensic result. In this work, we p…

    Submitted 5 September, 2022; originally announced September 2022.

    Comments: 12 pages, 10 figures, 2 tables (Submitted to TIFS July-2022)

  29. arXiv:2208.05401  [pdf, other]

    cs.CV

    Benchmarking Joint Face Spoofing and Forgery Detection with Visual and Physiological Cues

    Authors: Zitong Yu, Rizhao Cai, Zhi Li, Wenhan Yang, Jingang Shi, Alex C. Kot

    Abstract: Face anti-spoofing (FAS) and face forgery detection play vital roles in securing face biometric systems from presentation attacks (PAs) and vicious digital manipulation (e.g., deepfakes). Despite promising performance upon large-scale data and powerful deep models, the generalization problem of existing approaches is still an open issue. Most recent approaches focus on 1) unimodal visual appear…

    Submitted 8 January, 2024; v1 submitted 10 August, 2022; originally announced August 2022.

    Comments: Accepted by IEEE Transactions on Dependable and Secure Computing (TDSC). Corresponding authors: Zitong Yu and Wenhan Yang

  30. arXiv:2207.01204  [pdf, other]

    cs.CV

    Adversarial Pairwise Reverse Attention for Camera Performance Imbalance in Person Re-identification: New Dataset and Metrics

    Authors: Eugene P. W. Ang, Shan Lin, Rahul Ahuja, Nemath Ahmed, Alex C. Kot

    Abstract: Existing evaluation metrics for Person Re-Identification (Person ReID) models focus on system-wide performance. However, our studies reveal weaknesses due to the uneven data distributions among cameras and different camera properties that expose the ReID system to exploitation. In this work, we raise the long-ignored ReID problem of camera performance imbalance and collect a real-world privacy-awa…

    Submitted 4 July, 2022; originally announced July 2022.

    Comments: Accepted into the IEEE International Conference on Image Processing (ICIP) 2022

  31. arXiv:2205.03792  [pdf, other]

    cs.CV

    One-Class Knowledge Distillation for Face Presentation Attack Detection

    Authors: Zhi Li, Rizhao Cai, Haoliang Li, Kwok-Yan Lam, Yongjian Hu, Alex C. Kot

    Abstract: Face presentation attack detection (PAD) has been extensively studied by research communities to enhance the security of face recognition systems. Although existing methods have achieved good performance on testing data with similar distribution as the training data, their performance degrades severely in application scenarios with data of unseen distributions. In situations where the training and…

    Submitted 8 May, 2022; originally announced May 2022.

  32. arXiv:2203.16931  [pdf, other]

    cs.CV cs.CR

    Towards Robust Rain Removal Against Adversarial Attacks: A Comprehensive Benchmark Analysis and Beyond

    Authors: Yi Yu, Wenhan Yang, Yap-Peng Tan, Alex C. Kot

    Abstract: Rain removal aims to remove rain streaks from images/videos and reduce the disruptive effects caused by rain. It not only enhances image/video visibility but also allows many computer vision algorithms to function properly. This paper makes the first attempt to conduct a comprehensive study on the robustness of deep learning-based rain removal methods against adversarial attacks. Our study shows t…

    Submitted 31 March, 2022; originally announced March 2022.

    Comments: 10 pages, 6 figures, to appear in CVPR 2022

  33. arXiv:2203.16056  [pdf, other]

    cs.CV

    Automatic Facial Skin Feature Detection for Everyone

    Authors: Qian Zheng, Ankur Purwar, Heng Zhao, Guang Liang Lim, Ling Li, Debasish Behera, Qian Wang, Min Tan, Rizhao Cai, Jennifer Werner, Dennis Sng, Maurice van Steensel, Weisi Lin, Alex C Kot

    Abstract: Automatic assessment and understanding of facial skin condition have several applications, including the early detection of underlying health problems, lifestyle and dietary treatment, skin-care product recommendation, etc. Selfies in the wild serve as an excellent data resource to democratize skin quality assessment, but suffer from several data collection challenges. The key to guaranteeing an ac…

    Submitted 30 March, 2022; originally announced March 2022.

    Comments: Accepted by the conference of Electronic Imaging (EI) 2022

  34. arXiv:2110.11391  [pdf, other]

    cs.CV

    DEX: Domain Embedding Expansion for Generalized Person Re-identification

    Authors: Eugene P. W. Ang, Lin Shan, Alex C. Kot

    Abstract: In recent years, supervised Person Re-identification (Person ReID) approaches have demonstrated excellent performance. However, when these methods are applied to inputs from a different camera network, they typically suffer from significant performance degradation. Different from most domain adaptation (DA) approaches addressing this issue, we focus on developing a domain generalization (DG) Perso…

    Submitted 21 October, 2021; originally announced October 2021.

    Comments: Accepted into BMVC 2021

  35. arXiv:2110.09108  [pdf, other]

    cs.CV

    Asymmetric Modality Translation For Face Presentation Attack Detection

    Authors: Zhi Li, Haoliang Li, Xin Luo, Yongjian Hu, Kwok-Yan Lam, Alex C. Kot

    Abstract: Face presentation attack detection (PAD) is an essential measure to protect face recognition systems from being spoofed by malicious users and has attracted great attention from both academia and industry. Although most of the existing methods can achieve desired performance to some extent, the generalization issue of face presentation attack detection under cross-domain settings (e.g., the settin…

    Submitted 20 October, 2021; v1 submitted 18 October, 2021; originally announced October 2021.

  36. Learning Meta Pattern for Face Anti-Spoofing

    Authors: Rizhao Cai, Zhi Li, Renjie Wan, Haoliang Li, Yongjian Hu, Alex Chichung Kot

    Abstract: Face Anti-Spoofing (FAS) is essential to secure face recognition systems and has been extensively studied in recent years. Although deep neural networks (DNNs) for the FAS task have achieved promising results in intra-dataset experiments with similar distributions of training and testing data, the DNNs' generalization ability is limited under the cross-domain scenarios with different distributions…

    Submitted 17 May, 2022; v1 submitted 13 October, 2021; originally announced October 2021.

    Comments: Accepted by IEEE Transactions on Information Forensics and Security (https://ieeexplore.ieee.org.remotexs.ntu.edu.sg/document/9732458) Source code available in https://github.com/RizhaoCai/MetaPattern_FAS

    Journal ref: IEEE Transactions on Information Forensics and Security, vol. 17, pp. 1201-1213, 2022

  37. arXiv:2109.12548  [pdf, other]

    cs.CV

    Disentangled Feature Representation for Few-shot Image Classification

    Authors: Hao Cheng, Yufei Wang, Haoliang Li, Alex C. Kot, Bihan Wen

    Abstract: Learning the generalizable feature representation is critical for few-shot image classification. While recent works exploited task-specific feature embedding using meta-tasks for few-shot learning, they are limited in many challenging tasks as being distracted by the excursive features such as the background, domain and style of the image samples. In this work, we propose a novel Disentangled Feat…

    Submitted 26 September, 2021; originally announced September 2021.

  38. arXiv:2109.05923  [pdf, other]

    eess.IV cs.CV

    Low-Light Image Enhancement with Normalizing Flow

    Authors: Yufei Wang, Renjie Wan, Wenhan Yang, Haoliang Li, Lap-Pui Chau, Alex C. Kot

    Abstract: To enhance low-light images to normally-exposed ones is highly ill-posed, namely that the mapping relationship between them is one-to-many. Previous works based on the pixel-wise reconstruction losses and deterministic processes fail to capture the complex conditional distribution of normally exposed images, which results in improper brightness, residual noise, and artifacts. In this paper, we inv…

    Submitted 13 September, 2021; originally announced September 2021.

  39. arXiv:2109.05826  [pdf, other]

    cs.CV

    Variational Disentanglement for Domain Generalization

    Authors: Yufei Wang, Haoliang Li, Hao Cheng, Bihan Wen, Lap-Pui Chau, Alex C. Kot

    Abstract: Domain generalization aims to learn an invariant model that can generalize well to the unseen target domain. In this paper, we propose to tackle the problem of domain generalization by delivering an effective framework named Variational Disentanglement Network (VDN), which is capable of disentangling the domain-specific features and task-specific features, where the task-specific features are expe…

    Submitted 16 May, 2023; v1 submitted 13 September, 2021; originally announced September 2021.

    Comments: Accepted to TMLR 2022

  40. arXiv:2108.01959  [pdf, other]

    cs.CV

    Skeleton Cloud Colorization for Unsupervised 3D Action Representation Learning

    Authors: Siyuan Yang, Jun Liu, Shijian Lu, Meng Hwa Er, Alex C. Kot

    Abstract: Skeleton-based human action recognition has attracted increasing attention in recent years. However, most of the existing works focus on supervised learning which requires a large number of annotated action sequences that are often expensive to collect. We investigate unsupervised representation learning for skeleton action recognition, and design a novel skeleton cloud colorization technique tha…

    Submitted 9 August, 2021; v1 submitted 4 August, 2021; originally announced August 2021.

    Comments: This paper is accepted by ICCV2021

  41. arXiv:2107.02629  [pdf, other]

    cs.CV cs.AI

    Embracing the Dark Knowledge: Domain Generalization Using Regularized Knowledge Distillation

    Authors: Yufei Wang, Haoliang Li, Lap-pui Chau, Alex C. Kot

    Abstract: Though convolutional neural networks are widely used in different tasks, lack of generalization capability in the absence of sufficient and representative data is one of the challenges that hinder their practical application. In this paper, we propose a simple, effective, and plug-and-play training strategy named Knowledge Distillation for Domain Generalization (KDDG) which is built upon a knowled…

    Submitted 6 July, 2021; originally announced July 2021.

    Comments: Accepted by ACM MM, 2021

  42. arXiv:2101.10390  [pdf, other]

    cs.LG q-bio.PE

    Introducing a Central African Primate Vocalisation Dataset for Automated Species Classification

    Authors: Joeri A. Zwerts, Jelle Treep, Casper S. Kaandorp, Floor Meewis, Amparo C. Koot, Heysem Kaya

    Abstract: Automated classification of animal vocalisations is a potentially powerful wildlife monitoring tool. Training robust classifiers requires sizable annotated datasets, which are not easily recorded in the wild. To circumvent this problem, we recorded four primate species under semi-natural conditions in a wildlife sanctuary in Cameroon with the objective to train a classifier capable of detecting sp…

    Submitted 25 January, 2021; originally announced January 2021.

    Comments: 5 pages, 3 figures, 2 tables

  43. Multi-Domain Adversarial Feature Generalization for Person Re-Identification

    Authors: Shan Lin, Chang-Tsun Li, Alex C. Kot

    Abstract: With the assistance of sophisticated training methods applied to single labeled datasets, the performance of fully-supervised person re-identification (Person Re-ID) has been improved significantly in recent years. However, these models trained on a single dataset usually suffer from considerable performance degradation when applied to videos of a different camera network. To make Person Re-ID sys…

    Submitted 25 November, 2020; originally announced November 2020.

    Comments: TIP (Accept with Mandatory Minor Revisions)

  44. Skeleton-based Relational Reasoning for Group Activity Analysis

    Authors: Mauricio Perez, Jun Liu, Alex C. Kot

    Abstract: Research on group activity recognition mostly leans on the standard two-stream approach (RGB and Optical Flow) for its input features. Few have explored explicit pose information, with none using it directly to reason about the persons' interactions. In this paper, we leverage the skeleton information to learn the interactions between the individuals straight from it. With our proposed method GIRN…

    Submitted 7 October, 2021; v1 submitted 11 November, 2020; originally announced November 2020.

    Comments: 26 pages, 5 figures, accepted manuscript in Elsevier Pattern Recognition, minor writing revisions and new references

  45. arXiv:2009.12829  [pdf, other]

    cs.CV cs.LG eess.IV

    Domain Generalization for Medical Imaging Classification with Linear-Dependency Regularization

    Authors: Haoliang Li, YuFei Wang, Renjie Wan, Shiqi Wang, Tie-Qiang Li, Alex C. Kot

    Abstract: Recently, we have witnessed great progress in the field of medical imaging classification by adopting deep neural networks. However, the recent advanced models still require accessing sufficiently large and representative datasets for training, which is often unfeasible in clinically realistic environments. When trained on limited datasets, the deep neural network lacks generalization capabil…

    Submitted 29 October, 2020; v1 submitted 27 September, 2020; originally announced September 2020.

    Comments: Accepted by NeurIPS, 2020

  46. DRL-FAS: A Novel Framework Based on Deep Reinforcement Learning for Face Anti-Spoofing

    Authors: Rizhao Cai, Haoliang Li, Shiqi Wang, Changsheng Chen, Alex Chichung Kot

    Abstract: Inspired by the philosophy employed by human beings to determine whether a presented face example is genuine or not, i.e., to glance at the example globally first and then carefully observe the local regions to gain more discriminative information, for the face anti-spoofing problem, we propose a novel framework based on the Convolutional Neural Network (CNN) and the Recurrent Neural Network (RNN)…

    Submitted 18 September, 2020; v1 submitted 16 September, 2020; originally announced September 2020.

    Comments: Accepted by IEEE Transactions on Information Forensics and Security. Code will be released soon

  47. arXiv:2009.06996  [pdf, other]

    cs.CR cs.AI cs.CV

    Light Can Hack Your Face! Black-box Backdoor Attack on Face Recognition Systems

    Authors: Haoliang Li, Yufei Wang, Xiaofei Xie, Yang Liu, Shiqi Wang, Renjie Wan, Lap-Pui Chau, Alex C. Kot

    Abstract: Deep neural networks (DNNs) have shown great success in many computer vision applications. However, they are also known to be susceptible to backdoor attacks. When conducting backdoor attacks, most of the existing approaches assume that the targeted DNN is always available, and an attacker can always inject a specific pattern into the training data to further fine-tune the DNN model. However, in prac…

    Submitted 15 September, 2020; originally announced September 2020.

    Comments: First two authors contributed equally

  48. Heterogeneous Domain Generalization via Domain Mixup

    Authors: Yufei Wang, Haoliang Li, Alex C. Kot

    Abstract: One of the main drawbacks of deep Convolutional Neural Networks (DCNN) is that they lack generalization capability. In this work, we focus on the problem of heterogeneous domain generalization, which aims to improve the generalization capability across different tasks, that is, how to learn a DCNN model with multiple domain data such that the trained feature extractor can be generalized to support…

    Submitted 11 September, 2020; originally announced September 2020.

  49. arXiv:2003.07177  [pdf, other]

    cs.CV

    Refinements in Motion and Appearance for Online Multi-Object Tracking

    Authors: Piao Huang, Shoudong Han, Jun Zhao, Donghaisheng Liu, Hongwei Wang, En Yu, Alex ChiChung Kot

    Abstract: Modern multi-object tracking (MOT) systems usually involve separate modules, such as a motion model for location and an appearance model for data association. However, the compatibility problems within both motion and appearance models are always ignored. In this paper, a general architecture named MIF is presented by seamlessly blending the Motion integration, three-dimensional (3D) Integral image and…

    Submitted 17 March, 2020; v1 submitted 16 March, 2020; originally announced March 2020.

  50. arXiv:1910.04963  [pdf, other]

    cs.CV

    Interaction Relational Network for Mutual Action Recognition

    Authors: Mauricio Perez, Jun Liu, Alex C. Kot

    Abstract: Person-person mutual action recognition (also referred to as interaction recognition) is an important research branch of human activity analysis. Current solutions in the field -- mainly dominated by CNNs, GCNs and LSTMs -- often consist of complicated architectures and mechanisms to embed the relationships between the two persons on the architecture itself, to ensure the interaction patterns can…

    Submitted 7 January, 2021; v1 submitted 11 October, 2019; originally announced October 2019.

    Comments: 12 pages, 6 figures, to be published in IEEE TMM