subscribe to arXiv mailings

Waterfall: Framework for Robust and Scalable Text Watermarking

Authors: Gregory Kang Ruey Lau, Xinyuan Niu, Hieu Dao, Jiangwei Chen, Chuan-Sheng Foo, Bryan Kian Hsiang Low

Abstract: Protecting intellectual property (IP) of text such as articles and code is increasingly important, especially as sophisticated attacks become possible, such as paraphrasing by large language models (LLMs) or even unauthorized training of LLMs on copyrighted text to infringe such IP. However, existing text watermarking methods are not robust enough against such attacks nor scalable to millions of u… ▽ More Protecting intellectual property (IP) of text such as articles and code is increasingly important, especially as sophisticated attacks become possible, such as paraphrasing by large language models (LLMs) or even unauthorized training of LLMs on copyrighted text to infringe such IP. However, existing text watermarking methods are not robust enough against such attacks nor scalable to millions of users for practical implementation. In this paper, we propose Waterfall, the first training-free framework for robust and scalable text watermarking applicable across multiple text types (e.g., articles, code) and languages supportable by LLMs, for general text and LLM data provenance. Waterfall comprises several key innovations, such as being the first to use LLM as paraphrasers for watermarking along with a novel combination of techniques that are surprisingly effective in achieving robust verifiability and scalability. We empirically demonstrate that Waterfall achieves significantly better scalability, robust verifiability, and computational efficiency compared to SOTA article-text watermarking methods, and also showed how it could be directly applied to the watermarking of code. △ Less

Submitted 5 July, 2024; originally announced July 2024.

arXiv:2406.04606 [pdf, other]

Helpful or Harmful Data? Fine-tuning-free Shapley Attribution for Explaining Language Model Predictions

Authors: Jingtan Wang, Xiaoqiang Lin, Rui Qiao, Chuan-Sheng Foo, Bryan Kian Hsiang Low

Abstract: The increasing complexity of foundational models underscores the necessity for explainability, particularly for fine-tuning, the most widely used training method for adapting models to downstream tasks. Instance attribution, one type of explanation, attributes the model prediction to each training example by an instance score. However, the robustness of instance scores, specifically towards datase… ▽ More The increasing complexity of foundational models underscores the necessity for explainability, particularly for fine-tuning, the most widely used training method for adapting models to downstream tasks. Instance attribution, one type of explanation, attributes the model prediction to each training example by an instance score. However, the robustness of instance scores, specifically towards dataset resampling, has been overlooked. To bridge this gap, we propose a notion of robustness on the sign of the instance score. We theoretically and empirically demonstrate that the popular leave-one-out-based methods lack robustness, while the Shapley value behaves significantly better, but at a higher computational cost. Accordingly, we introduce an efficient fine-tuning-free approximation of the Shapley value (FreeShap) for instance attribution based on the neural tangent kernel. We empirically demonstrate that FreeShap outperforms other methods for instance attribution and other data-centric applications such as data removal, data selection, and wrong label detection, and further generalize our scale to large language models (LLMs). Our code is available at https://github.com/JTWang2000/FreeShap. △ Less

Submitted 6 June, 2024; originally announced June 2024.

Comments: Accepted to ICML 2024

arXiv:2406.02635 [pdf, other]

Evidentially Calibrated Source-Free Time-Series Domain Adaptation with Temporal Imputation

Authors: Mohamed Ragab, Peiliang Gong, Emadeldeen Eldele, Wenyu Zhang, Min Wu, Chuan-Sheng Foo, Daoqiang Zhang, Xiaoli Li, Zhenghua Chen

Abstract: Source-free domain adaptation (SFDA) aims to adapt a model pre-trained on a labeled source domain to an unlabeled target domain without access to source data, preserving the source domain's privacy. While SFDA is prevalent in computer vision, it remains largely unexplored in time series analysis. Existing SFDA methods, designed for visual data, struggle to capture the inherent temporal dynamics of… ▽ More Source-free domain adaptation (SFDA) aims to adapt a model pre-trained on a labeled source domain to an unlabeled target domain without access to source data, preserving the source domain's privacy. While SFDA is prevalent in computer vision, it remains largely unexplored in time series analysis. Existing SFDA methods, designed for visual data, struggle to capture the inherent temporal dynamics of time series, hindering adaptation performance. This paper proposes MAsk And imPUte (MAPU), a novel and effective approach for time series SFDA. MAPU addresses the critical challenge of temporal consistency by introducing a novel temporal imputation task. This task involves randomly masking time series signals and leveraging a dedicated temporal imputer to recover the original signal within the learned embedding space, bypassing the complexities of noisy raw data. Notably, MAPU is the first method to explicitly address temporal consistency in the context of time series SFDA. Additionally, it offers seamless integration with existing SFDA methods, providing greater flexibility. We further introduce E-MAPU, which incorporates evidential uncertainty estimation to address the overconfidence issue inherent in softmax predictions. To achieve that, we leverage evidential deep learning to obtain a better-calibrated pre-trained model and adapt the target encoder to map out-of-support target samples to a new feature representation closer to the source domain's support. This fosters better alignment, ultimately enhancing adaptation performance. Extensive experiments on five real-world time series datasets demonstrate that both MAPU and E-MAPU achieve significant performance gains compared to existing methods. These results highlight the effectiveness of our proposed approaches for tackling various time series domain adaptation problems. △ Less

Submitted 12 June, 2024; v1 submitted 4 June, 2024; originally announced June 2024.

arXiv:2405.02954 [pdf, other]

Source-Free Domain Adaptation Guided by Vision and Vision-Language Pre-Training

Authors: Wenyu Zhang, Li Shen, Chuan-Sheng Foo

Abstract: Source-free domain adaptation (SFDA) aims to adapt a source model trained on a fully-labeled source domain to a related but unlabeled target domain. While the source model is a key avenue for acquiring target pseudolabels, the generated pseudolabels may exhibit source bias. In the conventional SFDA pipeline, a large data (e.g. ImageNet) pre-trained feature extractor is used to initialize the sourc… ▽ More Source-free domain adaptation (SFDA) aims to adapt a source model trained on a fully-labeled source domain to a related but unlabeled target domain. While the source model is a key avenue for acquiring target pseudolabels, the generated pseudolabels may exhibit source bias. In the conventional SFDA pipeline, a large data (e.g. ImageNet) pre-trained feature extractor is used to initialize the source model at the start of source training, and subsequently discarded. Despite having diverse features important for generalization, the pre-trained feature extractor can overfit to the source data distribution during source training and forget relevant target domain knowledge. Rather than discarding this valuable knowledge, we introduce an integrated framework to incorporate pre-trained networks into the target adaptation process. The proposed framework is flexible and allows us to plug modern pre-trained networks into the adaptation process to leverage their stronger representation learning capabilities. For adaptation, we propose the Co-learn algorithm to improve target pseudolabel quality collaboratively through the source model and a pre-trained feature extractor. Building on the recent success of the vision-language model CLIP in zero-shot image recognition, we present an extension Co-learn++ to further incorporate CLIP's zero-shot classification decisions. We evaluate on 3 benchmark datasets and include more challenging scenarios such as open-set, partial-set and open-partial SFDA. Experimental results demonstrate that our proposed strategy improves adaptation performance and can be successfully integrated with existing SFDA methods. △ Less

Submitted 5 May, 2024; originally announced May 2024.

Comments: Extension of ICCV paper arXiv:2212.07585, submitted to IJCV

arXiv:2404.11151 [pdf, other]

REACTO: Reconstructing Articulated Objects from a Single Video

Authors: Chaoyue Song, Jiacheng Wei, Chuan-Sheng Foo, Guosheng Lin, Fayao Liu

Abstract: In this paper, we address the challenge of reconstructing general articulated 3D objects from a single video. Existing works employing dynamic neural radiance fields have advanced the modeling of articulated objects like humans and animals from videos, but face challenges with piece-wise rigid general articulated objects due to limitations in their deformation models. To tackle this, we propose Qu… ▽ More In this paper, we address the challenge of reconstructing general articulated 3D objects from a single video. Existing works employing dynamic neural radiance fields have advanced the modeling of articulated objects like humans and animals from videos, but face challenges with piece-wise rigid general articulated objects due to limitations in their deformation models. To tackle this, we propose Quasi-Rigid Blend Skinning, a novel deformation model that enhances the rigidity of each part while maintaining flexible deformation of the joints. Our primary insight combines three distinct approaches: 1) an enhanced bone rigging system for improved component modeling, 2) the use of quasi-sparse skinning weights to boost part rigidity and reconstruction fidelity, and 3) the application of geodesic point assignment for precise motion and seamless deformation. Our method outperforms previous works in producing higher-fidelity 3D reconstructions of general articulated objects, as demonstrated on both real and synthetic datasets. Project page: https://chaoyuesong.github.io/REACTO. △ Less

Submitted 17 April, 2024; originally announced April 2024.

arXiv:2403.18423 [pdf, other]

SemRoDe: Macro Adversarial Training to Learn Representations That are Robust to Word-Level Attacks

Authors: Brian Formento, Wenjie Feng, Chuan Sheng Foo, Luu Anh Tuan, See-Kiong Ng

Abstract: Language models (LMs) are indispensable tools for natural language processing tasks, but their vulnerability to adversarial attacks remains a concern. While current research has explored adversarial training techniques, their improvements to defend against word-level attacks have been limited. In this work, we propose a novel approach called Semantic Robust Defence (SemRoDe), a Macro Adversarial T… ▽ More Language models (LMs) are indispensable tools for natural language processing tasks, but their vulnerability to adversarial attacks remains a concern. While current research has explored adversarial training techniques, their improvements to defend against word-level attacks have been limited. In this work, we propose a novel approach called Semantic Robust Defence (SemRoDe), a Macro Adversarial Training strategy to enhance the robustness of LMs. Drawing inspiration from recent studies in the image domain, we investigate and later confirm that in a discrete data setting such as language, adversarial samples generated via word substitutions do indeed belong to an adversarial domain exhibiting a high Wasserstein distance from the base domain. Our method learns a robust representation that bridges these two domains. We hypothesize that if samples were not projected into an adversarial domain, but instead to a domain with minimal shift, it would improve attack robustness. We align the domains by incorporating a new distance-based objective. With this, our model is able to learn more generalized representations by aligning the model's high-level output features and therefore better handling unseen adversarial samples. This method can be generalized across word embeddings, even when they share minimal overlap at both vocabulary and word-substitution levels. To evaluate the effectiveness of our approach, we conduct experiments on BERT and RoBERTa models on three datasets. The results demonstrate promising state-of-the-art robustness. △ Less

Submitted 27 March, 2024; originally announced March 2024.

Comments: Published in NAACL 2024 (Main Track)

arXiv:2403.11234 [pdf, other]

Universal Semi-Supervised Domain Adaptation by Mitigating Common-Class Bias

Authors: Wenyu Zhang, Qingmu Liu, Felix Ong Wei Cong, Mohamed Ragab, Chuan-Sheng Foo

Abstract: Domain adaptation is a critical task in machine learning that aims to improve model performance on a target domain by leveraging knowledge from a related source domain. In this work, we introduce Universal Semi-Supervised Domain Adaptation (UniSSDA), a practical yet challenging setting where the target domain is partially labeled, and the source and target label space may not strictly match. UniSS… ▽ More Domain adaptation is a critical task in machine learning that aims to improve model performance on a target domain by leveraging knowledge from a related source domain. In this work, we introduce Universal Semi-Supervised Domain Adaptation (UniSSDA), a practical yet challenging setting where the target domain is partially labeled, and the source and target label space may not strictly match. UniSSDA is at the intersection of Universal Domain Adaptation (UniDA) and Semi-Supervised Domain Adaptation (SSDA): the UniDA setting does not allow for fine-grained categorization of target private classes not represented in the source domain, while SSDA focuses on the restricted closed-set setting where source and target label spaces match exactly. Existing UniDA and SSDA methods are susceptible to common-class bias in UniSSDA settings, where models overfit to data distributions of classes common to both domains at the expense of private classes. We propose a new prior-guided pseudo-label refinement strategy to reduce the reinforcement of common-class bias due to pseudo-labeling, a common label propagation strategy in domain adaptation. We demonstrate the effectiveness of the proposed strategy on benchmark datasets Office-Home, DomainNet, and VisDA. The proposed strategy attains the best performance across UniSSDA adaptation settings and establishes a new baseline for UniSSDA. △ Less

Submitted 17 March, 2024; originally announced March 2024.

Comments: Accepted to CVPR 2024

arXiv:2403.09140 [pdf, other]

Sculpt3D: Multi-View Consistent Text-to-3D Generation with Sparse 3D Prior

Authors: Cheng Chen, Xiaofeng Yang, Fan Yang, Chengzeng Feng, Zhoujie Fu, Chuan-Sheng Foo, Guosheng Lin, Fayao Liu

Abstract: Recent works on text-to-3d generation show that using only 2D diffusion supervision for 3D generation tends to produce results with inconsistent appearances (e.g., faces on the back view) and inaccurate shapes (e.g., animals with extra legs). Existing methods mainly address this issue by retraining diffusion models with images rendered from 3D data to ensure multi-view consistency while struggling… ▽ More Recent works on text-to-3d generation show that using only 2D diffusion supervision for 3D generation tends to produce results with inconsistent appearances (e.g., faces on the back view) and inaccurate shapes (e.g., animals with extra legs). Existing methods mainly address this issue by retraining diffusion models with images rendered from 3D data to ensure multi-view consistency while struggling to balance 2D generation quality with 3D consistency. In this paper, we present a new framework Sculpt3D that equips the current pipeline with explicit injection of 3D priors from retrieved reference objects without re-training the 2D diffusion model. Specifically, we demonstrate that high-quality and diverse 3D geometry can be guaranteed by keypoints supervision through a sparse ray sampling approach. Moreover, to ensure accurate appearances of different views, we further modulate the output of the 2D diffusion model to the correct patterns of the template views without altering the generated object's style. These two decoupled designs effectively harness 3D information from reference objects to generate 3D objects while preserving the generation quality of the 2D diffusion model. Extensive experiments show our method can largely improve the multi-view consistency while retaining fidelity and diversity. Our project page is available at: https://stellarcheng.github.io/Sculpt3D/. △ Less

Submitted 14 March, 2024; originally announced March 2024.

Comments: Accepted by CVPR 2024. Project Page: https://stellarcheng.github.io/Sculpt3D/

arXiv:2402.19197 [pdf, other]

doi 10.1609/aaai.v38i2.27856

Fine Structure-Aware Sampling: A New Sampling Training Scheme for Pixel-Aligned Implicit Models in Single-View Human Reconstruction

Authors: Kennard Yanting Chan, Fayao Liu, Guosheng Lin, Chuan Sheng Foo, Weisi Lin

Abstract: Pixel-aligned implicit models, such as PIFu, PIFuHD, and ICON, are used for single-view clothed human reconstruction. These models need to be trained using a sampling training scheme. Existing sampling training schemes either fail to capture thin surfaces (e.g. ears, fingers) or cause noisy artefacts in reconstructed meshes. To address these problems, we introduce Fine Structured-Aware Sampling (F… ▽ More Pixel-aligned implicit models, such as PIFu, PIFuHD, and ICON, are used for single-view clothed human reconstruction. These models need to be trained using a sampling training scheme. Existing sampling training schemes either fail to capture thin surfaces (e.g. ears, fingers) or cause noisy artefacts in reconstructed meshes. To address these problems, we introduce Fine Structured-Aware Sampling (FSS), a new sampling training scheme to train pixel-aligned implicit models for single-view human reconstruction. FSS resolves the aforementioned problems by proactively adapting to the thickness and complexity of surfaces. In addition, unlike existing sampling training schemes, FSS shows how normals of sample points can be capitalized in the training process to improve results. Lastly, to further improve the training process, FSS proposes a mesh thickness loss signal for pixel-aligned implicit models. It becomes computationally feasible to introduce this loss once a slight reworking of the pixel-aligned implicit function framework is carried out. Our results show that our methods significantly outperform SOTA methods qualitatively and quantitatively. Our code is publicly available at https://github.com/kcyt/FSS. △ Less

Submitted 29 February, 2024; originally announced February 2024.

Comments: Accepted in Proceedings of the AAAI Conference on Artificial Intelligence, 2024 (AAAI 2024)

Journal ref: Proceedings of the AAAI Conference on Artificial Intelligence, 2024, pp. 964-971

arXiv:2402.18998 [pdf, other]

COFT-AD: COntrastive Fine-Tuning for Few-Shot Anomaly Detection

Authors: Jingyi Liao, Xun Xu, Manh Cuong Nguyen, Adam Goodge, Chuan Sheng Foo

Abstract: Existing approaches towards anomaly detection~(AD) often rely on a substantial amount of anomaly-free data to train representation and density models. However, large anomaly-free datasets may not always be available before the inference stage; in which case an anomaly detection model must be trained with only a handful of normal samples, a.k.a. few-shot anomaly detection (FSAD). In this paper, we… ▽ More Existing approaches towards anomaly detection~(AD) often rely on a substantial amount of anomaly-free data to train representation and density models. However, large anomaly-free datasets may not always be available before the inference stage; in which case an anomaly detection model must be trained with only a handful of normal samples, a.k.a. few-shot anomaly detection (FSAD). In this paper, we propose a novel methodology to address the challenge of FSAD which incorporates two important techniques. Firstly, we employ a model pre-trained on a large source dataset to initialize model weights. Secondly, to ameliorate the covariate shift between source and target domains, we adopt contrastive training to fine-tune on the few-shot target domain data. To learn suitable representations for the downstream AD task, we additionally incorporate cross-instance positive pairs to encourage a tight cluster of the normal samples, and negative pairs for better separation between normal and synthesized negative samples. We evaluate few-shot anomaly detection on on 3 controlled AD tasks and 4 real-world AD tasks to demonstrate the effectiveness of the proposed method. △ Less

Submitted 29 February, 2024; originally announced February 2024.

Comments: IEEE Transactions on Image Processing

arXiv:2310.00646 [pdf, other]

WASA: WAtermark-based Source Attribution for Large Language Model-Generated Data

Authors: Jingtan Wang, Xinyang Lu, Zitong Zhao, Zhongxiang Dai, Chuan-Sheng Foo, See-Kiong Ng, Bryan Kian Hsiang Low

Abstract: The impressive performances of large language models (LLMs) and their immense potential for commercialization have given rise to serious concerns over the intellectual property (IP) of their training data. In particular, the synthetic texts generated by LLMs may infringe the IP of the data being used to train the LLMs. To this end, it is imperative to be able to (a) identify the data provider who… ▽ More The impressive performances of large language models (LLMs) and their immense potential for commercialization have given rise to serious concerns over the intellectual property (IP) of their training data. In particular, the synthetic texts generated by LLMs may infringe the IP of the data being used to train the LLMs. To this end, it is imperative to be able to (a) identify the data provider who contributed to the generation of a synthetic text by an LLM (source attribution) and (b) verify whether the text data from a data provider has been used to train an LLM (data provenance). In this paper, we show that both problems can be solved by watermarking, i.e., by enabling an LLM to generate synthetic texts with embedded watermarks that contain information about their source(s). We identify the key properties of such watermarking frameworks (e.g., source attribution accuracy, robustness against adversaries), and propose a WAtermarking for Source Attribution (WASA) framework that satisfies these key properties due to our algorithmic designs. Our WASA framework enables an LLM to learn an accurate mapping from the texts of different data providers to their corresponding unique watermarks, which sets the foundation for effective source attribution (and hence data provenance). Extensive empirical evaluations show that our WASA framework achieves effective source attribution and data provenance. △ Less

Submitted 1 October, 2023; originally announced October 2023.

arXiv:2308.02488 [pdf, other]

Recovering non-Maxwellian particle velocity distribution functions from collective Thomson-scattered spectra

Authors: Bryan C. Foo, Derek B. Schaeffer, Peter V. Heuer

Abstract: Collective optical Thomson scattering (TS) is a diagnostic commonly used to characterize plasma parameters. These parameters are typically extracted by a fitting algorithm that minimizes the difference between a measured scattered spectrum and an analytic spectrum calculated from the velocity distribution function (VDF) of the plasma. However, most existing TS analysis algorithms assume the VDFs a… ▽ More Collective optical Thomson scattering (TS) is a diagnostic commonly used to characterize plasma parameters. These parameters are typically extracted by a fitting algorithm that minimizes the difference between a measured scattered spectrum and an analytic spectrum calculated from the velocity distribution function (VDF) of the plasma. However, most existing TS analysis algorithms assume the VDFs are Maxwellian, and applying an algorithm which makes this assumption does not accurately extract the plasma parameters of a non-Maxwellian plasma due to the effect of non-Maxwellian deviations on the TS spectra. We present new open-source numerical tools for forward modeling analytic spectra from arbitrary VDFs, and show that these tools are able to more accurately extract plasma parameters from synthetic TS spectra generated by non-Maxwellian VDFs compared to standard TS algorithms. Estimated posterior probability distributions of fits to synthetic spectra for a variety of example non-Maxwellian VDFs are used to determine uncertainties in the extracted plasma parameters, and show that correlations between parameters can significantly affect the accuracy of fits in plasmas with non-Maxwellian VDFs. △ Less

Submitted 4 August, 2023; originally announced August 2023.

arXiv:2307.07542 [pdf, other]

Source-Free Domain Adaptation with Temporal Imputation for Time Series Data

Authors: Mohamed Ragab, Emadeldeen Eldele, Min Wu, Chuan-Sheng Foo, Xiaoli Li, Zhenghua Chen

Abstract: Source-free domain adaptation (SFDA) aims to adapt a pretrained model from a labeled source domain to an unlabeled target domain without access to the source domain data, preserving source domain privacy. Despite its prevalence in visual applications, SFDA is largely unexplored in time series applications. The existing SFDA methods that are mainly designed for visual applications may fail to handl… ▽ More Source-free domain adaptation (SFDA) aims to adapt a pretrained model from a labeled source domain to an unlabeled target domain without access to the source domain data, preserving source domain privacy. Despite its prevalence in visual applications, SFDA is largely unexplored in time series applications. The existing SFDA methods that are mainly designed for visual applications may fail to handle the temporal dynamics in time series, leading to impaired adaptation performance. To address this challenge, this paper presents a simple yet effective approach for source-free domain adaptation on time series data, namely MAsk and imPUte (MAPU). First, to capture temporal information of the source domain, our method performs random masking on the time series signals while leveraging a novel temporal imputer to recover the original signal from a masked version in the embedding space. Second, in the adaptation step, the imputer network is leveraged to guide the target model to produce target features that are temporally consistent with the source features. To this end, our MAPU can explicitly account for temporal dependency during the adaptation while avoiding the imputation in the noisy input space. Our method is the first to handle temporal consistency in SFDA for time series data and can be seamlessly equipped with other existing SFDA methods. Extensive experiments conducted on three real-world time series datasets demonstrate that our MAPU achieves significant performance gain over existing methods. Our code is available at \url{https://github.com/mohamedr002/MAPU_SFDA_TS}. △ Less

Submitted 14 July, 2023; originally announced July 2023.

Comments: Accepted in KDD'23

arXiv:2307.07489 [pdf, other]

PseudoCal: A Source-Free Approach to Unsupervised Uncertainty Calibration in Domain Adaptation

Authors: Dapeng Hu, Jian Liang, Xinchao Wang, Chuan-Sheng Foo

Abstract: Unsupervised domain adaptation (UDA) has witnessed remarkable advancements in improving the accuracy of models for unlabeled target domains. However, the calibration of predictive uncertainty in the target domain, a crucial aspect of the safe deployment of UDA models, has received limited attention. The conventional in-domain calibration method, \textit{temperature scaling} (TempScal), encounters… ▽ More Unsupervised domain adaptation (UDA) has witnessed remarkable advancements in improving the accuracy of models for unlabeled target domains. However, the calibration of predictive uncertainty in the target domain, a crucial aspect of the safe deployment of UDA models, has received limited attention. The conventional in-domain calibration method, \textit{temperature scaling} (TempScal), encounters challenges due to domain distribution shifts and the absence of labeled target domain data. Recent approaches have employed importance-weighting techniques to estimate the target-optimal temperature based on re-weighted labeled source data. Nonetheless, these methods require source data and suffer from unreliable density estimates under severe domain shifts, rendering them unsuitable for source-free UDA settings. To overcome these limitations, we propose PseudoCal, a source-free calibration method that exclusively relies on unlabeled target data. Unlike previous approaches that treat UDA calibration as a \textit{covariate shift} problem, we consider it as an unsupervised calibration problem specific to the target domain. Motivated by the factorization of the negative log-likelihood (NLL) objective in TempScal, we generate a labeled pseudo-target set that captures the structure of the real target. By doing so, we transform the unsupervised calibration problem into a supervised one, enabling us to effectively address it using widely-used in-domain methods like TempScal. Finally, we thoroughly evaluate the calibration performance of PseudoCal by conducting extensive experiments on 10 UDA methods, considering both traditional UDA settings and recent source-free UDA scenarios. The experimental results consistently demonstrate the superior performance of PseudoCal, exhibiting significantly reduced calibration error compared to existing calibration methods. △ Less

Submitted 14 July, 2023; originally announced July 2023.

arXiv:2306.05764 [pdf, other]

Fair yet Asymptotically Equal Collaborative Learning

Authors: Xiaoqiang Lin, Xinyi Xu, See-Kiong Ng, Chuan-Sheng Foo, Bryan Kian Hsiang Low

Abstract: In collaborative learning with streaming data, nodes (e.g., organizations) jointly and continuously learn a machine learning (ML) model by sharing the latest model updates computed from their latest streaming data. For the more resourceful nodes to be willing to share their model updates, they need to be fairly incentivized. This paper explores an incentive design that guarantees fairness so that… ▽ More In collaborative learning with streaming data, nodes (e.g., organizations) jointly and continuously learn a machine learning (ML) model by sharing the latest model updates computed from their latest streaming data. For the more resourceful nodes to be willing to share their model updates, they need to be fairly incentivized. This paper explores an incentive design that guarantees fairness so that nodes receive rewards commensurate to their contributions. Our approach leverages an explore-then-exploit formulation to estimate the nodes' contributions (i.e., exploration) for realizing our theoretically guaranteed fair incentives (i.e., exploitation). However, we observe a "rich get richer" phenomenon arising from the existing approaches to guarantee fairness and it discourages the participation of the less resourceful nodes. To remedy this, we additionally preserve asymptotic equality, i.e., less resourceful nodes achieve equal performance eventually to the more resourceful/"rich" nodes. We empirically demonstrate in two settings with real-world streaming data: federated online incremental learning and federated reinforcement learning, that our proposed approach outperforms existing baselines in fairness and learning performance while remaining competitive in preserving equality. △ Less

Submitted 9 June, 2023; originally announced June 2023.

Comments: Accepted to 40th International Conference on Machine Learning (ICML 2023), 37 pages

arXiv:2305.18080 [pdf, other]

doi 10.1103/PhysRevResearch.6.013165

Extended magic phase in twisted graphene multilayers

Authors: D. C. W. Foo, Z. Zhan, Mohammed M. Al Ezzi, L. Peng, S. Adam, F. Guinea

Abstract: Theoretical and experimental studies have verified the existence of ``magic angles'' in twisted bilayer graphene, where the twist between layers gives rise to flat bands and consequently highly correlated phases. Narrow bands can also exist in multilayers with alternating twist angles, and recent theoretical work suggests that they can also be found in trilayers with twist angles between neighbori… ▽ More Theoretical and experimental studies have verified the existence of ``magic angles'' in twisted bilayer graphene, where the twist between layers gives rise to flat bands and consequently highly correlated phases. Narrow bands can also exist in multilayers with alternating twist angles, and recent theoretical work suggests that they can also be found in trilayers with twist angles between neighboring layers in the same direction. We show here that flat bands exist in a variety of multilayers where the ratio between twist angles is close to coprime integers. We generalize previous analyses, and, using the chiral limit for interlayer coupling, give examples of many combinations of twist angles in stacks made up of three and four layers which lead to flat bands. The technique we use can be extended to systems with many layers. Our results suggest that flat bands can exist in graphene multilayers with angle disorder, that is, narrow samples of turbostatic graphite. △ Less

Submitted 7 June, 2023; v1 submitted 29 May, 2023; originally announced May 2023.

Comments: Main text: 4 pages, 4 figures -- Supplementary Material: 15 pages, 13 figures

Journal ref: Phys. Rev. Research 6, 013165 (2024)

arXiv:2304.08279 [pdf, other]

MoDA: Modeling Deformable 3D Objects from Casual Videos

Authors: Chaoyue Song, Jiacheng Wei, Tianyi Chen, Yiwen Chen, Chuan Sheng Foo, Fayao Liu, Guosheng Lin

Abstract: In this paper, we focus on the challenges of modeling deformable 3D objects from casual videos. With the popularity of neural radiance fields (NeRF), many works extend it to dynamic scenes with a canonical NeRF and a deformation model that achieves 3D point transformation between the observation space and the canonical space. Recent works rely on linear blend skinning (LBS) to achieve the canonica… ▽ More In this paper, we focus on the challenges of modeling deformable 3D objects from casual videos. With the popularity of neural radiance fields (NeRF), many works extend it to dynamic scenes with a canonical NeRF and a deformation model that achieves 3D point transformation between the observation space and the canonical space. Recent works rely on linear blend skinning (LBS) to achieve the canonical-observation transformation. However, the linearly weighted combination of rigid transformation matrices is not guaranteed to be rigid. As a matter of fact, unexpected scale and shear factors often appear. In practice, using LBS as the deformation model can always lead to skin-collapsing artifacts for bending or twisting motions. To solve this problem, we propose neural dual quaternion blend skinning (NeuDBS) to achieve 3D point deformation, which can perform rigid transformation without skin-collapsing artifacts. In the endeavor to register 2D pixels across different frames, we establish a correspondence between canonical feature embeddings that encodes 3D points within the canonical space, and 2D image features by solving an optimal transport problem. Besides, we introduce a texture filtering approach for texture rendering that effectively minimizes the impact of noisy colors outside target deformable objects. Extensive experiments on real and synthetic datasets show that our approach can reconstruct 3D models for humans and animals with better qualitative and quantitative performance than state-of-the-art methods. Project page: \url{https://chaoyuesong.github.io/MoDA}. △ Less

Submitted 19 June, 2024; v1 submitted 17 April, 2023; originally announced April 2023.

arXiv:2303.13724 [pdf, other]

Harmonizing Base and Novel Classes: A Class-Contrastive Approach for Generalized Few-Shot Segmentation

Authors: Weide Liu, Zhonghua Wu, Yang Zhao, Yuming Fang, Chuan-Sheng Foo, Jun Cheng, Guosheng Lin

Abstract: Current methods for few-shot segmentation (FSSeg) have mainly focused on improving the performance of novel classes while neglecting the performance of base classes. To overcome this limitation, the task of generalized few-shot semantic segmentation (GFSSeg) has been introduced, aiming to predict segmentation masks for both base and novel classes. However, the current prototype-based methods do no… ▽ More Current methods for few-shot segmentation (FSSeg) have mainly focused on improving the performance of novel classes while neglecting the performance of base classes. To overcome this limitation, the task of generalized few-shot semantic segmentation (GFSSeg) has been introduced, aiming to predict segmentation masks for both base and novel classes. However, the current prototype-based methods do not explicitly consider the relationship between base and novel classes when updating prototypes, leading to a limited performance in identifying true categories. To address this challenge, we propose a class contrastive loss and a class relationship loss to regulate prototype updates and encourage a large distance between prototypes from different classes, thus distinguishing the classes from each other while maintaining the performance of the base classes. Our proposed approach achieves new state-of-the-art performance for the generalized few-shot segmentation task on PASCAL VOC and MS COCO datasets. △ Less

Submitted 23 March, 2023; originally announced March 2023.

arXiv:2301.13514 [pdf, other]

Fourier Sensitivity and Regularization of Computer Vision Models

Authors: Kiran Krishnamachari, See-Kiong Ng, Chuan-Sheng Foo

Abstract: Recent work has empirically shown that deep neural networks latch on to the Fourier statistics of training data and show increased sensitivity to Fourier-basis directions in the input. Understanding and modifying this Fourier-sensitivity of computer vision models may help improve their robustness. Hence, in this paper we study the frequency sensitivity characteristics of deep neural networks using… ▽ More Recent work has empirically shown that deep neural networks latch on to the Fourier statistics of training data and show increased sensitivity to Fourier-basis directions in the input. Understanding and modifying this Fourier-sensitivity of computer vision models may help improve their robustness. Hence, in this paper we study the frequency sensitivity characteristics of deep neural networks using a principled approach. We first propose a basis trick, proving that unitary transformations of the input-gradient of a function can be used to compute its gradient in the basis induced by the transformation. Using this result, we propose a general measure of any differentiable model's Fourier-sensitivity using the unitary Fourier-transform of its input-gradient. When applied to deep neural networks, we find that computer vision models are consistently sensitive to particular frequencies dependent on the dataset, training method and architecture. Based on this measure, we further propose a Fourier-regularization framework to modify the Fourier-sensitivities and frequency bias of models. Using our proposed regularizer-family, we demonstrate that deep neural networks obtain improved classification accuracy on robustness evaluations. △ Less

Submitted 31 January, 2023; originally announced January 2023.

Comments: Published in TMLR, https://openreview.net/forum?id=VmTYgjYloM

Journal ref: TMLR 2022

arXiv:2212.07585 [pdf, other]

Rethinking the Role of Pre-Trained Networks in Source-Free Domain Adaptation

Authors: Wenyu Zhang, Li Shen, Chuan-Sheng Foo

Abstract: Source-free domain adaptation (SFDA) aims to adapt a source model trained on a fully-labeled source domain to an unlabeled target domain. Large-data pre-trained networks are used to initialize source models during source training, and subsequently discarded. However, source training can cause the model to overfit to source data distribution and lose applicable target domain knowledge. We propose t… ▽ More Source-free domain adaptation (SFDA) aims to adapt a source model trained on a fully-labeled source domain to an unlabeled target domain. Large-data pre-trained networks are used to initialize source models during source training, and subsequently discarded. However, source training can cause the model to overfit to source data distribution and lose applicable target domain knowledge. We propose to integrate the pre-trained network into the target adaptation process as it has diversified features important for generalization and provides an alternate view of features and classification decisions different from the source model. We propose to distil useful target domain information through a co-learning strategy to improve target pseudolabel quality for finetuning the source model. Evaluation on 4 benchmark datasets show that our proposed strategy improves adaptation performance and can be successfully integrated with existing SFDA methods. Leveraging modern pre-trained networks that have stronger representation learning ability in the co-learning strategy further boosts performance. △ Less

Submitted 25 August, 2023; v1 submitted 14 December, 2022; originally announced December 2022.

Comments: Accepted to ICCV 2023

arXiv:2212.04372 [pdf]

DECO2 An Open-source Energy System Decarbonisation Planning Software Including Negative Emissions Technologies

Authors: Purusothmn Nair S. Bhasker Nair, Raymond R. Tan, Dominic C. Y. Foo, Disni Gamaralalage, Michael Short

Abstract: The deployment of CO2 capture and storage (CCS) and negative emissions technologies (NETs) are crucial to meet the net-zero target by year 2050, as emphasised by the Glasgow Climate Pact. Over the years, several energy planning models have been developed to address the temporal aspects of carbon management. However, limited works have incorporated CCS and NETs for bottom-up energy planning at the… ▽ More The deployment of CO2 capture and storage (CCS) and negative emissions technologies (NETs) are crucial to meet the net-zero target by year 2050, as emphasised by the Glasgow Climate Pact. Over the years, several energy planning models have been developed to address the temporal aspects of carbon management. However, limited works have incorporated CCS and NETs for bottom-up energy planning at the individual plant scale, which is considered in this work. The novel formulation is implemented in an open-source energy system software that has been developed in this work for optimal decarbonisation planning. The DECarbonation Options Optimisation (DECO2) software considers multiperiod energy planning with a superstructural model and was developed in Python with an integrated user interface in Microsoft Excel. The software application is demonstrated with two scenarios that differ in terms of the availabilities of mitigation technologies. Results demonstrated the potential of fuel substitutions for low-carbon alternatives in existing coal and natural gas power plants. Additionally, once NETs are mature and are available for commercial deployment, their deployment is crucial in aiding CO2 removal in minimal investment costs scenarios. Overall, the newly developed open-source software demonstrates the importance of determining the optimal deployment of mitigation technologies in meeting climate change targets for each period. △ Less

Submitted 8 December, 2022; originally announced December 2022.

arXiv:2212.00630 [pdf, other]

Probably Approximate Shapley Fairness with Applications in Machine Learning

Authors: Zijian Zhou, Xinyi Xu, Rachael Hwee Ling Sim, Chuan Sheng Foo, Kian Hsiang Low

Abstract: The Shapley value (SV) is adopted in various scenarios in machine learning (ML), including data valuation, agent valuation, and feature attribution, as it satisfies their fairness requirements. However, as exact SVs are infeasible to compute in practice, SV estimates are approximated instead. This approximation step raises an important question: do the SV estimates preserve the fairness guarantees… ▽ More The Shapley value (SV) is adopted in various scenarios in machine learning (ML), including data valuation, agent valuation, and feature attribution, as it satisfies their fairness requirements. However, as exact SVs are infeasible to compute in practice, SV estimates are approximated instead. This approximation step raises an important question: do the SV estimates preserve the fairness guarantees of exact SVs? We observe that the fairness guarantees of exact SVs are too restrictive for SV estimates. Thus, we generalise Shapley fairness to probably approximate Shapley fairness and propose fidelity score, a metric to measure the variation of SV estimates, that determines how probable the fairness guarantees hold. Our last theoretical contribution is a novel greedy active estimation (GAE) algorithm that will maximise the lowest fidelity score and achieve a better fairness guarantee than the de facto Monte-Carlo estimation. We empirically verify GAE outperforms several existing methods in guaranteeing fairness while remaining competitive in estimation accuracy in various ML scenarios using real-world datasets. △ Less

Submitted 1 December, 2022; originally announced December 2022.

Comments: 37th AAAI Conference on Artificial Intelligence (AAAI 2023)

arXiv:2209.14624 [pdf, other]

Is Complexity Required for Neural Network Pruning? A Case Study on Global Magnitude Pruning

Authors: Manas Gupta, Efe Camci, Vishandi Rudy Keneta, Abhishek Vaidyanathan, Ritwik Kanodia, Chuan-Sheng Foo, Wu Min, Lin Jie

Abstract: Pruning neural networks has become popular in the last decade when it was shown that a large number of weights can be safely removed from modern neural networks without compromising accuracy. Numerous pruning methods have been proposed since, each claiming to be better than prior art, however, at the cost of increasingly complex pruning methodologies. These methodologies include utilizing importan… ▽ More Pruning neural networks has become popular in the last decade when it was shown that a large number of weights can be safely removed from modern neural networks without compromising accuracy. Numerous pruning methods have been proposed since, each claiming to be better than prior art, however, at the cost of increasingly complex pruning methodologies. These methodologies include utilizing importance scores, getting feedback through back-propagation or having heuristics-based pruning rules amongst others. In this work, we question whether this pattern of introducing complexity is really necessary to achieve better pruning results. We benchmark these SOTA techniques against a simple pruning baseline, namely, Global Magnitude Pruning (Global MP), that ranks weights in order of their magnitudes and prunes the smallest ones. Surprisingly, we find that vanilla Global MP performs very well against the SOTA techniques. When considering sparsity-accuracy trade-off, Global MP performs better than all SOTA techniques at all sparsity ratios. When considering FLOPs-accuracy trade-off, some SOTA techniques outperform Global MP at lower sparsity ratios, however, Global MP starts performing well at high sparsity ratios and performs very well at extremely high sparsity ratios. Moreover, we find that a common issue that many pruning algorithms run into at high sparsity rates, namely, layer-collapse, can be easily fixed in Global MP. We explore why layer collapse occurs in networks and how it can be mitigated in Global MP by utilizing a technique called Minimum Threshold. We showcase the above findings on various models (WRN-28-8, ResNet-32, ResNet-50, MobileNet-V1 and FastGRNN) and multiple datasets (CIFAR-10, ImageNet and HAR-2). Code is available at https://github.com/manasgupta-1/GlobalMP. △ Less

Submitted 7 January, 2024; v1 submitted 29 September, 2022; originally announced September 2022.

arXiv:2206.07876 [pdf, other]

Domain Generalization via Selective Consistency Regularization for Time Series Classification

Authors: Wenyu Zhang, Mohamed Ragab, Chuan-Sheng Foo

Abstract: Domain generalization methods aim to learn models robust to domain shift with data from a limited number of source domains and without access to target domain samples during training. Popular domain alignment methods for domain generalization seek to extract domain-invariant features by minimizing the discrepancy between feature distributions across all domains, disregarding inter-domain relations… ▽ More Domain generalization methods aim to learn models robust to domain shift with data from a limited number of source domains and without access to target domain samples during training. Popular domain alignment methods for domain generalization seek to extract domain-invariant features by minimizing the discrepancy between feature distributions across all domains, disregarding inter-domain relationships. In this paper, we instead propose a novel representation learning methodology that selectively enforces prediction consistency between source domains estimated to be closely-related. Specifically, we hypothesize that domains share different class-informative representations, so instead of aligning all domains which can cause negative transfer, we only regularize the discrepancy between closely-related domains. We apply our method to time-series classification tasks and conduct comprehensive experiments on three public real-world datasets. Our method significantly improves over the baseline and achieves better or competitive performance in comparison with state-of-the-art methods in terms of both accuracy and model calibration. △ Less

Submitted 15 June, 2022; originally announced June 2022.

Comments: Accepted to ICPR 2022

arXiv:2205.15234 [pdf, other]

Few-Shot Adaptation of Pre-Trained Networks for Domain Shift

Authors: Wenyu Zhang, Li Shen, Wanyue Zhang, Chuan-Sheng Foo

Abstract: Deep networks are prone to performance degradation when there is a domain shift between the source (training) data and target (test) data. Recent test-time adaptation methods update batch normalization layers of pre-trained source models deployed in new target environments with streaming data to mitigate such performance degradation. Although such methods can adapt on-the-fly without first collect… ▽ More Deep networks are prone to performance degradation when there is a domain shift between the source (training) data and target (test) data. Recent test-time adaptation methods update batch normalization layers of pre-trained source models deployed in new target environments with streaming data to mitigate such performance degradation. Although such methods can adapt on-the-fly without first collecting a large target domain dataset, their performance is dependent on streaming conditions such as mini-batch size and class-distribution, which can be unpredictable in practice. In this work, we propose a framework for few-shot domain adaptation to address the practical challenges of data-efficient adaptation. Specifically, we propose a constrained optimization of feature normalization statistics in pre-trained source models supervised by a small support set from the target domain. Our method is easy to implement and improves source model performance with as few as one sample per class for classification tasks. Extensive experiments on 5 cross-domain classification and 4 semantic segmentation datasets show that our method achieves more accurate and reliable performance than test-time adaptation, while not being constrained by streaming conditions. △ Less

Submitted 22 October, 2022; v1 submitted 30 May, 2022; originally announced May 2022.

Comments: Accepted to IJCAI 2022

arXiv:2205.08706 [pdf, other]

doi 10.1109/TIP.2022.3189823

SemiCurv: Semi-Supervised Curvilinear Structure Segmentation

Authors: Xun Xu, Manh Cuong Nguyen, Yasin Yazici, Kangkang Lu, Hlaing Min, Chuan-Sheng Foo

Abstract: Recent work on curvilinear structure segmentation has mostly focused on backbone network design and loss engineering. The challenge of collecting labelled data, an expensive and labor intensive process, has been overlooked. While labelled data is expensive to obtain, unlabelled data is often readily available. In this work, we propose SemiCurv, a semi-supervised learning (SSL) framework for curvil… ▽ More Recent work on curvilinear structure segmentation has mostly focused on backbone network design and loss engineering. The challenge of collecting labelled data, an expensive and labor intensive process, has been overlooked. While labelled data is expensive to obtain, unlabelled data is often readily available. In this work, we propose SemiCurv, a semi-supervised learning (SSL) framework for curvilinear structure segmentation that is able to utilize such unlabelled data to reduce the labelling burden. Our framework addresses two key challenges in formulating curvilinear segmentation in a semi-supervised manner. First, to fully exploit the power of consistency based SSL, we introduce a geometric transformation as strong data augmentation and then align segmentation predictions via a differentiable inverse transformation to enable the computation of pixel-wise consistency. Second, the traditional mean square error (MSE) on unlabelled data is prone to collapsed predictions and this issue exacerbates with severe class imbalance (significantly more background pixels). We propose a N-pair consistency loss to avoid trivial predictions on unlabelled data. We evaluate SemiCurv on six curvilinear segmentation datasets, and find that with no more than 5% of the labelled data, it achieves close to 95% of the performance relative to its fully supervised counterpart. △ Less

Submitted 19 May, 2022; v1 submitted 17 May, 2022; originally announced May 2022.

Comments: IEEE Transactions on Image Processing

arXiv:2205.03001 [pdf, other]

Revisiting Pretraining for Semi-Supervised Learning in the Low-Label Regime

Authors: Xun Xu, Jingyi Liao, Lile Cai, Manh Cuong Nguyen, Kangkang Lu, Wanyue Zhang, Yasin Yazici, Chuan Sheng Foo

Abstract: Semi-supervised learning (SSL) addresses the lack of labeled data by exploiting large unlabeled data through pseudolabeling. However, in the extremely low-label regime, pseudo labels could be incorrect, a.k.a. the confirmation bias, and the pseudo labels will in turn harm the network training. Recent studies combined finetuning (FT) from pretrained weights with SSL to mitigate the challenges and c… ▽ More Semi-supervised learning (SSL) addresses the lack of labeled data by exploiting large unlabeled data through pseudolabeling. However, in the extremely low-label regime, pseudo labels could be incorrect, a.k.a. the confirmation bias, and the pseudo labels will in turn harm the network training. Recent studies combined finetuning (FT) from pretrained weights with SSL to mitigate the challenges and claimed superior results in the low-label regime. In this work, we first show that the better pretrained weights brought in by FT account for the state-of-the-art performance, and importantly that they are universally helpful to off-the-shelf semi-supervised learners. We further argue that direct finetuning from pretrained weights is suboptimal due to covariate shift and propose a contrastive target pretraining step to adapt model weights towards target dataset. We carried out extensive experiments on both classification and segmentation tasks by doing target pretraining then followed by semi-supervised finetuning. The promising results validate the efficacy of target pretraining for SSL, in particular in the low-label regime. △ Less

Submitted 5 May, 2022; originally announced May 2022.

arXiv:2205.01006 [pdf, other]

Open-Set Semi-Supervised Learning for 3D Point Cloud Understanding

Authors: Xian Shi, Xun Xu, Wanyue Zhang, Xiatian Zhu, Chuan Sheng Foo, Kui Jia

Abstract: Semantic understanding of 3D point cloud relies on learning models with massively annotated data, which, in many cases, are expensive or difficult to collect. This has led to an emerging research interest in semi-supervised learning (SSL) for 3D point cloud. It is commonly assumed in SSL that the unlabeled data are drawn from the same distribution as that of the labeled ones; This assumption, howe… ▽ More Semantic understanding of 3D point cloud relies on learning models with massively annotated data, which, in many cases, are expensive or difficult to collect. This has led to an emerging research interest in semi-supervised learning (SSL) for 3D point cloud. It is commonly assumed in SSL that the unlabeled data are drawn from the same distribution as that of the labeled ones; This assumption, however, rarely holds true in realistic environments. Blindly using out-of-distribution (OOD) unlabeled data could harm SSL performance. In this work, we propose to selectively utilize unlabeled data through sample weighting, so that only conducive unlabeled data would be prioritized. To estimate the weights, we adopt a bi-level optimization framework which iteratively optimizes a metaobjective on a held-out validation set and a task-objective on a training set. Faced with the instability of efficient bi-level optimizers, we further propose three regularization techniques to enhance the training stability. Extensive experiments on 3D point cloud classification and segmentation tasks verify the effectiveness of our proposed method. We also demonstrate the feasibility of a more efficient training strategy. △ Less

Submitted 2 May, 2022; originally announced May 2022.

arXiv:2203.08321 [pdf, other]

doi 10.1145/3587937

ADATIME: A Benchmarking Suite for Domain Adaptation on Time Series Data

Authors: Mohamed Ragab, Emadeldeen Eldele, Wee Ling Tan, Chuan-Sheng Foo, Zhenghua Chen, Min Wu, Chee-Keong Kwoh, Xiaoli Li

Abstract: Unsupervised domain adaptation methods aim to generalize well on unlabeled test data that may have a different (shifted) distribution from the training data. Such methods are typically developed on image data, and their application to time series data is less explored. Existing works on time series domain adaptation suffer from inconsistencies in evaluation schemes, datasets, and backbone neural n… ▽ More Unsupervised domain adaptation methods aim to generalize well on unlabeled test data that may have a different (shifted) distribution from the training data. Such methods are typically developed on image data, and their application to time series data is less explored. Existing works on time series domain adaptation suffer from inconsistencies in evaluation schemes, datasets, and backbone neural network architectures. Moreover, labeled target data are often used for model selection, which violates the fundamental assumption of unsupervised domain adaptation. To address these issues, we develop a benchmarking evaluation suite (AdaTime) to systematically and fairly evaluate different domain adaptation methods on time series data. Specifically, we standardize the backbone neural network architectures and benchmarking datasets, while also exploring more realistic model selection approaches that can work with no labeled data or just a few labeled samples. Our evaluation includes adapting state-of-the-art visual domain adaptation methods to time series data as well as the recent methods specifically developed for time series data. We conduct extensive experiments to evaluate 11 state-of-the-art methods on five representative datasets spanning 50 cross-domain scenarios. Our results suggest that with careful selection of hyper-parameters, visual domain adaptation methods are competitive with methods proposed for time series domain adaptation. In addition, we find that hyper-parameters could be selected based on realistic model selection approaches. Our work unveils practical insights for applying domain adaptation methods on time series data and builds a solid foundation for future works in the field. The code is available at \href{https://github.com/emadeldeen24/AdaTime}{github.com/emadeldeen24/AdaTime}. △ Less

Submitted 5 May, 2023; v1 submitted 15 March, 2022; originally announced March 2022.

Comments: Accepted in the ACM Transactions on Knowledge Discovery from Data (TKDD)

arXiv:2202.09072 [pdf, other]

doi 10.1103/PhysRevResearch.5.L032011

A stabilization mechanism for many-body localization in two dimensions

Authors: D. C. W. Foo, N. Swain, P. Sengupta, G. Lemarié, S. Adam

Abstract: Experiments in cold atom systems see almost identical signatures of many body localization (MBL) in both one-dimensional ($d=1$) and two-dimensional ($d=2$) systems despite the thermal avalanche hypothesis showing that the MBL phase is unstable for $d>1$. Underpinning the thermal avalanche argument is the assumption of exponential localization of local integrals of motion (LIOMs). In this work we… ▽ More Experiments in cold atom systems see almost identical signatures of many body localization (MBL) in both one-dimensional ($d=1$) and two-dimensional ($d=2$) systems despite the thermal avalanche hypothesis showing that the MBL phase is unstable for $d>1$. Underpinning the thermal avalanche argument is the assumption of exponential localization of local integrals of motion (LIOMs). In this work we demonstrate that addition of a confining potential -- as is typical in experimental setups -- allows a non-interacting disordered system to have super-exponentially (Gaussian) localized wavefunctions, and an interacting disordered system to undergo a localization transition. Moreover, we show that Gaussian localization of MBL LIOMs shifts the quantum avalanche critical dimension from $d=1$ to $d=2$, potentially bridging the divide between the experimental demonstrations of MBL in these systems and existing theoretical arguments that claim that such demonstrations are impossible. △ Less

Submitted 18 February, 2022; originally announced February 2022.

Journal ref: Phys. Rev. Research 5, L032011 (2023)

arXiv:2112.09327 [pdf, other]

Incentivizing Collaboration in Machine Learning via Synthetic Data Rewards

Authors: Sebastian Shenghong Tay, Xinyi Xu, Chuan Sheng Foo, Bryan Kian Hsiang Low

Abstract: This paper presents a novel collaborative generative modeling (CGM) framework that incentivizes collaboration among self-interested parties to contribute data to a pool for training a generative model (e.g., GAN), from which synthetic data are drawn and distributed to the parties as rewards commensurate to their contributions. Distributing synthetic data as rewards (instead of trained models or mo… ▽ More This paper presents a novel collaborative generative modeling (CGM) framework that incentivizes collaboration among self-interested parties to contribute data to a pool for training a generative model (e.g., GAN), from which synthetic data are drawn and distributed to the parties as rewards commensurate to their contributions. Distributing synthetic data as rewards (instead of trained models or money) offers task- and model-agnostic benefits for downstream learning tasks and is less likely to violate data privacy regulation. To realize the framework, we firstly propose a data valuation function using maximum mean discrepancy (MMD) that values data based on its quantity and quality in terms of its closeness to the true data distribution and provide theoretical results guiding the kernel choice in our MMD-based data valuation function. Then, we formulate the reward scheme as a linear optimization problem that when solved, guarantees certain incentives such as fairness in the CGM framework. We devise a weighted sampling algorithm for generating synthetic data to be distributed to each party as reward such that the value of its data and the synthetic data combined matches its assigned reward value by the reward scheme. We empirically show using simulated and real-world datasets that the parties' synthetic data rewards are commensurate to their contributions. △ Less

Submitted 17 December, 2021; originally announced December 2021.

Comments: 36th AAAI Conference on Artificial Intelligence (AAAI 2022), Extended version with derivations, 42 pages

arXiv:2112.06029 [pdf, other]

On Automatic Data Augmentation for 3D Point Cloud Classification

Authors: Wanyue Zhang, Xun Xu, Fayao Liu, Le Zhang, Chuan-Sheng Foo

Abstract: Data augmentation is an important technique to reduce overfitting and improve learning performance, but existing works on data augmentation for 3D point cloud data are based on heuristics. In this work, we instead propose to automatically learn a data augmentation strategy using bilevel optimization. An augmentor is designed in a similar fashion to a conditional generator and is optimized by minim… ▽ More Data augmentation is an important technique to reduce overfitting and improve learning performance, but existing works on data augmentation for 3D point cloud data are based on heuristics. In this work, we instead propose to automatically learn a data augmentation strategy using bilevel optimization. An augmentor is designed in a similar fashion to a conditional generator and is optimized by minimizing a base model's loss on a validation set when the augmented input is used for training the model. This formulation provides a more principled way to learn data augmentation on 3D point clouds. We evaluate our approach on standard point cloud classification tasks and a more challenging setting with pose misalignment between training and validation/test sets. The proposed strategy achieves competitive performance on both tasks and we provide further insight into the augmentor's ability to learn the validation set distribution. △ Less

Submitted 18 December, 2021; v1 submitted 11 December, 2021; originally announced December 2021.

Comments: BMVC 2021

arXiv:2111.04964 [pdf, other]

doi 10.1109/TNNLS.2022.3223018

On Representation Knowledge Distillation for Graph Neural Networks

Authors: Chaitanya K. Joshi, Fayao Liu, Xu Xun, Jie Lin, Chuan-Sheng Foo

Abstract: Knowledge distillation is a learning paradigm for boosting resource-efficient graph neural networks (GNNs) using more expressive yet cumbersome teacher models. Past work on distillation for GNNs proposed the Local Structure Preserving loss (LSP), which matches local structural relationships defined over edges across the student and teacher's node embeddings. This paper studies whether preserving t… ▽ More Knowledge distillation is a learning paradigm for boosting resource-efficient graph neural networks (GNNs) using more expressive yet cumbersome teacher models. Past work on distillation for GNNs proposed the Local Structure Preserving loss (LSP), which matches local structural relationships defined over edges across the student and teacher's node embeddings. This paper studies whether preserving the global topology of how the teacher embeds graph data can be a more effective distillation objective for GNNs, as real-world graphs often contain latent interactions and noisy edges. We propose Graph Contrastive Representation Distillation (G-CRD), which uses contrastive learning to implicitly preserve global topology by aligning the student node embeddings to those of the teacher in a shared representation space. Additionally, we introduce an expanded set of benchmarks on large-scale real-world datasets where the performance gap between teacher and student GNNs is non-negligible. Experiments across 4 datasets and 14 heterogeneous GNN architectures show that G-CRD consistently boosts the performance and robustness of lightweight GNNs, outperforming LSP (and a global structure preserving variant of LSP) as well as baselines from 2D computer vision. An analysis of the representational similarity among teacher and student embedding spaces reveals that G-CRD balances preserving local and global relationships, while structure preserving approaches are best at preserving one or the other. Our code is available at https://github.com/chaitjo/efficient-gnns △ Less

Submitted 4 February, 2023; v1 submitted 9 November, 2021; originally announced November 2021.

Comments: IEEE Transactions on Neural Networks and Learning Representation (TNNLS), Special Issue on Deep Neural Networks for Graphs: Theory, Models, Algorithms and Applications

arXiv:2109.11428 [pdf, other]

doi 10.1109/TNNLS.2021.3105827

An Evaluation of Anomaly Detection and Diagnosis in Multivariate Time Series

Authors: Astha Garg, Wenyu Zhang, Jules Samaran, Savitha Ramasamy, Chuan-Sheng Foo

Abstract: Several techniques for multivariate time series anomaly detection have been proposed recently, but a systematic comparison on a common set of datasets and metrics is lacking. This paper presents a systematic and comprehensive evaluation of unsupervised and semi-supervised deep-learning based methods for anomaly detection and diagnosis on multivariate time series data from cyberphysical systems. Un… ▽ More Several techniques for multivariate time series anomaly detection have been proposed recently, but a systematic comparison on a common set of datasets and metrics is lacking. This paper presents a systematic and comprehensive evaluation of unsupervised and semi-supervised deep-learning based methods for anomaly detection and diagnosis on multivariate time series data from cyberphysical systems. Unlike previous works, we vary the model and post-processing of model errors, i.e. the scoring functions independently of each other, through a grid of 10 models and 4 scoring functions, comparing these variants to state of the art methods. In time-series anomaly detection, detecting anomalous events is more important than detecting individual anomalous time-points. Through experiments, we find that the existing evaluation metrics either do not take events into account, or cannot distinguish between a good detector and trivial detectors, such as a random or an all-positive detector. We propose a new metric to overcome these drawbacks, namely, the composite F-score ($Fc_1$), for evaluating time-series anomaly detection. Our study highlights that dynamic scoring functions work much better than static ones for multivariate time series anomaly detection, and the choice of scoring functions often matters more than the choice of the underlying model. We also find that a simple, channel-wise model - the Univariate Fully-Connected Auto-Encoder, with the dynamic Gaussian scoring function emerges as a winning candidate for both anomaly detection and diagnosis, beating state of the art algorithms. △ Less

Submitted 23 September, 2021; originally announced September 2021.

Comments: IEEE Transactions on Neural Networks and Learning Systems

arXiv:2108.04423 [pdf, other]

doi 10.1016/j.media.2021.102148

Semi-supervised classification of radiology images with NoTeacher: A Teacher that is not Mean

Authors: Balagopal Unnikrishnan, Cuong Nguyen, Shafa Balaram, Chao Li, Chuan Sheng Foo, Pavitra Krishnaswamy

Abstract: Deep learning models achieve strong performance for radiology image classification, but their practical application is bottlenecked by the need for large labeled training datasets. Semi-supervised learning (SSL) approaches leverage small labeled datasets alongside larger unlabeled datasets and offer potential for reducing labeling cost. In this work, we introduce NoTeacher, a novel consistency-bas… ▽ More Deep learning models achieve strong performance for radiology image classification, but their practical application is bottlenecked by the need for large labeled training datasets. Semi-supervised learning (SSL) approaches leverage small labeled datasets alongside larger unlabeled datasets and offer potential for reducing labeling cost. In this work, we introduce NoTeacher, a novel consistency-based SSL framework which incorporates probabilistic graphical models. Unlike Mean Teacher which maintains a teacher network updated via a temporal ensemble, NoTeacher employs two independent networks, thereby eliminating the need for a teacher network. We demonstrate how NoTeacher can be customized to handle a range of challenges in radiology image classification. Specifically, we describe adaptations for scenarios with 2D and 3D inputs, uni and multi-label classification, and class distribution mismatch between labeled and unlabeled portions of the training data. In realistic empirical evaluations on three public benchmark datasets spanning the workhorse modalities of radiology (X-Ray, CT, MRI), we show that NoTeacher achieves over 90-95% of the fully supervised AUROC with less than 5-15% labeling budget. Further, NoTeacher outperforms established SSL methods with minimal hyperparameter tuning, and has implications as a principled and practical option for semisupervised learning in radiology applications. △ Less

Submitted 9 August, 2021; originally announced August 2021.

Comments: Preprint submitted to Medical Image Analysis. Accepted in June 2021

MSC Class: 41A05; 41A10; 65D05; 65D17

arXiv:2108.02104 [pdf, other]

Point Discriminative Learning for Data-efficient 3D Point Cloud Analysis

Authors: Fayao Liu, Guosheng Lin, Chuan-Sheng Foo, Chaitanya K. Joshi, Jie Lin

Abstract: 3D point cloud analysis has drawn a lot of research attention due to its wide applications. However, collecting massive labelled 3D point cloud data is both time-consuming and labor-intensive. This calls for data-efficient learning methods. In this work we propose PointDisc, a point discriminative learning method to leverage self-supervisions for data-efficient 3D point cloud classification and se… ▽ More 3D point cloud analysis has drawn a lot of research attention due to its wide applications. However, collecting massive labelled 3D point cloud data is both time-consuming and labor-intensive. This calls for data-efficient learning methods. In this work we propose PointDisc, a point discriminative learning method to leverage self-supervisions for data-efficient 3D point cloud classification and segmentation. PointDisc imposes a novel point discrimination loss on the middle and global level features produced by the backbone network. This point discrimination loss enforces learned features to be consistent with points belonging to the corresponding local shape region and inconsistent with randomly sampled noisy points. We conduct extensive experiments on 3D object classification, 3D semantic and part segmentation, showing the benefits of PointDisc for data-efficient learning. Detailed analysis demonstrate that PointDisc learns unsupervised features that well capture local and global geometry. △ Less

Submitted 20 January, 2023; v1 submitted 4 August, 2021; originally announced August 2021.

Comments: This work is published in 3DV 2022

arXiv:2106.08587 [pdf, other]

Evidence of many-body localization in 2D from quantum Monte Carlo simulation

Authors: Ho-Kin Tang, N. Swain, D. C. W. Foo, B. J. J. Khor, G. Lemarié, F. F. Assaad, S. Adam, P. Sengupta

Abstract: We use the stochastic series expansion quantum Monte Carlo method, together with the eigenstate-to-Hamiltonian construction, to map the localized Bose glass ground state of the disordered two-dimensional Heisenberg model to excited states of new target Hamiltonians. The localized nature of the ground state is established by studying the participation entropy, local entanglement entropy, and local… ▽ More We use the stochastic series expansion quantum Monte Carlo method, together with the eigenstate-to-Hamiltonian construction, to map the localized Bose glass ground state of the disordered two-dimensional Heisenberg model to excited states of new target Hamiltonians. The localized nature of the ground state is established by studying the participation entropy, local entanglement entropy, and local magnetization, all known in the literature to also be identifying characteristics of many-body localized states. Our construction maps the ground state of the parent Hamiltonian to a single excited state of a new target Hamiltonian, which retains the same form as the parent Hamiltonian, albeit with correlated and large disorder. We furthermore provide evidence that the mapped eigenstates are genuine localized states and not special zero-measure localized states like the quantum scar-states. Our results provide concrete evidence for the existence of the many-body localized phase in two dimensions. △ Less

Submitted 3 May, 2022; v1 submitted 16 June, 2021; originally announced June 2021.

Comments: (4+ε) pages with 4 figures, and the supplementary material

arXiv:2101.06931 [pdf, other]

Label-Efficient Point Cloud Semantic Segmentation: An Active Learning Approach

Authors: Xian Shi, Xun Xu, Ke Chen, Lile Cai, Chuan Sheng Foo, Kui Jia

Abstract: Deep learning models are the state-of-the-art methods for semantic point cloud segmentation, the success of which relies on the availability of large-scale annotated datasets. However, it can be extremely time-consuming and prohibitively expensive to compile such datasets. In this work, we propose an active learning approach to maximize model performance given limited annotation budgets. We invest… ▽ More Deep learning models are the state-of-the-art methods for semantic point cloud segmentation, the success of which relies on the availability of large-scale annotated datasets. However, it can be extremely time-consuming and prohibitively expensive to compile such datasets. In this work, we propose an active learning approach to maximize model performance given limited annotation budgets. We investigate the appropriate sample granularity for active selection under realistic annotation cost measurement (clicks), and demonstrate that super-point based selection allows for more efficient usage of the limited budget compared to point-level and instance-level selection. We further exploit local consistency constraints to boost the performance of the super-point based approach. We evaluate our methods on two benchmarking datasets (ShapeNet and S3DIS) and the results demonstrate that active learning is an effective strategy to address the high annotation costs in semantic point cloud segmentation. △ Less

Submitted 12 April, 2021; v1 submitted 18 January, 2021; originally announced January 2021.

arXiv:2101.04859 [pdf]

A*HAR: A New Benchmark towards Semi-supervised learning for Class-imbalanced Human Activity Recognition

Authors: Govind Narasimman, Kangkang Lu, Arun Raja, Chuan Sheng Foo, Mohamed Sabry Aly, Jie Lin, Vijay Chandrasekhar

Abstract: Despite the vast literature on Human Activity Recognition (HAR) with wearable inertial sensor data, it is perhaps surprising that there are few studies investigating semisupervised learning for HAR, particularly in a challenging scenario with class imbalance problem. In this work, we present a new benchmark, called A*HAR, towards semisupervised learning for class-imbalanced HAR. We evaluate state-… ▽ More Despite the vast literature on Human Activity Recognition (HAR) with wearable inertial sensor data, it is perhaps surprising that there are few studies investigating semisupervised learning for HAR, particularly in a challenging scenario with class imbalance problem. In this work, we present a new benchmark, called A*HAR, towards semisupervised learning for class-imbalanced HAR. We evaluate state-of-the-art semi-supervised learning method on A*HAR, by combining Mean Teacher and Convolutional Neural Network. Interestingly, we find that Mean Teacher boosts the overall performance when training the classifier with fewer labelled samples and a large amount of unlabeled samples, but the classifier falls short in handling unbalanced activities. These findings lead to an interesting open problem, i.e., development of semi-supervised HAR algorithms that are class-imbalance aware without any prior knowledge on the class distribution for unlabeled samples. The dataset and benchmark evaluation are released at https://github.com/I2RDL2/ASTAR-HAR for future research. △ Less

Submitted 12 January, 2021; originally announced January 2021.

Comments: 5 pages, 3 figures

arXiv:2006.14265 [pdf, other]

Empirical Analysis of Overfitting and Mode Drop in GAN Training

Authors: Yasin Yazici, Chuan-Sheng Foo, Stefan Winkler, Kim-Hui Yap, Vijay Chandrasekhar

Abstract: We examine two key questions in GAN training, namely overfitting and mode drop, from an empirical perspective. We show that when stochasticity is removed from the training procedure, GANs can overfit and exhibit almost no mode drop. Our results shed light on important characteristics of the GAN training procedure. They also provide evidence against prevailing intuitions that GANs do not memorize t… ▽ More We examine two key questions in GAN training, namely overfitting and mode drop, from an empirical perspective. We show that when stochasticity is removed from the training procedure, GANs can overfit and exhibit almost no mode drop. Our results shed light on important characteristics of the GAN training procedure. They also provide evidence against prevailing intuitions that GANs do not memorize the training set, and that mode dropping is mainly due to properties of the GAN objective rather than how it is optimized during training. △ Less

Submitted 25 June, 2020; originally announced June 2020.

Comments: To appear in ICIP2020

arXiv:2004.07543 [pdf, other]

doi 10.1016/j.neucom.2021.10.090

Classify and Generate: Using Classification Latent Space Representations for Image Generations

Authors: Saisubramaniam Gopalakrishnan, Pranshu Ranjan Singh, Yasin Yazici, Chuan-Sheng Foo, Vijay Chandrasekhar, ArulMurugan Ambikapathi

Abstract: Utilization of classification latent space information for downstream reconstruction and generation is an intriguing and a relatively unexplored area. In general, discriminative representations are rich in class-specific features but are too sparse for reconstruction, whereas, in autoencoders the representations are dense but have limited indistinguishable class-specific features, making them less… ▽ More Utilization of classification latent space information for downstream reconstruction and generation is an intriguing and a relatively unexplored area. In general, discriminative representations are rich in class-specific features but are too sparse for reconstruction, whereas, in autoencoders the representations are dense but have limited indistinguishable class-specific features, making them less suitable for classification. In this work, we propose a discriminative modeling framework that employs manipulated supervised latent representations to reconstruct and generate new samples belonging to a given class. Unlike generative modeling approaches such as GANs and VAEs that aim to model the data manifold distribution, Representation based Generations (ReGene) directly represent the given data manifold in the classification space. Such supervised representations, under certain constraints, allow for reconstructions and controlled generations using an appropriate decoder without enforcing any prior distribution. Theoretically, given a class, we show that these representations when smartly manipulated using convex combinations retain the same class label. Furthermore, they also lead to the novel generation of visually realistic images. Extensive experiments on datasets of varying resolutions demonstrate that ReGene has higher classification accuracy than existing conditional generative models while being competitive in terms of FID. △ Less

Submitted 14 December, 2021; v1 submitted 16 April, 2020; originally announced April 2020.

Journal ref: Saisubramaniam Gopalakrishnan, Pranshu Ranjan Singh et. al., Classify and generate: Using classification latent space representations for image generations, Neurocomputing, Volume 471, 2022, Pages 296-334, ISSN 0925-2312

arXiv:2002.06015 [pdf, other]

Scalable and Practical Natural Gradient for Large-Scale Deep Learning

Authors: Kazuki Osawa, Yohei Tsuji, Yuichiro Ueno, Akira Naruse, Chuan-Sheng Foo, Rio Yokota

Abstract: Large-scale distributed training of deep neural networks results in models with worse generalization performance as a result of the increase in the effective mini-batch size. Previous approaches attempt to address this problem by varying the learning rate and batch size over epochs and layers, or ad hoc modifications of batch normalization. We propose Scalable and Practical Natural Gradient Descen… ▽ More Large-scale distributed training of deep neural networks results in models with worse generalization performance as a result of the increase in the effective mini-batch size. Previous approaches attempt to address this problem by varying the learning rate and batch size over epochs and layers, or ad hoc modifications of batch normalization. We propose Scalable and Practical Natural Gradient Descent (SP-NGD), a principled approach for training models that allows them to attain similar generalization performance to models trained with first-order optimization methods, but with accelerated convergence. Furthermore, SP-NGD scales to large mini-batch sizes with a negligible computational overhead as compared to first-order methods. We evaluated SP-NGD on a benchmark task where highly optimized first-order methods are available as references: training a ResNet-50 model for image classification on ImageNet. We demonstrate convergence to a top-1 validation accuracy of 75.4% in 5.5 minutes using a mini-batch size of 32,768 with 1,024 GPUs, as well as an accuracy of 74.9% with an extremely large mini-batch size of 131,072 in 873 steps of SP-NGD. △ Less

Submitted 13 February, 2020; originally announced February 2020.

Comments: arXiv admin note: text overlap with arXiv:1811.12019

arXiv:1912.10364 [pdf, other]

Learning to Impute: A General Framework for Semi-supervised Learning

Authors: Wei-Hong Li, Chuan-Sheng Foo, Hakan Bilen

Abstract: Recent semi-supervised learning methods have shown to achieve comparable results to their supervised counterparts while using only a small portion of labels in image classification tasks thanks to their regularization strategies. In this paper, we take a more direct approach for semi-supervised learning and propose learning to impute the labels of unlabeled samples such that a network achieves bet… ▽ More Recent semi-supervised learning methods have shown to achieve comparable results to their supervised counterparts while using only a small portion of labels in image classification tasks thanks to their regularization strategies. In this paper, we take a more direct approach for semi-supervised learning and propose learning to impute the labels of unlabeled samples such that a network achieves better generalization when it is trained on these labels. We pose the problem in a learning-to-learn formulation which can easily be incorporated to the state-of-the-art semi-supervised techniques and boost their performance especially when the labels are limited. We demonstrate that our method is applicable to both classification and regression problems including image classification and facial landmark detection tasks. △ Less

Submitted 24 September, 2020; v1 submitted 21 December, 2019; originally announced December 2019.

Comments: Semi-supervised Learning, Meta-Learning, Learning to impute

arXiv:1910.13582 [pdf, other]

doi 10.1103/PhysRevA.100.063602

Diffusion Monte Carlo study of a spin-imbalanced two-dimensional Fermi gas with attractive interactions

Authors: D. C. W. Foo, G. J. Conduit

Abstract: We probe the superconducting gap in the zero temperature ground state of an attractively interacting spin-imbalanced two-dimensional Fermi gas with Diffusion Monte Carlo. A condensate fraction at nonzero pair momentum evidences a spatially non-uniform superconducting order parameter. Comparison with exact diagonalisation studies confirms that the nonzero condensate fraction across a range of nonze… ▽ More We probe the superconducting gap in the zero temperature ground state of an attractively interacting spin-imbalanced two-dimensional Fermi gas with Diffusion Monte Carlo. A condensate fraction at nonzero pair momentum evidences a spatially non-uniform superconducting order parameter. Comparison with exact diagonalisation studies confirms that the nonzero condensate fraction across a range of nonzero fermion pair momenta is consistent with non-exclusive pairing between majority and minority fermions, an extension beyond FFLO theory. △ Less

Submitted 29 October, 2019; originally announced October 2019.

Journal ref: Phys. Rev. A 100, 063602 (2019)

arXiv:1902.03444 [pdf, other]

Venn GAN: Discovering Commonalities and Particularities of Multiple Distributions

Authors: Yasin Yazıcı, Bruno Lecouat, Chuan-Sheng Foo, Stefan Winkler, Kim-Hui Yap, Georgios Piliouras, Vijay Chandrasekhar

Abstract: We propose a GAN design which models multiple distributions effectively and discovers their commonalities and particularities. Each data distribution is modeled with a mixture of $K$ generator distributions. As the generators are partially shared between the modeling of different true data distributions, shared ones captures the commonality of the distributions, while non-shared ones capture uniqu… ▽ More We propose a GAN design which models multiple distributions effectively and discovers their commonalities and particularities. Each data distribution is modeled with a mixture of $K$ generator distributions. As the generators are partially shared between the modeling of different true data distributions, shared ones captures the commonality of the distributions, while non-shared ones capture unique aspects of them. We show the effectiveness of our method on various datasets (MNIST, Fashion MNIST, CIFAR-10, Omniglot, CelebA) with compelling results. △ Less

Submitted 9 February, 2019; originally announced February 2019.

arXiv:1812.07832 [pdf, other]

Semi-Supervised Deep Learning for Abnormality Classification in Retinal Images

Authors: Bruno Lecouat, Ken Chang, Chuan-Sheng Foo, Balagopal Unnikrishnan, James M. Brown, Houssam Zenati, Andrew Beers, Vijay Chandrasekhar, Jayashree Kalpathy-Cramer, Pavitra Krishnaswamy

Abstract: Supervised deep learning algorithms have enabled significant performance gains in medical image classification tasks. But these methods rely on large labeled datasets that require resource-intensive expert annotation. Semi-supervised generative adversarial network (GAN) approaches offer a means to learn from limited labeled data alongside larger unlabeled datasets, but have not been applied to dis… ▽ More Supervised deep learning algorithms have enabled significant performance gains in medical image classification tasks. But these methods rely on large labeled datasets that require resource-intensive expert annotation. Semi-supervised generative adversarial network (GAN) approaches offer a means to learn from limited labeled data alongside larger unlabeled datasets, but have not been applied to discern fine-scale, sparse or localized features that define medical abnormalities. To overcome these limitations, we propose a patch-based semi-supervised learning approach and evaluate performance on classification of diabetic retinopathy from funduscopic images. Our semi-supervised approach achieves high AUC with just 10-20 labeled training images, and outperforms the supervised baselines by upto 15% when less than 30% of the training dataset is labeled. Further, our method implicitly enables interpretation of the SSL predictions. As this approach enables good accuracy, resolution and interpretability with lower annotation burden, it sets the pathway for scalable applications of deep learning in clinical imaging. △ Less

Submitted 19 December, 2018; originally announced December 2018.

Comments: Machine Learning for Health (ML4H) Workshop at NeurIPS 2018 arXiv:1811.07216

Report number: ML4H/2018/227

arXiv:1812.02288 [pdf, other]

Adversarially Learned Anomaly Detection

Authors: Houssam Zenati, Manon Romain, Chuan Sheng Foo, Bruno Lecouat, Vijay Ramaseshan Chandrasekhar

Abstract: Anomaly detection is a significant and hence well-studied problem. However, developing effective anomaly detection methods for complex and high-dimensional data remains a challenge. As Generative Adversarial Networks (GANs) are able to model the complex high-dimensional distributions of real-world data, they offer a promising approach to address this challenge. In this work, we propose an anomaly… ▽ More Anomaly detection is a significant and hence well-studied problem. However, developing effective anomaly detection methods for complex and high-dimensional data remains a challenge. As Generative Adversarial Networks (GANs) are able to model the complex high-dimensional distributions of real-world data, they offer a promising approach to address this challenge. In this work, we propose an anomaly detection method, Adversarially Learned Anomaly Detection (ALAD) based on bi-directional GANs, that derives adversarially learned features for the anomaly detection task. ALAD then uses reconstruction errors based on these adversarially learned features to determine if a data sample is anomalous. ALAD builds on recent advances to ensure data-space and latent-space cycle-consistencies and stabilize GAN training, which results in significantly improved anomaly detection performance. ALAD achieves state-of-the-art performance on a range of image and tabular datasets while being several hundred-fold faster at test time than the only published GAN-based method. △ Less

Submitted 5 December, 2018; originally announced December 2018.

Comments: In the Proceedings of the 20th IEEE International Conference on Data Mining (ICDM), 2018

arXiv:1811.12065 [pdf, other]

TEA-DNN: the Quest for Time-Energy-Accuracy Co-optimized Deep Neural Networks

Authors: Lile Cai, Anne-Maelle Barneche, Arthur Herbout, Chuan Sheng Foo, Jie Lin, Vijay Ramaseshan Chandrasekhar, Mohamed M. Sabry

Abstract: Embedded deep learning platforms have witnessed two simultaneous improvements. First, the accuracy of convolutional neural networks (CNNs) has been significantly improved through the use of automated neural-architecture search (NAS) algorithms to determine CNN structure. Second, there has been increasing interest in developing hardware accelerators for CNNs that provide improved inference performa… ▽ More Embedded deep learning platforms have witnessed two simultaneous improvements. First, the accuracy of convolutional neural networks (CNNs) has been significantly improved through the use of automated neural-architecture search (NAS) algorithms to determine CNN structure. Second, there has been increasing interest in developing hardware accelerators for CNNs that provide improved inference performance and energy consumption compared to GPUs. Such embedded deep learning platforms differ in the amount of compute resources and memory-access bandwidth, which would affect performance and energy consumption of CNNs. It is therefore critical to consider the available hardware resources in the network architecture search. To this end, we introduce TEA-DNN, a NAS algorithm targeting multi-objective optimization of execution time, energy consumption, and classification accuracy of CNN workloads on embedded architectures. TEA-DNN leverages energy and execution time measurements on embedded hardware when exploring the Pareto-optimal curves across accuracy, execution time, and energy consumption and does not require additional effort to model the underlying hardware. We apply TEA-DNN for image classification on actual embedded platforms (NVIDIA Jetson TX2 and Intel Movidius Neural Compute Stick). We highlight the Pareto-optimal operating points that emphasize the necessity to explicitly consider hardware characteristics in the search process. To the best of our knowledge, this is the most comprehensive study of Pareto-optimal models across a range of hardware platforms using actual measurements on hardware to obtain objective values. △ Less

Submitted 21 October, 2019; v1 submitted 29 November, 2018; originally announced November 2018.

Comments: Accepted by ISLPED2019

arXiv:1811.06219 [pdf, other]

Predicting thermoelectric properties from crystal graphs and material descriptors - first application for functional materials

Authors: Leo Laugier, Daniil Bash, Jose Recatala, Hong Kuan Ng, Savitha Ramasamy, Chuan-Sheng Foo, Vijay R Chandrasekhar, Kedar Hippalgaonkar

Abstract: We introduce the use of Crystal Graph Convolutional Neural Networks (CGCNN), Fully Connected Neural Networks (FCNN) and XGBoost to predict thermoelectric properties. The dataset for the CGCNN is independent of Density Functional Theory (DFT) and only relies on the crystal and atomic information, while that for the FCNN is based on a rich attribute list mined from Materialsproject.org. The results… ▽ More We introduce the use of Crystal Graph Convolutional Neural Networks (CGCNN), Fully Connected Neural Networks (FCNN) and XGBoost to predict thermoelectric properties. The dataset for the CGCNN is independent of Density Functional Theory (DFT) and only relies on the crystal and atomic information, while that for the FCNN is based on a rich attribute list mined from Materialsproject.org. The results show that the optimized FCNN is three layer deep and is able to predict the scattering-time independent thermoelectric powerfactor much better than the CGCNN (or XGBoost), suggesting that bonding and density of states descriptors informed from materials science knowledge obtained partially from DFT are vital to predict functional properties. △ Less

Submitted 15 November, 2018; originally announced November 2018.

arXiv:1811.04595 [pdf, other]

Holistic Multi-modal Memory Network for Movie Question Answering

Authors: Anran Wang, Anh Tuan Luu, Chuan-Sheng Foo, Hongyuan Zhu, Yi Tay, Vijay Chandrasekhar

Abstract: Answering questions according to multi-modal context is a challenging problem as it requires a deep integration of different data sources. Existing approaches only employ partial interactions among data sources in one attention hop. In this paper, we present the Holistic Multi-modal Memory Network (HMMN) framework which fully considers the interactions between different input sources (multi-modal… ▽ More Answering questions according to multi-modal context is a challenging problem as it requires a deep integration of different data sources. Existing approaches only employ partial interactions among data sources in one attention hop. In this paper, we present the Holistic Multi-modal Memory Network (HMMN) framework which fully considers the interactions between different input sources (multi-modal context, question) in each hop. In addition, it takes answer choices into consideration during the context retrieval stage. Therefore, the proposed framework effectively integrates multi-modal context, question, and answer information, which leads to more informative context retrieved for question answering. Our HMMN framework achieves state-of-the-art accuracy on MovieQA dataset. Extensive ablation studies show the importance of holistic reasoning and contributions of different attention strategies. △ Less

Submitted 12 November, 2018; originally announced November 2018.

Showing 1–50 of 56 results for author: Foo, C