subscribe to arXiv mailings

Securing Multi-turn Conversational Language Models Against Distributed Backdoor Triggers

Authors: Terry Tong, Jiashu Xu, Qin Liu, Muhao Chen

Abstract: The security of multi-turn conversational large language models (LLMs) is understudied despite it being one of the most popular LLM utilization. Specifically, LLMs are vulnerable to data poisoning backdoor attacks, where an adversary manipulates the training data to cause the model to output malicious responses to predefined triggers. Specific to the multi-turn dialogue setting, LLMs are at the ri… ▽ More The security of multi-turn conversational large language models (LLMs) is understudied despite it being one of the most popular LLM utilization. Specifically, LLMs are vulnerable to data poisoning backdoor attacks, where an adversary manipulates the training data to cause the model to output malicious responses to predefined triggers. Specific to the multi-turn dialogue setting, LLMs are at the risk of even more harmful and stealthy backdoor attacks where the backdoor triggers may span across multiple utterances, giving lee-way to context-driven attacks. In this paper, we explore a novel distributed backdoor trigger attack that serves to be an extra tool in an adversary's toolbox that can interface with other single-turn attack strategies in a plug and play manner. Results on two representative defense mechanisms indicate that distributed backdoor triggers are robust against existing defense strategies which are designed for single-turn user-model interactions, motivating us to propose a new defense strategy for the multi-turn dialogue setting that is more challenging. To this end, we also explore a novel contrastive decoding based defense that is able to mitigate the backdoor with a low computational tradeoff. △ Less

Submitted 4 July, 2024; originally announced July 2024.

Comments: Submitted to EMNLP 2024

arXiv:2407.03598 [pdf, other]

ASteISR: Adapting Single Image Super-resolution Pre-trained Model for Efficient Stereo Image Super-resolution

Authors: Yuanbo Zhou, Yuyang Xue, Wei Deng, Xinlin Zhang, Qinquan Gao, Tong Tong

Abstract: Despite advances in the paradigm of pre-training then fine-tuning in low-level vision tasks, significant challenges persist particularly regarding the increased size of pre-trained models such as memory usage and training time. Another concern often encountered is the unsatisfying results yielded when directly applying pre-trained single-image models to multi-image domain. In this paper, we propos… ▽ More Despite advances in the paradigm of pre-training then fine-tuning in low-level vision tasks, significant challenges persist particularly regarding the increased size of pre-trained models such as memory usage and training time. Another concern often encountered is the unsatisfying results yielded when directly applying pre-trained single-image models to multi-image domain. In this paper, we propose a efficient method for transferring a pre-trained single-image super-resolution (SISR) transformer network to the domain of stereo image super-resolution (SteISR) through a parameter-efficient fine-tuning (PEFT) method. Specifically, we introduce the concept of stereo adapters and spatial adapters which are incorporated into the pre-trained SISR transformer network. Subsequently, the pre-trained SISR model is frozen, enabling us to fine-tune the adapters using stereo datasets along. By adopting this training method, we enhance the ability of the SISR model to accurately infer stereo images by 0.79dB on the Flickr1024 dataset. This method allows us to train only 4.8% of the original model parameters, achieving state-of-the-art performance on four commonly used SteISR benchmarks. Compared to the more complicated full fine-tuning approach, our method reduces training time and memory consumption by 57% and 15%, respectively. △ Less

Submitted 3 July, 2024; originally announced July 2024.

arXiv:2406.00914 [pdf, other]

Wasserstein gradient flow for optimal probability measure decomposition

Authors: Jiangze Han, Christopher Thomas Ryan, Xin T. Tong

Abstract: We examine the infinite-dimensional optimization problem of finding a decomposition of a probability measure into K probability sub-measures to minimize specific loss functions inspired by applications in clustering and user grouping. We analytically explore the structures of the support of optimal sub-measures and introduce algorithms based on Wasserstein gradient flow, demonstrating their conver… ▽ More We examine the infinite-dimensional optimization problem of finding a decomposition of a probability measure into K probability sub-measures to minimize specific loss functions inspired by applications in clustering and user grouping. We analytically explore the structures of the support of optimal sub-measures and introduce algorithms based on Wasserstein gradient flow, demonstrating their convergence. Numerical results illustrate the implementability of our algorithms and provide further insights. △ Less

Submitted 2 June, 2024; originally announced June 2024.

arXiv:2405.03082 [pdf, other]

Finite-Time Convergence and Sample Complexity of Actor-Critic Multi-Objective Reinforcement Learning

Authors: Tianchen Zhou, FNU Hairi, Haibo Yang, Jia Liu, Tian Tong, Fan Yang, Michinari Momma, Yan Gao

Abstract: Reinforcement learning with multiple, potentially conflicting objectives is pervasive in real-world applications, while this problem remains theoretically under-explored. This paper tackles the multi-objective reinforcement learning (MORL) problem and introduces an innovative actor-critic algorithm named MOAC which finds a policy by iteratively making trade-offs among conflicting reward signals. N… ▽ More Reinforcement learning with multiple, potentially conflicting objectives is pervasive in real-world applications, while this problem remains theoretically under-explored. This paper tackles the multi-objective reinforcement learning (MORL) problem and introduces an innovative actor-critic algorithm named MOAC which finds a policy by iteratively making trade-offs among conflicting reward signals. Notably, we provide the first analysis of finite-time Pareto-stationary convergence and corresponding sample complexity in both discounted and average reward settings. Our approach has two salient features: (a) MOAC mitigates the cumulative estimation bias resulting from finding an optimal common gradient descent direction out of stochastic samples. This enables provable convergence rate and sample complexity guarantees independent of the number of objectives; (b) With proper momentum coefficient, MOAC initializes the weights of individual policy gradients using samples from the environment, instead of manual initialization. This enhances the practicality and robustness of our algorithm. Finally, experiments conducted on a real-world dataset validate the effectiveness of our proposed method. △ Less

Submitted 9 May, 2024; v1 submitted 5 May, 2024; originally announced May 2024.

Comments: Accepted in ICML 2024

arXiv:2404.16484 [pdf, other]

Real-Time 4K Super-Resolution of Compressed AVIF Images. AIS 2024 Challenge Survey

Authors: Marcos V. Conde, Zhijun Lei, Wen Li, Cosmin Stejerean, Ioannis Katsavounidis, Radu Timofte, Kihwan Yoon, Ganzorig Gankhuyag, Jiangtao Lv, Long Sun, Jinshan Pan, Jiangxin Dong, Jinhui Tang, Zhiyuan Li, Hao Wei, Chenyang Ge, Dongyang Zhang, Tianle Liu, Huaian Chen, Yi Jin, Menghan Zhou, Yiqiang Yan, Si Gao, Biao Wu, Shaoli Liu , et al. (50 additional authors not shown)

Abstract: This paper introduces a novel benchmark as part of the AIS 2024 Real-Time Image Super-Resolution (RTSR) Challenge, which aims to upscale compressed images from 540p to 4K resolution (4x factor) in real-time on commercial GPUs. For this, we use a diverse test set containing a variety of 4K images ranging from digital art to gaming and photography. The images are compressed using the modern AVIF cod… ▽ More This paper introduces a novel benchmark as part of the AIS 2024 Real-Time Image Super-Resolution (RTSR) Challenge, which aims to upscale compressed images from 540p to 4K resolution (4x factor) in real-time on commercial GPUs. For this, we use a diverse test set containing a variety of 4K images ranging from digital art to gaming and photography. The images are compressed using the modern AVIF codec, instead of JPEG. All the proposed methods improve PSNR fidelity over Lanczos interpolation, and process images under 10ms. Out of the 160 participants, 25 teams submitted their code and models. The solutions present novel designs tailored for memory-efficiency and runtime on edge devices. This survey describes the best solutions for real-time SR of compressed high-resolution images. △ Less

Submitted 25 April, 2024; originally announced April 2024.

Comments: CVPR 2024, AI for Streaming (AIS) Workshop

arXiv:2404.14248 [pdf, other]

NTIRE 2024 Challenge on Low Light Image Enhancement: Methods and Results

Authors: Xiaoning Liu, Zongwei Wu, Ao Li, Florin-Alexandru Vasluianu, Yulun Zhang, Shuhang Gu, Le Zhang, Ce Zhu, Radu Timofte, Zhi Jin, Hongjun Wu, Chenxi Wang, Haitao Ling, Yuanhao Cai, Hao Bian, Yuxin Zheng, Jing Lin, Alan Yuille, Ben Shao, Jin Guo, Tianli Liu, Mohao Wu, Yixu Feng, Shuo Hou, Haotian Lin , et al. (87 additional authors not shown)

Abstract: This paper reviews the NTIRE 2024 low light image enhancement challenge, highlighting the proposed solutions and results. The aim of this challenge is to discover an effective network design or solution capable of generating brighter, clearer, and visually appealing results when dealing with a variety of conditions, including ultra-high resolution (4K and beyond), non-uniform illumination, backlig… ▽ More This paper reviews the NTIRE 2024 low light image enhancement challenge, highlighting the proposed solutions and results. The aim of this challenge is to discover an effective network design or solution capable of generating brighter, clearer, and visually appealing results when dealing with a variety of conditions, including ultra-high resolution (4K and beyond), non-uniform illumination, backlighting, extreme darkness, and night scenes. A notable total of 428 participants registered for the challenge, with 22 teams ultimately making valid submissions. This paper meticulously evaluates the state-of-the-art advancements in enhancing low-light images, reflecting the significant progress and creativity in this field. △ Less

Submitted 22 April, 2024; originally announced April 2024.

Comments: NTIRE 2024 Challenge Report

arXiv:2403.15803 [pdf, other]

Innovative Quantitative Analysis for Disease Progression Assessment in Familial Cerebral Cavernous Malformations

Authors: Ruige Zong, Tao Wang, Chunwang Li, Xinlin Zhang, Yuanbin Chen, Longxuan Zhao, Qixuan Li, Qinquan Gao, Dezhi Kang, Fuxin Lin, Tong Tong

Abstract: Familial cerebral cavernous malformation (FCCM) is a hereditary disorder characterized by abnormal vascular structures within the central nervous system. The FCCM lesions are often numerous and intricate, making quantitative analysis of the lesions a labor-intensive task. Consequently, clinicians face challenges in quantitatively assessing the severity of lesions and determining whether lesions ha… ▽ More Familial cerebral cavernous malformation (FCCM) is a hereditary disorder characterized by abnormal vascular structures within the central nervous system. The FCCM lesions are often numerous and intricate, making quantitative analysis of the lesions a labor-intensive task. Consequently, clinicians face challenges in quantitatively assessing the severity of lesions and determining whether lesions have progressed. To alleviate this problem, we propose a quantitative statistical framework for FCCM, comprising an efficient annotation module, an FCCM lesion segmentation module, and an FCCM lesion quantitative statistics module. Our framework demonstrates precise segmentation of the FCCM lesion based on efficient data annotation, achieving a Dice coefficient of 93.22\%. More importantly, we focus on quantitative statistics of lesions, which is combined with image registration to realize the quantitative comparison of lesions between different examinations of patients, and a visualization framework has been established for doctors to comprehensively compare and analyze lesions. The experimental results have demonstrated that our proposed framework not only obtains objective, accurate, and comprehensive quantitative statistical information, which provides a quantitative assessment method for disease progression and drug efficacy study, but also considerably reduces the manual measurement and statistical workload of lesions, assisting clinical decision-making for FCCM and accelerating progress in FCCM clinical research. This highlights the potential of practical application of the framework in FCCM clinical research and clinical decision-making. The codes are available at https://github.com/6zrg/Quantitative-Statistics-of-FCCM. △ Less

Submitted 23 March, 2024; originally announced March 2024.

arXiv:2312.17538 [pdf, other]

Distance Guided Generative Adversarial Network for Explainable Binary Classifications

Authors: Xiangyu Xiong, Yue Sun, Xiaohong Liu, Wei Ke, Chan-Tong Lam, Jiangang Chen, Mingfeng Jiang, Mingwei Wang, Hui Xie, Tong Tong, Qinquan Gao, Hao Chen, Tao Tan

Abstract: Despite the potential benefits of data augmentation for mitigating the data insufficiency, traditional augmentation methods primarily rely on the prior intra-domain knowledge. On the other hand, advanced generative adversarial networks (GANs) generate inter-domain samples with limited variety. These previous methods make limited contributions to describing the decision boundaries for binary classi… ▽ More Despite the potential benefits of data augmentation for mitigating the data insufficiency, traditional augmentation methods primarily rely on the prior intra-domain knowledge. On the other hand, advanced generative adversarial networks (GANs) generate inter-domain samples with limited variety. These previous methods make limited contributions to describing the decision boundaries for binary classification. In this paper, we propose a distance guided GAN (DisGAN) which controls the variation degrees of generated samples in the hyperplane space. Specifically, we instantiate the idea of DisGAN by combining two ways. The first way is vertical distance GAN (VerDisGAN) where the inter-domain generation is conditioned on the vertical distances. The second way is horizontal distance GAN (HorDisGAN) where the intra-domain generation is conditioned on the horizontal distances. Furthermore, VerDisGAN can produce the class-specific regions by mapping the source images to the hyperplane. Experimental results show that DisGAN consistently outperforms the GAN-based augmentation methods with explainable binary classification. The proposed method can apply to different classification architectures and has potential to extend to multi-class classification. △ Less

Submitted 29 December, 2023; originally announced December 2023.

Comments: 12 pages, 8 figures. This work has been submitted to the IEEE TNNLS for possible publication. Personal use of this material is permitted. Permission from IEEE must be obtained for all other uses, in any current or future media

arXiv:2312.07934 [pdf, other]

Toward Real World Stereo Image Super-Resolution via Hybrid Degradation Model and Discriminator for Implied Stereo Image Information

Authors: Yuanbo Zhou, Yuyang Xue, Jiang Bi, Wenlin He, Xinlin Zhang, Jiajun Zhang, Wei Deng, Ruofeng Nie, Junlin Lan, Qinquan Gao, Tong Tong

Abstract: Real-world stereo image super-resolution has a significant influence on enhancing the performance of computer vision systems. Although existing methods for single-image super-resolution can be applied to improve stereo images, these methods often introduce notable modifications to the inherent disparity, resulting in a loss in the consistency of disparity between the original and the enhanced ster… ▽ More Real-world stereo image super-resolution has a significant influence on enhancing the performance of computer vision systems. Although existing methods for single-image super-resolution can be applied to improve stereo images, these methods often introduce notable modifications to the inherent disparity, resulting in a loss in the consistency of disparity between the original and the enhanced stereo images. To overcome this limitation, this paper proposes a novel approach that integrates a implicit stereo information discriminator and a hybrid degradation model. This combination ensures effective enhancement while preserving disparity consistency. The proposed method bridges the gap between the complex degradations in real-world stereo domain and the simpler degradations in real-world single-image super-resolution domain. Our results demonstrate impressive performance on synthetic and real datasets, enhancing visual perception while maintaining disparity consistency. The complete code is available at the following \href{https://github.com/fzuzyb/SCGLANet}{link}. △ Less

Submitted 13 December, 2023; originally announced December 2023.

arXiv:2312.02585 [pdf, other]

CVE representation to build attack positions graphs

Authors: Manuel Poisson, Valérie Viet Triem Tong, Gilles Guette, Frédéric Guihéry, Damien Crémilleux

Abstract: In cybersecurity, CVEs (Common Vulnerabilities and Exposures) are publicly disclosed hardware or software vulnerabilities. These vulnerabilities are documented and listed in the NVD database maintained by the NIST. Knowledge of the CVEs impacting an information system provides a measure of its level of security. This article points out that these vulnerabilities should be described in greater deta… ▽ More In cybersecurity, CVEs (Common Vulnerabilities and Exposures) are publicly disclosed hardware or software vulnerabilities. These vulnerabilities are documented and listed in the NVD database maintained by the NIST. Knowledge of the CVEs impacting an information system provides a measure of its level of security. This article points out that these vulnerabilities should be described in greater detail to understand how they could be chained together in a complete attack scenario. This article presents the first proposal for the CAPG format, which is a method for representing a CVE vulnerability, a corresponding exploit, and associated attack positions. △ Less

Submitted 5 December, 2023; originally announced December 2023.

Journal ref: CyberHunt 2023, Workshop on Cyber Threat Intelligence and Hunting, IEEE BigData, Dec 2023, Sorrento, Italy. pp.1-5

arXiv:2311.14388 [pdf, other]

A Parameterized Generative Adversarial Network Using Cyclic Projection for Explainable Medical Image Classification

Authors: Xiangyu Xiong, Yue Sun, Xiaohong Liu, Chan-Tong Lam, Tong Tong, Hao Chen, Qinquan Gao, Wei Ke, Tao Tan

Abstract: Although current data augmentation methods are successful to alleviate the data insufficiency, conventional augmentation are primarily intra-domain while advanced generative adversarial networks (GANs) generate images remaining uncertain, particularly in small-scale datasets. In this paper, we propose a parameterized GAN (ParaGAN) that effectively controls the changes of synthetic samples among do… ▽ More Although current data augmentation methods are successful to alleviate the data insufficiency, conventional augmentation are primarily intra-domain while advanced generative adversarial networks (GANs) generate images remaining uncertain, particularly in small-scale datasets. In this paper, we propose a parameterized GAN (ParaGAN) that effectively controls the changes of synthetic samples among domains and highlights the attention regions for downstream classification. Specifically, ParaGAN incorporates projection distance parameters in cyclic projection and projects the source images to the decision boundary to obtain the class-difference maps. Our experiments show that ParaGAN can consistently outperform the existing augmentation methods with explainable classification on two small-scale medical datasets. △ Less

Submitted 14 December, 2023; v1 submitted 24 November, 2023; originally announced November 2023.

Comments: 5 pages, 4 figures. This work has been submitted to the IEEE ICASSP for possible publication. Copyright may be transferred without notice, after which this version may no longer be accessible

arXiv:2311.10349 [pdf, other]

Pseudo Label-Guided Data Fusion and Output Consistency for Semi-Supervised Medical Image Segmentation

Authors: Tao Wang, Yuanbin Chen, Xinlin Zhang, Yuanbo Zhou, Junlin Lan, Bizhe Bai, Tao Tan, Min Du, Qinquan Gao, Tong Tong

Abstract: Supervised learning algorithms based on Convolutional Neural Networks have become the benchmark for medical image segmentation tasks, but their effectiveness heavily relies on a large amount of labeled data. However, annotating medical image datasets is a laborious and time-consuming process. Inspired by semi-supervised algorithms that use both labeled and unlabeled data for training, we propose t… ▽ More Supervised learning algorithms based on Convolutional Neural Networks have become the benchmark for medical image segmentation tasks, but their effectiveness heavily relies on a large amount of labeled data. However, annotating medical image datasets is a laborious and time-consuming process. Inspired by semi-supervised algorithms that use both labeled and unlabeled data for training, we propose the PLGDF framework, which builds upon the mean teacher network for segmenting medical images with less annotation. We propose a novel pseudo-label utilization scheme, which combines labeled and unlabeled data to augment the dataset effectively. Additionally, we enforce the consistency between different scales in the decoder module of the segmentation network and propose a loss function suitable for evaluating the consistency. Moreover, we incorporate a sharpening operation on the predicted results, further enhancing the accuracy of the segmentation. Extensive experiments on three publicly available datasets demonstrate that the PLGDF framework can largely improve performance by incorporating the unlabeled data. Meanwhile, our framework yields superior performance compared to six state-of-the-art semi-supervised learning methods. The codes of this study are available at https://github.com/ortonwang/PLGDF. △ Less

Submitted 17 November, 2023; originally announced November 2023.

arXiv:2310.06159 [pdf, other]

Provably Accelerating Ill-Conditioned Low-rank Estimation via Scaled Gradient Descent, Even with Overparameterization

Authors: Cong Ma, Xingyu Xu, Tian Tong, Yuejie Chi

Abstract: Many problems encountered in science and engineering can be formulated as estimating a low-rank object (e.g., matrices and tensors) from incomplete, and possibly corrupted, linear measurements. Through the lens of matrix and tensor factorization, one of the most popular approaches is to employ simple iterative algorithms such as gradient descent (GD) to recover the low-rank factors directly, which… ▽ More Many problems encountered in science and engineering can be formulated as estimating a low-rank object (e.g., matrices and tensors) from incomplete, and possibly corrupted, linear measurements. Through the lens of matrix and tensor factorization, one of the most popular approaches is to employ simple iterative algorithms such as gradient descent (GD) to recover the low-rank factors directly, which allow for small memory and computation footprints. However, the convergence rate of GD depends linearly, and sometimes even quadratically, on the condition number of the low-rank object, and therefore, GD slows down painstakingly when the problem is ill-conditioned. This chapter introduces a new algorithmic approach, dubbed scaled gradient descent (ScaledGD), that provably converges linearly at a constant rate independent of the condition number of the low-rank object, while maintaining the low per-iteration cost of gradient descent for a variety of tasks including sensing, robust principal component analysis and completion. In addition, ScaledGD continues to admit fast global convergence to the minimax-optimal solution, again almost independent of the condition number, from a small random initialization when the rank is over-specified in the presence of Gaussian noise. In total, ScaledGD highlights the power of appropriate preconditioning in accelerating nonconvex statistical estimation, where the iteration-varying preconditioners promote desirable invariance properties of the trajectory with respect to the symmetry in low-rank factorization without hurting generalization. △ Less

Submitted 9 October, 2023; originally announced October 2023.

Comments: Book chapter for "Explorations in the Mathematics of Data Science - The Inaugural Volume of the Center for Approximation and Mathematical Data Analytics". arXiv admin note: text overlap with arXiv:2104.14526

arXiv:2308.16573 [pdf, other]

Dual-Decoder Consistency via Pseudo-Labels Guided Data Augmentation for Semi-Supervised Medical Image Segmentation

Authors: Yuanbin Chen, Tao Wang, Hui Tang, Longxuan Zhao, Ruige Zong, Shun Chen, Tao Tan, Xinlin Zhang, Tong Tong

Abstract: While supervised learning has achieved remarkable success, obtaining large-scale labeled datasets in biomedical imaging is often impractical due to high costs and the time-consuming annotations required from radiologists. Semi-supervised learning emerges as an effective strategy to overcome this limitation by leveraging useful information from unlabeled datasets. In this paper, we present a novel… ▽ More While supervised learning has achieved remarkable success, obtaining large-scale labeled datasets in biomedical imaging is often impractical due to high costs and the time-consuming annotations required from radiologists. Semi-supervised learning emerges as an effective strategy to overcome this limitation by leveraging useful information from unlabeled datasets. In this paper, we present a novel semi-supervised learning method, Dual-Decoder Consistency via Pseudo-Labels Guided Data Augmentation (DCPA), for medical image segmentation. We devise a consistency regularization to promote consistent representations during the training process. Specifically, we use distinct decoders for student and teacher networks while maintain the same encoder. Moreover, to learn from unlabeled data, we create pseudo-labels generated by the teacher networks and augment the training data with the pseudo-labels. Both techniques contribute to enhancing the performance of the proposed method. The method is evaluated on three representative medical image segmentation datasets. Comprehensive comparisons with state-of-the-art semi-supervised medical image segmentation methods were conducted under typical scenarios, utilizing 10% and 20% labeled data, as well as in the extreme scenario of only 5% labeled data. The experimental results consistently demonstrate the superior performance of our method compared to other methods across the three semi-supervised settings. The source code is publicly available at https://github.com/BinYCn/DCPA.git. △ Less

Submitted 18 January, 2024; v1 submitted 31 August, 2023; originally announced August 2023.

arXiv:2306.16918 [pdf, other]

PCDAL: A Perturbation Consistency-Driven Active Learning Approach for Medical Image Segmentation and Classification

Authors: Tao Wang, Xinlin Zhang, Yuanbo Zhou, Junlin Lan, Tao Tan, Min Du, Qinquan Gao, Tong Tong

Abstract: In recent years, deep learning has become a breakthrough technique in assisting medical image diagnosis. Supervised learning using convolutional neural networks (CNN) provides state-of-the-art performance and has served as a benchmark for various medical image segmentation and classification. However, supervised learning deeply relies on large-scale annotated data, which is expensive, time-consumi… ▽ More In recent years, deep learning has become a breakthrough technique in assisting medical image diagnosis. Supervised learning using convolutional neural networks (CNN) provides state-of-the-art performance and has served as a benchmark for various medical image segmentation and classification. However, supervised learning deeply relies on large-scale annotated data, which is expensive, time-consuming, and even impractical to acquire in medical imaging applications. Active Learning (AL) methods have been widely applied in natural image classification tasks to reduce annotation costs by selecting more valuable examples from the unlabeled data pool. However, their application in medical image segmentation tasks is limited, and there is currently no effective and universal AL-based method specifically designed for 3D medical image segmentation. To address this limitation, we propose an AL-based method that can be simultaneously applied to 2D medical image classification, segmentation, and 3D medical image segmentation tasks. We extensively validated our proposed active learning method on three publicly available and challenging medical image datasets, Kvasir Dataset, COVID-19 Infection Segmentation Dataset, and BraTS2019 Dataset. The experimental results demonstrate that our PCDAL can achieve significantly improved performance with fewer annotations in 2D classification and segmentation and 3D segmentation tasks. The codes of this study are available at https://github.com/ortonwang/PCDAL. △ Less

Submitted 29 June, 2023; originally announced June 2023.

arXiv:2303.17373 [pdf, other]

URSID: Using formalism to Refine attack Scenarios for vulnerable Infrastructure Deployment

Authors: Pierre-Victor Besson, Valérie Viet Triem Tong, Gilles Guette, Guillaume Piolle, Erwan Abgrall

Abstract: In this paper we propose a novel way of deploying vulnerable architectures for defense and research purposes, which aims to generate deception platforms based on the formal description of a scenario. An attack scenario is described by an attack graph in which transitions are labeled by ATT&CK techniques or procedures. The state of the attacker is modeled as a set of secrets he acquires and a set o… ▽ More In this paper we propose a novel way of deploying vulnerable architectures for defense and research purposes, which aims to generate deception platforms based on the formal description of a scenario. An attack scenario is described by an attack graph in which transitions are labeled by ATT&CK techniques or procedures. The state of the attacker is modeled as a set of secrets he acquires and a set of nodes he controls. Descriptions of a single scenario on a technical level can then be declined into several different scenarios on a procedural level, and each of these scenarios can be deployed into its own vulnerable architecture. To achieve this goal we introduce the notion of architecture constraints, as some procedures may only be exploited on system presenting special properties, such as having a specific operating system version. Finally, we present our deployment process for converting one of these scenarios into a vulnerable infrastructure, and offer an online proof of concept demonstration of our tool, where readers may deploy locally deploy a complete scenario inspired by the threat actor APT-29. △ Less

Submitted 30 March, 2023; originally announced March 2023.

Comments: 13 pages, 9 figures

arXiv:2211.13955 [pdf, other]

MPCViT: Searching for Accurate and Efficient MPC-Friendly Vision Transformer with Heterogeneous Attention

Authors: Wenxuan Zeng, Meng Li, Wenjie Xiong, Tong Tong, Wen-jie Lu, Jin Tan, Runsheng Wang, Ru Huang

Abstract: Secure multi-party computation (MPC) enables computation directly on encrypted data and protects both data and model privacy in deep learning inference. However, existing neural network architectures, including Vision Transformers (ViTs), are not designed or optimized for MPC and incur significant latency overhead. We observe Softmax accounts for the major latency bottleneck due to a high communic… ▽ More Secure multi-party computation (MPC) enables computation directly on encrypted data and protects both data and model privacy in deep learning inference. However, existing neural network architectures, including Vision Transformers (ViTs), are not designed or optimized for MPC and incur significant latency overhead. We observe Softmax accounts for the major latency bottleneck due to a high communication complexity, but can be selectively replaced or linearized without compromising the model accuracy. Hence, in this paper, we propose an MPC-friendly ViT, dubbed MPCViT, to enable accurate yet efficient ViT inference in MPC. Based on a systematic latency and accuracy evaluation of the Softmax attention and other attention variants, we propose a heterogeneous attention optimization space. We also develop a simple yet effective MPC-aware neural architecture search algorithm for fast Pareto optimization. To further boost the inference efficiency, we propose MPCViT+, to jointly optimize the Softmax attention and other network components, including GeLU, matrix multiplication, etc. With extensive experiments, we demonstrate that MPCViT achieves 1.9%, 1.3% and 3.6% higher accuracy with 6.2x, 2.9x and 1.9x latency reduction compared with baseline ViT, MPCFormer and THE-X on the Tiny-ImageNet dataset, respectively. MPCViT+ further achieves a better Pareto front compared with MPCViT. The code and models for evaluation are available at https://github.com/PKU-SEC-Lab/mpcvit. △ Less

Submitted 19 August, 2023; v1 submitted 25 November, 2022; originally announced November 2022.

Comments: Accepted by ICCV 2023 conference

arXiv:2210.06447 [pdf, other]

Sampling in Constrained Domains with Orthogonal-Space Variational Gradient Descent

Authors: Ruqi Zhang, Qiang Liu, Xin T. Tong

Abstract: Sampling methods, as important inference and learning techniques, are typically designed for unconstrained domains. However, constraints are ubiquitous in machine learning problems, such as those on safety, fairness, robustness, and many other properties that must be satisfied to apply sampling results in real-life applications. Enforcing these constraints often leads to implicitly-defined manifol… ▽ More Sampling methods, as important inference and learning techniques, are typically designed for unconstrained domains. However, constraints are ubiquitous in machine learning problems, such as those on safety, fairness, robustness, and many other properties that must be satisfied to apply sampling results in real-life applications. Enforcing these constraints often leads to implicitly-defined manifolds, making efficient sampling with constraints very challenging. In this paper, we propose a new variational framework with a designed orthogonal-space gradient flow (O-Gradient) for sampling on a manifold $\mathcal{G}_0$ defined by general equality constraints. O-Gradient decomposes the gradient into two parts: one decreases the distance to $\mathcal{G}_0$ and the other decreases the KL divergence in the orthogonal space. While most existing manifold sampling methods require initialization on $\mathcal{G}_0$, O-Gradient does not require such prior knowledge. We prove that O-Gradient converges to the target constrained distribution with rate $\widetilde{O}(1/\text{the number of iterations})$ under mild conditions. Our proof relies on a new Stein characterization of conditional measure which could be of independent interest. We implement O-Gradient through both Langevin dynamics and Stein variational gradient descent and demonstrate its effectiveness in various experiments, including Bayesian deep neural networks. △ Less

Submitted 12 October, 2022; originally announced October 2022.

Comments: NeurIPS 2022

arXiv:2207.01208 [pdf, other]

Attributed Abnormality Graph Embedding for Clinically Accurate X-Ray Report Generation

Authors: Sixing Yan, William K. Cheung, Keith Chiu, Terence M. Tong, Charles K. Cheung, Simon See

Abstract: Automatic generation of medical reports from X-ray images can assist radiologists to perform the time-consuming and yet important reporting task. Yet, achieving clinically accurate generated reports remains challenging. Modeling the underlying abnormalities using the knowledge graph approach has been found promising in enhancing the clinical accuracy. In this paper, we introduce a novel fined-grai… ▽ More Automatic generation of medical reports from X-ray images can assist radiologists to perform the time-consuming and yet important reporting task. Yet, achieving clinically accurate generated reports remains challenging. Modeling the underlying abnormalities using the knowledge graph approach has been found promising in enhancing the clinical accuracy. In this paper, we introduce a novel fined-grained knowledge graph structure called an attributed abnormality graph (ATAG). The ATAG consists of interconnected abnormality nodes and attribute nodes, allowing it to better capture the abnormality details. In contrast to the existing methods where the abnormality graph was constructed manually, we propose a methodology to automatically construct the fine-grained graph structure based on annotations, medical reports in X-ray datasets, and the RadLex radiology lexicon. We then learn the ATAG embedding using a deep model with an encoder-decoder architecture for the report generation. In particular, graph attention networks are explored to encode the relationships among the abnormalities and their attributes. A gating mechanism is adopted and integrated with various decoders for the generation. We carry out extensive experiments based on the benchmark datasets, and show that the proposed ATAG-based deep model outperforms the SOTA methods by a large margin and can improve the clinical accuracy of the generated reports. △ Less

Submitted 5 July, 2022; v1 submitted 4 July, 2022; originally announced July 2022.

Comments: 14 pages, 7 figures

arXiv:2206.09109 [pdf, other]

Fast and Provable Tensor Robust Principal Component Analysis via Scaled Gradient Descent

Authors: Harry Dong, Tian Tong, Cong Ma, Yuejie Chi

Abstract: An increasing number of data science and machine learning problems rely on computation with tensors, which better capture the multi-way relationships and interactions of data than matrices. When tapping into this critical advantage, a key challenge is to develop computationally efficient and provably correct algorithms for extracting useful information from tensor data that are simultaneously robu… ▽ More An increasing number of data science and machine learning problems rely on computation with tensors, which better capture the multi-way relationships and interactions of data than matrices. When tapping into this critical advantage, a key challenge is to develop computationally efficient and provably correct algorithms for extracting useful information from tensor data that are simultaneously robust to corruptions and ill-conditioning. This paper tackles tensor robust principal component analysis (RPCA), which aims to recover a low-rank tensor from its observations contaminated by sparse corruptions, under the Tucker decomposition. To minimize the computation and memory footprints, we propose to directly recover the low-dimensional tensor factors -- starting from a tailored spectral initialization -- via scaled gradient descent (ScaledGD), coupled with an iteration-varying thresholding operation to adaptively remove the impact of corruptions. Theoretically, we establish that the proposed algorithm converges linearly to the true low-rank tensor at a constant rate that is independent with its condition number, as long as the level of corruptions is not too large. Empirically, we demonstrate that the proposed algorithm achieves better and more scalable performance than state-of-the-art matrix and tensor RPCA algorithms through synthetic experiments and real-world applications. △ Less

Submitted 22 February, 2023; v1 submitted 18 June, 2022; originally announced June 2022.

arXiv:2205.08098 [pdf, other]

Can We Do Better Than Random Start? The Power of Data Outsourcing

Authors: Yi Chen, Jing Dong, Xin T. Tong

Abstract: Many organizations have access to abundant data but lack the computational power to process the data. While they can outsource the computational task to other facilities, there are various constraints on the amount of data that can be shared. It is natural to ask what can data outsourcing accomplish under such constraints. We address this question from a machine learning perspective. When training… ▽ More Many organizations have access to abundant data but lack the computational power to process the data. While they can outsource the computational task to other facilities, there are various constraints on the amount of data that can be shared. It is natural to ask what can data outsourcing accomplish under such constraints. We address this question from a machine learning perspective. When training a model with optimization algorithms, the quality of the results often relies heavily on the points where the algorithms are initialized. Random start is one of the most popular methods to tackle this issue, but it can be computationally expensive and not feasible for organizations lacking computing resources. Based on three different scenarios, we propose simulation-based algorithms that can utilize a small amount of outsourced data to find good initial points accordingly. Under suitable regularity conditions, we provide theoretical guarantees showing the algorithms can find good initial points with high probability. We also conduct numerical experiments to demonstrate that our algorithms perform significantly better than the random start approach. △ Less

Submitted 17 May, 2022; originally announced May 2022.

Comments: 22 pages, 5 figures

arXiv:2202.02850 [pdf, ps, other]

Stochastic Gradient Descent with Dependent Data for Offline Reinforcement Learning

Authors: Jing Dong, Xin T. Tong

Abstract: In reinforcement learning (RL), offline learning decoupled learning from data collection and is useful in dealing with exploration-exploitation tradeoff and enables data reuse in many applications. In this work, we study two offline learning tasks: policy evaluation and policy learning. For policy evaluation, we formulate it as a stochastic optimization problem and show that it can be solved using… ▽ More In reinforcement learning (RL), offline learning decoupled learning from data collection and is useful in dealing with exploration-exploitation tradeoff and enables data reuse in many applications. In this work, we study two offline learning tasks: policy evaluation and policy learning. For policy evaluation, we formulate it as a stochastic optimization problem and show that it can be solved using approximate stochastic gradient descent (aSGD) with time-dependent data. We show aSGD achieves $\tilde O(1/t)$ convergence when the loss function is strongly convex and the rate is independent of the discount factor $γ$. This result can be extended to include algorithms making approximately contractive iterations such as TD(0). The policy evaluation algorithm is then combined with the policy iteration algorithm to learn the optimal policy. To achieve an $ε$ accuracy, the complexity of the algorithm is $\tilde O(ε^{-2}(1-γ)^{-5})$, which matches the complexity bound for classic online RL algorithms such as Q-learning. △ Less

Submitted 6 February, 2022; originally announced February 2022.

arXiv:2105.01968 [pdf]

Comparing Field Trips, VR Experiences and Video Representations on Spatial Layout Learning in Complex Buildings

Authors: Cetin Tuker, Togan Tong

Abstract: This study aimed to compare and investigate the efficacy of the real-world experiences, immersive virtual reality (IVR) experiences, and video walkthrough representations on layout-learning in a complex building. A quasi-experimental, intervention, and delayed post-test research design was used among three groups: real-world, IVR, and video walkthrough representation. A total of 41 first-year desi… ▽ More This study aimed to compare and investigate the efficacy of the real-world experiences, immersive virtual reality (IVR) experiences, and video walkthrough representations on layout-learning in a complex building. A quasi-experimental, intervention, and delayed post-test research design was used among three groups: real-world, IVR, and video walkthrough representation. A total of 41 first-year design students from architecture, and game design departments were attended the study. Design students were selected as they already know how to communicate graphically the layout of a building on paper. IVR and video walkthrough groups experienced the representations of the building by themselves, but real-world group experienced the building within a group as imitating a design education field trip. After 10 days, a total of 26 participants out of 41 tested for spatial recall performance. Recall performances were measured as an indicator of layout learning by analysing the plan sketches drawn by the participants. Results showed IVR group recalled significantly more spatial memories compared with the real-world and video walkthrough representation groups. Real-world group performed the worst among three groups. This result was interpreted to three conclusions. First, the layout learning tends to be overly sensitive to distractions and lower levels of distraction may lead to better levels of layout learning. Second, for the domain of layout knowledge learning, a remarkably simple IVR model is enough. Third, although real-world experiences give direct and richer sensory information about the spaces, layout knowledge acquisition from the surrounding environment requires psychologically active wayfinding decisions. △ Less

Submitted 5 May, 2021; originally announced May 2021.

arXiv:2104.14526 [pdf, ps, other]

Scaling and Scalability: Provable Nonconvex Low-Rank Tensor Estimation from Incomplete Measurements

Authors: Tian Tong, Cong Ma, Ashley Prater-Bennette, Erin Tripp, Yuejie Chi

Abstract: Tensors, which provide a powerful and flexible model for representing multi-attribute data and multi-way interactions, play an indispensable role in modern data science across various fields in science and engineering. A fundamental task is to faithfully recover the tensor from highly incomplete measurements in a statistically and computationally efficient manner. Harnessing the low-rank structure… ▽ More Tensors, which provide a powerful and flexible model for representing multi-attribute data and multi-way interactions, play an indispensable role in modern data science across various fields in science and engineering. A fundamental task is to faithfully recover the tensor from highly incomplete measurements in a statistically and computationally efficient manner. Harnessing the low-rank structure of tensors in the Tucker decomposition, this paper develops a scaled gradient descent (ScaledGD) algorithm to directly recover the tensor factors with tailored spectral initializations, and shows that it provably converges at a linear rate independent of the condition number of the ground truth tensor for two canonical problems -- tensor completion and tensor regression -- as soon as the sample size is above the order of $n^{3/2}$ ignoring other parameter dependencies, where $n$ is the dimension of the tensor. This leads to an extremely scalable approach to low-rank tensor estimation compared with prior art, which suffers from at least one of the following drawbacks: extreme sensitivity to ill-conditioning, high per-iteration costs in terms of memory and computation, or poor sample complexity guarantees. To the best of our knowledge, ScaledGD is the first algorithm that achieves near-optimal statistical and computational complexities simultaneously for low-rank tensor completion with the Tucker decomposition. Our algorithm highlights the power of appropriate preconditioning in accelerating nonconvex statistical estimation, where the iteration-varying preconditioners promote desirable invariance properties of the trajectory with respect to the underlying symmetry in low-rank tensor factorization. △ Less

Submitted 21 June, 2022; v1 submitted 29 April, 2021; originally announced April 2021.

Comments: Accepted to Journal of Machine Learning Research

arXiv:2010.13364 [pdf, ps, other]

doi 10.1109/TSP.2021.3071560

Low-Rank Matrix Recovery with Scaled Subgradient Methods: Fast and Robust Convergence Without the Condition Number

Authors: Tian Tong, Cong Ma, Yuejie Chi

Abstract: Many problems in data science can be treated as estimating a low-rank matrix from highly incomplete, sometimes even corrupted, observations. One popular approach is to resort to matrix factorization, where the low-rank matrix factors are optimized via first-order methods over a smooth loss function, such as the residual sum of squares. While tremendous progresses have been made in recent years, th… ▽ More Many problems in data science can be treated as estimating a low-rank matrix from highly incomplete, sometimes even corrupted, observations. One popular approach is to resort to matrix factorization, where the low-rank matrix factors are optimized via first-order methods over a smooth loss function, such as the residual sum of squares. While tremendous progresses have been made in recent years, the natural smooth formulation suffers from two sources of ill-conditioning, where the iteration complexity of gradient descent scales poorly both with the dimension as well as the condition number of the low-rank matrix. Moreover, the smooth formulation is not robust to corruptions. In this paper, we propose scaled subgradient methods to minimize a family of nonsmooth and nonconvex formulations -- in particular, the residual sum of absolute errors -- which is guaranteed to converge at a fast rate that is almost dimension-free and independent of the condition number, even in the presence of corruptions. We illustrate the effectiveness of our approach when the observation operator satisfies certain mixed-norm restricted isometry properties, and derive state-of-the-art performance guarantees for a variety of problems such as robust low-rank matrix sensing and quadratic sampling. △ Less

Submitted 22 April, 2021; v1 submitted 26 October, 2020; originally announced October 2020.

Comments: Accepted to IEEE Transaction on Signal Processing

arXiv:2009.12204 [pdf, other]

doi 10.5220/0009816703020309

Evasive Windows Malware: Impact on Antiviruses and Possible Countermeasures

Authors: Cédric Herzog, Valérie Viet Triem Tong, Pierre Wilke, Arnaud van Straaten, Jean-Louis Lanet

Abstract: The perpetual opposition between antiviruses and malware leads both parties to evolve continuously. On the one hand, antiviruses put in place solutions that are more and more sophisticated and propose more complex detection techniques in addition to the classic signature analysis. This sophistication leads antiviruses to leave more traces of their presence on the machine they protect. To remain un… ▽ More The perpetual opposition between antiviruses and malware leads both parties to evolve continuously. On the one hand, antiviruses put in place solutions that are more and more sophisticated and propose more complex detection techniques in addition to the classic signature analysis. This sophistication leads antiviruses to leave more traces of their presence on the machine they protect. To remain undetected as long as possible, malware can avoid executing within such environments by hunting down the modifications left by the antiviruses. This paper aims at determining the possibilities for malware to detect the antiviruses and then evaluating the efficiency of these techniques on a panel of antiviruses that are the most used nowadays. We then collect samples showing this kind of behavior and propose to evaluate a countermeasure that creates false artifacts, thus forcing malware to evade. △ Less

Submitted 25 September, 2020; originally announced September 2020.

Journal ref: 17th International Conference on Security and Cryptography, Jul 2020, Lieusaint - Paris, France. pp.302-309

arXiv:2005.08898 [pdf, ps, other]

Accelerating Ill-Conditioned Low-Rank Matrix Estimation via Scaled Gradient Descent

Authors: Tian Tong, Cong Ma, Yuejie Chi

Abstract: Low-rank matrix estimation is a canonical problem that finds numerous applications in signal processing, machine learning and imaging science. A popular approach in practice is to factorize the matrix into two compact low-rank factors, and then optimize these factors directly via simple iterative methods such as gradient descent and alternating minimization. Despite nonconvexity, recent literature… ▽ More Low-rank matrix estimation is a canonical problem that finds numerous applications in signal processing, machine learning and imaging science. A popular approach in practice is to factorize the matrix into two compact low-rank factors, and then optimize these factors directly via simple iterative methods such as gradient descent and alternating minimization. Despite nonconvexity, recent literatures have shown that these simple heuristics in fact achieve linear convergence when initialized properly for a growing number of problems of interest. However, upon closer examination, existing approaches can still be computationally expensive especially for ill-conditioned matrices: the convergence rate of gradient descent depends linearly on the condition number of the low-rank matrix, while the per-iteration cost of alternating minimization is often prohibitive for large matrices. The goal of this paper is to set forth a competitive algorithmic approach dubbed Scaled Gradient Descent (ScaledGD) which can be viewed as pre-conditioned or diagonally-scaled gradient descent, where the pre-conditioners are adaptive and iteration-varying with a minimal computational overhead. With tailored variants for low-rank matrix sensing, robust principal component analysis and matrix completion, we theoretically show that ScaledGD achieves the best of both worlds: it converges linearly at a rate independent of the condition number of the low-rank matrix similar as alternating minimization, while maintaining the low per-iteration cost of gradient descent. Our analysis is also applicable to general loss functions that are restricted strongly convex and smooth over low-rank matrices. To the best of our knowledge, ScaledGD is the first algorithm that provably has such properties over a wide range of low-rank matrix estimation tasks. △ Less

Submitted 14 June, 2021; v1 submitted 18 May, 2020; originally announced May 2020.

Comments: Accepted to Journal of Machine Learning Research

arXiv:2003.13268 [pdf, other]

Architecture Disentanglement for Deep Neural Networks

Authors: Jie Hu, Liujuan Cao, Qixiang Ye, Tong Tong, ShengChuan Zhang, Ke Li, Feiyue Huang, Rongrong Ji, Ling Shao

Abstract: Understanding the inner workings of deep neural networks (DNNs) is essential to provide trustworthy artificial intelligence techniques for practical applications. Existing studies typically involve linking semantic concepts to units or layers of DNNs, but fail to explain the inference process. In this paper, we introduce neural architecture disentanglement (NAD) to fill the gap. Specifically, NAD… ▽ More Understanding the inner workings of deep neural networks (DNNs) is essential to provide trustworthy artificial intelligence techniques for practical applications. Existing studies typically involve linking semantic concepts to units or layers of DNNs, but fail to explain the inference process. In this paper, we introduce neural architecture disentanglement (NAD) to fill the gap. Specifically, NAD learns to disentangle a pre-trained DNN into sub-architectures according to independent tasks, forming information flows that describe the inference processes. We investigate whether, where, and how the disentanglement occurs through experiments conducted with handcrafted and automatically-searched network architectures, on both object-based and scene-based datasets. Based on the experimental results, we present three new findings that provide fresh insights into the inner logic of DNNs. First, DNNs can be divided into sub-architectures for independent tasks. Second, deeper layers do not always correspond to higher semantics. Third, the connection type in a DNN affects how the information flows across layers, leading to different disentanglement behaviors. With NAD, we further explain why DNNs sometimes give wrong predictions. Experimental results show that misclassified images have a high probability of being assigned to task sub-architectures similar to the correct ones. Code will be available at: https://github.com/hujiecpp/NAD. △ Less

Submitted 23 March, 2021; v1 submitted 30 March, 2020; originally announced March 2020.

arXiv:2003.11196 [pdf, ps, other]

Dimension Independent Generalization Error by Stochastic Gradient Descent

Authors: Xi Chen, Qiang Liu, Xin T. Tong

Abstract: One classical canon of statistics is that large models are prone to overfitting, and model selection procedures are necessary for high dimensional data. However, many overparameterized models, such as neural networks, perform very well in practice, although they are often trained with simple online methods and regularization. The empirical success of overparameterized models, which is often known… ▽ More One classical canon of statistics is that large models are prone to overfitting, and model selection procedures are necessary for high dimensional data. However, many overparameterized models, such as neural networks, perform very well in practice, although they are often trained with simple online methods and regularization. The empirical success of overparameterized models, which is often known as benign overfitting, motivates us to have a new look at the statistical generalization theory for online optimization. In particular, we present a general theory on the generalization error of stochastic gradient descent (SGD) solutions for both convex and locally convex loss functions. We further discuss data and model conditions that lead to a ``low effective dimension". Under these conditions, we show that the generalization error either does not depend on the ambient dimension $p$ or depends on $p$ via a poly-logarithmic factor. We also demonstrate that in several widely used statistical models, the ``low effective dimension'' arises naturally in overparameterized settings. The studied statistical applications include both convex models such as linear regression and logistic regression and non-convex models such as $M$-estimator and two-layer neural networks. △ Less

Submitted 4 January, 2021; v1 submitted 24 March, 2020; originally announced March 2020.

Comments: 60 pages, 2 figures

arXiv:1910.10330 [pdf, ps, other]

doi 10.1007/978-3-030-33843-5_15

Stain Style Transfer using Transitive Adversarial Networks

Authors: Shaojin Cai, Yuyang Xue3 Qinquan Gao, Min Du, Gang Chen, Hejun Zhang, Tong Tong

Abstract: Digitized pathological diagnosis has been in increasing demand recently. It is well known that color information is critical to the automatic and visual analysis of pathological slides. However, the color variations due to various factors not only have negative impact on pathologist's diagnosis, but also will reduce the robustness of the algorithms. The factors that cause the color differences are… ▽ More Digitized pathological diagnosis has been in increasing demand recently. It is well known that color information is critical to the automatic and visual analysis of pathological slides. However, the color variations due to various factors not only have negative impact on pathologist's diagnosis, but also will reduce the robustness of the algorithms. The factors that cause the color differences are not only in the process of making the slices, but also in the process of digitization. Different strategies have been proposed to alleviate the color variations. Most of such techniques rely on collecting color statistics to perform color matching across images and highly dependent on a reference template slide. Since the pathological slides between hospitals are usually unpaired, these methods do not yield good matching results. In this work, we propose a novel network that we refer to as Transitive Adversarial Networks (TAN) to transfer the color information among slides from different hospitals or centers. It is not necessary for an expert to pick a representative reference slide in the proposed TAN method. We compare the proposed method with the state-of-the-art methods quantitatively and qualitatively. Compared with the state-of-the-art methods, our method yields an improvement of 0.87dB in terms of PSNR, demonstrating the effectiveness of the proposed TAN method in stain style transfer. △ Less

Submitted 22 October, 2019; originally announced October 2019.

Comments: MICCAI 2019 MLMIR Workshop, Oral Paper

arXiv:1909.04142 [pdf, other]

DaTscan SPECT Image Classification for Parkinson's Disease

Authors: Justin Quan, Lin Xu, Rene Xu, Tyrael Tong, Jean Su

Abstract: Parkinson's Disease (PD) is a neurodegenerative disease that currently does not have a cure. In order to facilitate disease management and reduce the speed of symptom progression, early diagnosis is essential. The current clinical, diagnostic approach is to have radiologists perform human visual analysis of the degeneration of dopaminergic neurons in the substantia nigra region of the brain. Clini… ▽ More Parkinson's Disease (PD) is a neurodegenerative disease that currently does not have a cure. In order to facilitate disease management and reduce the speed of symptom progression, early diagnosis is essential. The current clinical, diagnostic approach is to have radiologists perform human visual analysis of the degeneration of dopaminergic neurons in the substantia nigra region of the brain. Clinically, dopamine levels are monitored through observing dopamine transporter (DaT) activity. One method of DaT activity analysis is performed with the injection of an Iodine-123 fluoropropyl (123I-FP-CIT) tracer combined with single photon emission computerized tomography (SPECT) imaging. The tracer illustrates the region of interest in the resulting DaTscan SPECT images. Human visual analysis is slow and vulnerable to subjectivity between radiologists, so the goal was to develop an introductory implementation of a deep convolutional neural network that can objectively and accurately classify DaTscan SPECT images as Parkinson's Disease or normal. This study illustrates the approach of using a deep convolutional neural network and evaluates its performance on DaTscan SPECT image classification. The data used in this study was obtained through a database provided by the Parkinson's Progression Markers Initiative (PPMI). The deep neural network in this study utilizes the InceptionV3 architecture, 1st runner up in the 2015 ImageNet Large Scale Visual Recognition Competition (ILSVRC), as a base model. A custom, binary classifier block was added on top of this base. In order to account for the small dataset size, a ten fold cross validation was implemented to evaluate the model's performance. △ Less

Submitted 9 September, 2019; originally announced September 2019.

arXiv:1904.13016 [pdf, ps, other]

On Stationary-Point Hitting Time and Ergodicity of Stochastic Gradient Langevin Dynamics

Authors: Xi Chen, Simon S. Du, Xin T. Tong

Abstract: Stochastic gradient Langevin dynamics (SGLD) is a fundamental algorithm in stochastic optimization. Recent work by Zhang et al. [2017] presents an analysis for the hitting time of SGLD for the first and second order stationary points. The proof in Zhang et al. [2017] is a two-stage procedure through bounding the Cheeger's constant, which is rather complicated and leads to loose bounds. In this pap… ▽ More Stochastic gradient Langevin dynamics (SGLD) is a fundamental algorithm in stochastic optimization. Recent work by Zhang et al. [2017] presents an analysis for the hitting time of SGLD for the first and second order stationary points. The proof in Zhang et al. [2017] is a two-stage procedure through bounding the Cheeger's constant, which is rather complicated and leads to loose bounds. In this paper, using intuitions from stochastic differential equations, we provide a direct analysis for the hitting times of SGLD to the first and second order stationary points. Our analysis is straightforward. It only relies on basic linear algebra and probability theory tools. Our direct analysis also leads to tighter bounds comparing to Zhang et al. [2017] and shows the explicit dependence of the hitting time on different factors, including dimensionality, smoothness, noise strength, and step size effects. Under suitable conditions, we show that the hitting time of SGLD to first-order stationary points can be dimension-independent. Moreover, we apply our analysis to study several important online estimation problems in machine learning, including linear regression, matrix factorization, and online PCA. △ Less

Submitted 15 March, 2020; v1 submitted 29 April, 2019; originally announced April 2019.

Comments: 41 pages

arXiv:1902.09904 [pdf]

doi 10.3389/fnins.2019.00509

Diagnosis of Alzheimer's Disease via Multi-modality 3D Convolutional Neural Network

Authors: Yechong Huang, Jiahang Xu, Yuncheng Zhou, Tong Tong, Xiahai Zhuang, the Alzheimer's Disease Neuroimaging Initiative

Abstract: Alzheimer's Disease (AD) is one of the most concerned neurodegenerative diseases. In the last decade, studies on AD diagnosis attached great significance to artificial intelligence (AI)-based diagnostic algorithms. Among the diverse modality imaging data, T1-weighted MRI and 18F-FDGPET are widely researched for this task. In this paper, we propose a novel convolutional neural network (CNN) to fuse… ▽ More Alzheimer's Disease (AD) is one of the most concerned neurodegenerative diseases. In the last decade, studies on AD diagnosis attached great significance to artificial intelligence (AI)-based diagnostic algorithms. Among the diverse modality imaging data, T1-weighted MRI and 18F-FDGPET are widely researched for this task. In this paper, we propose a novel convolutional neural network (CNN) to fuse the multi-modality information including T1-MRI and FDG-PDT images around the hippocampal area for the diagnosis of AD. Different from the traditional machine learning algorithms, this method does not require manually extracted features, and utilizes the stateof-art 3D image-processing CNNs to learn features for the diagnosis and prognosis of AD. To validate the performance of the proposed network, we trained the classifier with paired T1-MRI and FDG-PET images using the ADNI datasets, including 731 Normal (NL) subjects, 647 AD subjects, 441 stable MCI (sMCI) subjects and 326 progressive MCI (pMCI) subjects. We obtained the maximal accuracies of 90.10% for NL/AD task, 87.46% for NL/pMCI task, and 76.90% for sMCI/pMCI task. The proposed framework yields comparative results against state-of-the-art approaches. Moreover, the experimental results have demonstrated that (1) segmentation is not a prerequisite by using CNN, (2) the hippocampal area provides enough information to give a reference to AD diagnosis. Keywords: Alzheimer's Disease, Multi-modality, Image Classification, CNN, Deep Learning, Hippocampal △ Less

Submitted 26 February, 2019; originally announced February 2019.

Comments: 21 pages, 5 figures, 9 tables

Journal ref: https://www.frontiersin.org/articles/10.3389/fnins.2019.00509/full

arXiv:1902.04244 [pdf, other]

Enhancement Mask for Hippocampus Detection and Segmentation

Authors: Dengsheng Chen, Wenxi Liu, You Huang, Tong Tong, Yuanlong Yu

Abstract: Detection and segmentation of the hippocampal structures in volumetric brain images is a challenging problem in the area of medical imaging. In this paper, we propose a two-stage 3D fully convolutional neural network that efficiently detects and segments the hippocampal structures. In particular, our approach first localizes the hippocampus from the whole volumetric image while obtaining a proposa… ▽ More Detection and segmentation of the hippocampal structures in volumetric brain images is a challenging problem in the area of medical imaging. In this paper, we propose a two-stage 3D fully convolutional neural network that efficiently detects and segments the hippocampal structures. In particular, our approach first localizes the hippocampus from the whole volumetric image while obtaining a proposal for a rough segmentation. After localization, we apply the proposal as an enhancement mask to extract the fine structure of the hippocampus. The proposed method has been evaluated on a public dataset and compares with state-of-the-art approaches. Results indicate the effectiveness of the proposed method, which yields mean Dice Similarity Coefficients (i.e. DSC) of $0.897$ and $0.900$ for the left and right hippocampus, respectively. Furthermore, extensive experiments manifest that the proposed enhancement mask layer has remarkable benefits for accelerating training process and obtaining more accurate segmentation results. △ Less

Submitted 12 February, 2019; originally announced February 2019.

Comments: This paper has been published in the proceedings of IEEE International Conference on Information and Automation 2018

Journal ref: 2018 IEEE International Conference on Information and Automation (ICIA)

arXiv:1810.01641 [pdf, other]

PIRM Challenge on Perceptual Image Enhancement on Smartphones: Report

Authors: Andrey Ignatov, Radu Timofte, Thang Van Vu, Tung Minh Luu, Trung X Pham, Cao Van Nguyen, Yongwoo Kim, Jae-Seok Choi, Munchurl Kim, Jie Huang, Jiewen Ran, Chen Xing, Xingguang Zhou, Pengfei Zhu, Mingrui Geng, Yawei Li, Eirikur Agustsson, Shuhang Gu, Luc Van Gool, Etienne de Stoutz, Nikolay Kobyshev, Kehui Nie, Yan Zhao, Gen Li, Tong Tong , et al. (23 additional authors not shown)

Abstract: This paper reviews the first challenge on efficient perceptual image enhancement with the focus on deploying deep learning models on smartphones. The challenge consisted of two tracks. In the first one, participants were solving the classical image super-resolution problem with a bicubic downscaling factor of 4. The second track was aimed at real-world photo enhancement, and the goal was to map lo… ▽ More This paper reviews the first challenge on efficient perceptual image enhancement with the focus on deploying deep learning models on smartphones. The challenge consisted of two tracks. In the first one, participants were solving the classical image super-resolution problem with a bicubic downscaling factor of 4. The second track was aimed at real-world photo enhancement, and the goal was to map low-quality photos from the iPhone 3GS device to the same photos captured with a DSLR camera. The target metric used in this challenge combined the runtime, PSNR scores and solutions' perceptual results measured in the user study. To ensure the efficiency of the submitted models, we additionally measured their runtime and memory requirements on Android smartphones. The proposed solutions significantly improved baseline results defining the state-of-the-art for image enhancement on smartphones. △ Less

Submitted 3 October, 2018; originally announced October 2018.

arXiv:1710.02882 [pdf, other]

Supermajority Sentiment Detection with External Influence in Large Social Networks

Authors: Tian Tong, Rohit Negi

Abstract: In a large social network whose members harbor binary sentiments towards an issue, we investigate the asymptotic accuracy of sentiment detection. We model the user sentiments by an Ising Markov random field model and allow the user sentiments to be biased by an external influence. We consider a general supermajority sentiment detection problem and show that the detection accuracy is affected by th… ▽ More In a large social network whose members harbor binary sentiments towards an issue, we investigate the asymptotic accuracy of sentiment detection. We model the user sentiments by an Ising Markov random field model and allow the user sentiments to be biased by an external influence. We consider a general supermajority sentiment detection problem and show that the detection accuracy is affected by the network structure, its parameters, as well as the external influence level. △ Less

Submitted 8 October, 2017; originally announced October 2017.

Comments: 8 pages, 6 figures, 55th Annual Allerton Conference on Communication

arXiv:1611.04654 [pdf, other]

Asymptotic Performance Analysis of Majority Sentiment Detection in Online Social Networks

Authors: Tian Tong, Rohit Negi

Abstract: We analyze the problem of majority sentiment detection in Online Social Networks (OSN), and relate the detection error probability to the underlying graph of the OSN. Modeling the underlying social network as an Ising Markov random field prior based on a given graph, we show that in the case of the empty graph (independent sentiments) and the chain graph, the detection is always inaccurate, even w… ▽ More We analyze the problem of majority sentiment detection in Online Social Networks (OSN), and relate the detection error probability to the underlying graph of the OSN. Modeling the underlying social network as an Ising Markov random field prior based on a given graph, we show that in the case of the empty graph (independent sentiments) and the chain graph, the detection is always inaccurate, even when the number of users grow to infinity. In the case of the complete graph, the detection is inaccurate if the connection strength is below a certain critical value, while it is asymptotically accurate if the strength is above that critical value, which is analogous to the phase transition phenomenon in statistical physics. △ Less

Submitted 14 November, 2016; originally announced November 2016.

arXiv:1605.00029 [pdf, other]

Multi-Atlas Segmentation using Partially Annotated Data: Methods and Annotation Strategies

Authors: Lisa M. Koch, Martin Rajchl, Wenjia Bai, Christian F. Baumgartner, Tong Tong, Jonathan Passerat-Palmbach, Paul Aljabar, Daniel Rueckert

Abstract: Multi-atlas segmentation is a widely used tool in medical image analysis, providing robust and accurate results by learning from annotated atlas datasets. However, the availability of fully annotated atlas images for training is limited due to the time required for the labelling task. Segmentation methods requiring only a proportion of each atlas image to be labelled could therefore reduce the wor… ▽ More Multi-atlas segmentation is a widely used tool in medical image analysis, providing robust and accurate results by learning from annotated atlas datasets. However, the availability of fully annotated atlas images for training is limited due to the time required for the labelling task. Segmentation methods requiring only a proportion of each atlas image to be labelled could therefore reduce the workload on expert raters tasked with annotating atlas images. To address this issue, we first re-examine the labelling problem common in many existing approaches and formulate its solution in terms of a Markov Random Field energy minimisation problem on a graph connecting atlases and the target image. This provides a unifying framework for multi-atlas segmentation. We then show how modifications in the graph configuration of the proposed framework enable the use of partially annotated atlas images and investigate different partial annotation strategies. The proposed method was evaluated on two Magnetic Resonance Imaging (MRI) datasets for hippocampal and cardiac segmentation. Experiments were performed aimed at (1) recreating existing segmentation techniques with the proposed framework and (2) demonstrating the potential of employing sparsely annotated atlas data for multi-atlas segmentation. △ Less

Submitted 29 April, 2016; originally announced May 2016.

Showing 1–38 of 38 results for author: Tong, T