subscribe to arXiv mailings

GPC: Generative and General Pathology Image Classifier

Abstract: Deep learning has been increasingly incorporated into various computational pathology applications to improve its efficiency, accuracy, and robustness. Although successful, most previous approaches for image classification have crucial drawbacks. There exist numerous tasks in pathology, but one needs to build a model per task, i.e., a task-specific model, thereby increasing the number of models, t… ▽ More Deep learning has been increasingly incorporated into various computational pathology applications to improve its efficiency, accuracy, and robustness. Although successful, most previous approaches for image classification have crucial drawbacks. There exist numerous tasks in pathology, but one needs to build a model per task, i.e., a task-specific model, thereby increasing the number of models, training resources, and cost. Moreover, transferring arbitrary task-specific model to another task is still a challenging problem. Herein, we propose a task-agnostic generative and general pathology image classifier, so called GPC, that aims at learning from diverse kinds of pathology images and conducting numerous classification tasks in a unified model. GPC, equipped with a convolutional neural network and a Transformer-based language model, maps pathology images into a high-dimensional feature space and generates pertinent class labels as texts via the image-to-text classification mechanism. We evaluate GPC on six datasets for four different pathology image classification tasks. Experimental results show that GPC holds considerable potential for developing an effective and efficient universal model for pathology image analysis. △ Less

Submitted 12 July, 2024; originally announced July 2024.

Comments: MICCAI-MedAGI 2023 (Best Paper Honorable Mention)

arXiv:2407.09030 [pdf, other]

CAMP: Continuous and Adaptive Learning Model in Pathology

Authors: Anh Tien Nguyen, Keunho Byeon, Kyungeun Kim, Boram Song, Seoung Wan Chae, Jin Tae Kwak

Abstract: There exist numerous diagnostic tasks in pathology. Conventional computational pathology formulates and tackles them as independent and individual image classification problems, thereby resulting in computational inefficiency and high costs. To address the challenges, we propose a generic, unified, and universal framework, called a continuous and adaptive learning model in pathology (CAMP), for pa… ▽ More There exist numerous diagnostic tasks in pathology. Conventional computational pathology formulates and tackles them as independent and individual image classification problems, thereby resulting in computational inefficiency and high costs. To address the challenges, we propose a generic, unified, and universal framework, called a continuous and adaptive learning model in pathology (CAMP), for pathology image classification. CAMP is a generative, efficient, and adaptive classification model that can continuously adapt to any classification task by leveraging pathology-specific prior knowledge and learning taskspecific knowledge with minimal computational cost and without forgetting the knowledge from the existing tasks. We evaluated CAMP on 22 datasets, including 1,171,526 patches and 11,811 pathology slides, across 17 classification tasks. CAMP achieves state-of-theart classification performance on a wide range of datasets and tasks at both patch- and slide-levels and reduces up to 94% of computation time and 85% of storage memory in comparison to the conventional classification models. Our results demonstrate that CAMP can offer a fundamental transformation in pathology image classification, paving the way for the fully digitized and computerized pathology practice. △ Less

Submitted 12 July, 2024; originally announced July 2024.

Comments: Under review

arXiv:2407.07360 [pdf, other]

Towards a text-based quantitative and explainable histopathology image analysis

Authors: Anh Tien Nguyen, Trinh Thi Le Vuong, Jin Tae Kwak

Abstract: Recently, vision-language pre-trained models have emerged in computational pathology. Previous works generally focused on the alignment of image-text pairs via the contrastive pre-training paradigm. Such pre-trained models have been applied to pathology image classification in zero-shot learning or transfer learning fashion. Herein, we hypothesize that the pre-trained vision-language models can be… ▽ More Recently, vision-language pre-trained models have emerged in computational pathology. Previous works generally focused on the alignment of image-text pairs via the contrastive pre-training paradigm. Such pre-trained models have been applied to pathology image classification in zero-shot learning or transfer learning fashion. Herein, we hypothesize that the pre-trained vision-language models can be utilized for quantitative histopathology image analysis through a simple image-to-text retrieval. To this end, we propose a Text-based Quantitative and Explainable histopathology image analysis, which we call TQx. Given a set of histopathology images, we adopt a pre-trained vision-language model to retrieve a word-of-interest pool. The retrieved words are then used to quantify the histopathology images and generate understandable feature embeddings due to the direct mapping to the text description. To evaluate the proposed method, the text-based embeddings of four histopathology image datasets are utilized to perform clustering and classification tasks. The results demonstrate that TQx is able to quantify and analyze histopathology images that are comparable to the prevalent visual models in computational pathology. △ Less

Submitted 10 July, 2024; originally announced July 2024.

Comments: MICCAI 2024 - Early acceptance (Top 11%)

arXiv:2407.06581 [pdf, other]

Vision language models are blind

Authors: Pooyan Rahmanzadehgervi, Logan Bolton, Mohammad Reza Taesiri, Anh Totti Nguyen

Abstract: Large language models with vision capabilities (VLMs), e.g., GPT-4o and Gemini 1.5 Pro are powering countless image-text applications and scoring high on many vision-understanding benchmarks. We propose BlindTest, a suite of 7 visual tasks absurdly easy to humans such as identifying (a) whether two circles overlap; (b) whether two lines intersect; (c) which letter is being circled in a word; and (… ▽ More Large language models with vision capabilities (VLMs), e.g., GPT-4o and Gemini 1.5 Pro are powering countless image-text applications and scoring high on many vision-understanding benchmarks. We propose BlindTest, a suite of 7 visual tasks absurdly easy to humans such as identifying (a) whether two circles overlap; (b) whether two lines intersect; (c) which letter is being circled in a word; and (d) counting the number of circles in a Olympic-like logo. Surprisingly, four state-of-the-art VLMs are, on average, only 56.20% accurate on our benchmark, with \newsonnet being the best (73.77% accuracy). On BlindTest, VLMs struggle with tasks that requires precise spatial information and counting (from 0 to 10), sometimes providing an impression of a person with myopia seeing fine details as blurry and making educated guesses. Code is available at: https://vlmsareblind.github.io/ △ Less

Submitted 12 July, 2024; v1 submitted 9 July, 2024; originally announced July 2024.

arXiv:2403.05297 [pdf, other]

PEEB: Part-based Image Classifiers with an Explainable and Editable Language Bottleneck

Authors: Thang M. Pham, Peijie Chen, Tin Nguyen, Seunghyun Yoon, Trung Bui, Anh Totti Nguyen

Abstract: CLIP-based classifiers rely on the prompt containing a {class name} that is known to the text encoder. Therefore, they perform poorly on new classes or the classes whose names rarely appear on the Internet (e.g., scientific names of birds). For fine-grained classification, we propose PEEB - an explainable and editable classifier to (1) express the class name into a set of text descriptors that des… ▽ More CLIP-based classifiers rely on the prompt containing a {class name} that is known to the text encoder. Therefore, they perform poorly on new classes or the classes whose names rarely appear on the Internet (e.g., scientific names of birds). For fine-grained classification, we propose PEEB - an explainable and editable classifier to (1) express the class name into a set of text descriptors that describe the visual parts of that class; and (2) match the embeddings of the detected parts to their textual descriptors in each class to compute a logit score for classification. In a zero-shot setting where the class names are unknown, PEEB outperforms CLIP by a huge margin (~10x in top-1 accuracy). Compared to part-based classifiers, PEEB is not only the state-of-the-art (SOTA) on the supervised-learning setting (88.80% and 92.20% accuracy on CUB-200 and Dogs-120, respectively) but also the first to enable users to edit the text descriptors to form a new classifier without any re-training. Compared to concept bottleneck models, PEEB is also the SOTA in both zero-shot and supervised-learning settings. △ Less

Submitted 12 April, 2024; v1 submitted 8 March, 2024; originally announced March 2024.

Comments: Findings of NAACL 2024 (long paper)

arXiv:2311.06894 [pdf, other]

doi 10.1109/RIVF55975.2022.10013894

An Application of Vector Autoregressive Model for Analyzing the Impact of Weather And Nearby Traffic Flow On The Traffic Volume

Authors: Anh Thi-Hoang Nguyen, Dung Ha Nguyen, Trong-Hop Do

Abstract: This paper aims to predict the traffic flow at one road segment based on nearby traffic volume and weather conditions. Our team also discover the impact of weather conditions and nearby traffic volume on the traffic flow at a target point. The analysis results will help solve the problem of traffic flow prediction and develop an optimal transport network with efficient traffic movement and minimal… ▽ More This paper aims to predict the traffic flow at one road segment based on nearby traffic volume and weather conditions. Our team also discover the impact of weather conditions and nearby traffic volume on the traffic flow at a target point. The analysis results will help solve the problem of traffic flow prediction and develop an optimal transport network with efficient traffic movement and minimal traffic congestion. Hourly historical weather and traffic flow data are selected to solve this problem. This paper uses model VAR(36) with time trend and constant to train the dataset and forecast. With an RMSE of 565.0768111 on average, the model is considered appropriate although some statistical tests implies that the residuals are unstable and non-normal. Also, this paper points out some variables that are not useful in forecasting, which helps simplify the data-collecting process when building the forecasting system. △ Less

Submitted 12 November, 2023; originally announced November 2023.

Comments: International Conference on Computing and Communication Technologies (RIVF2022)

Report number: D1-2022-48

arXiv:2311.06851 [pdf, other]

Automatic Textual Normalization for Hate Speech Detection

Authors: Anh Thi-Hoang Nguyen, Dung Ha Nguyen, Nguyet Thi Nguyen, Khanh Thanh-Duy Ho, Kiet Van Nguyen

Abstract: Social media data is a valuable resource for research, yet it contains a wide range of non-standard words (NSW). These irregularities hinder the effective operation of NLP tools. Current state-of-the-art methods for the Vietnamese language address this issue as a problem of lexical normalization, involving the creation of manual rules or the implementation of multi-staged deep learning frameworks,… ▽ More Social media data is a valuable resource for research, yet it contains a wide range of non-standard words (NSW). These irregularities hinder the effective operation of NLP tools. Current state-of-the-art methods for the Vietnamese language address this issue as a problem of lexical normalization, involving the creation of manual rules or the implementation of multi-staged deep learning frameworks, which necessitate extensive efforts to craft intricate rules. In contrast, our approach is straightforward, employing solely a sequence-to-sequence (Seq2Seq) model. In this research, we provide a dataset for textual normalization, comprising 2,181 human-annotated comments with an inter-annotator agreement of 0.9014. By leveraging the Seq2Seq model for textual normalization, our results reveal that the accuracy achieved falls slightly short of 70%. Nevertheless, textual normalization enhances the accuracy of the Hate Speech Detection (HSD) task by approximately 2%, demonstrating its potential to improve the performance of complex NLP tasks. Our dataset is accessible for research purposes. △ Less

Submitted 4 December, 2023; v1 submitted 12 November, 2023; originally announced November 2023.

Comments: Accepted to present at 2023 International Conference on Intelligent Systems Design and Applications (ISDA2023)

arXiv:2311.02803 [pdf, other]

Fast and Interpretable Face Identification for Out-Of-Distribution Data Using Vision Transformers

Authors: Hai Phan, Cindy Le, Vu Le, Yihui He, Anh Totti Nguyen

Abstract: Most face identification approaches employ a Siamese neural network to compare two images at the image embedding level. Yet, this technique can be subject to occlusion (e.g. faces with masks or sunglasses) and out-of-distribution data. DeepFace-EMD (Phan et al. 2022) reaches state-of-the-art accuracy on out-of-distribution data by first comparing two images at the image level, and then at the patc… ▽ More Most face identification approaches employ a Siamese neural network to compare two images at the image embedding level. Yet, this technique can be subject to occlusion (e.g. faces with masks or sunglasses) and out-of-distribution data. DeepFace-EMD (Phan et al. 2022) reaches state-of-the-art accuracy on out-of-distribution data by first comparing two images at the image level, and then at the patch level. Yet, its later patch-wise re-ranking stage admits a large $O(n^3 \log n)$ time complexity (for $n$ patches in an image) due to the optimal transport optimization. In this paper, we propose a novel, 2-image Vision Transformers (ViTs) that compares two images at the patch level using cross-attention. After training on 2M pairs of images on CASIA Webface (Yi et al. 2014), our model performs at a comparable accuracy as DeepFace-EMD on out-of-distribution data, yet at an inference speed more than twice as fast as DeepFace-EMD (Phan et al. 2022). In addition, via a human study, our model shows promising explainability through the visualization of cross-attention. We believe our work can inspire more explorations in using ViTs for face identification. △ Less

Submitted 5 November, 2023; originally announced November 2023.

Comments: 20 pages, 15 Figures

arXiv:2310.07984 [pdf]

Large Language Models for Scientific Synthesis, Inference and Explanation

Authors: Yizhen Zheng, Huan Yee Koh, Jiaxin Ju, Anh T. N. Nguyen, Lauren T. May, Geoffrey I. Webb, Shirui Pan

Abstract: Large language models are a form of artificial intelligence systems whose primary knowledge consists of the statistical patterns, semantic relationships, and syntactical structures of language1. Despite their limited forms of "knowledge", these systems are adept at numerous complex tasks including creative writing, storytelling, translation, question-answering, summarization, and computer code gen… ▽ More Large language models are a form of artificial intelligence systems whose primary knowledge consists of the statistical patterns, semantic relationships, and syntactical structures of language1. Despite their limited forms of "knowledge", these systems are adept at numerous complex tasks including creative writing, storytelling, translation, question-answering, summarization, and computer code generation. However, they have yet to demonstrate advanced applications in natural science. Here we show how large language models can perform scientific synthesis, inference, and explanation. We present a method for using general-purpose large language models to make inferences from scientific datasets of the form usually associated with special-purpose machine learning algorithms. We show that the large language model can augment this "knowledge" by synthesizing from the scientific literature. When a conventional machine learning system is augmented with this synthesized and inferred knowledge it can outperform the current state of the art across a range of benchmark tasks for predicting molecular properties. This approach has the further advantage that the large language model can explain the machine learning system's predictions. We anticipate that our framework will open new avenues for AI to accelerate the pace of scientific discovery. △ Less

Submitted 11 October, 2023; originally announced October 2023.

Comments: Supplementary Information: https://drive.google.com/file/d/1KrpUpzuFTeMx6a6zl18lqdo8vV-UUa1Z/view?usp=sharing Github Repo: https://github.com/zyzisastudyreallyhardguy/LLM4SD

arXiv:2308.13651 [pdf, other]

PCNN: Probable-Class Nearest-Neighbor Explanations Improve Fine-Grained Image Classification Accuracy for AIs and Humans

Authors: Giang Nguyen, Valerie Chen, Mohammad Reza Taesiri, Anh Totti Nguyen

Abstract: Nearest neighbors (NN) are traditionally used to compute final decisions, e.g., in Support Vector Machines or k-NN classifiers, and to provide users with explanations for the model's decision. In this paper, we show a novel utility of nearest neighbors: To improve predictions of a frozen, pretrained classifier C. We leverage an image comparator S that (1) compares the input image with NN images fr… ▽ More Nearest neighbors (NN) are traditionally used to compute final decisions, e.g., in Support Vector Machines or k-NN classifiers, and to provide users with explanations for the model's decision. In this paper, we show a novel utility of nearest neighbors: To improve predictions of a frozen, pretrained classifier C. We leverage an image comparator S that (1) compares the input image with NN images from the top-K most probable classes; and (2) uses S's output scores to weight the confidence scores of C. Our method consistently improves fine-grained image classification accuracy on CUB-200, Cars-196, and Dogs-120. Also, a human study finds that showing lay users our probable-class nearest neighbors (PCNN) improves their decision accuracy over prior work which only shows only the top-1 class examples. △ Less

Submitted 23 April, 2024; v1 submitted 25 August, 2023; originally announced August 2023.

arXiv:2305.18458 [pdf, other]

Conditional Support Alignment for Domain Adaptation with Label Shift

Authors: Anh T Nguyen, Lam Tran, Anh Tong, Tuan-Duy H. Nguyen, Toan Tran

Abstract: Unsupervised domain adaptation (UDA) refers to a domain adaptation framework in which a learning model is trained based on the labeled samples on the source domain and unlabelled ones in the target domain. The dominant existing methods in the field that rely on the classical covariate shift assumption to learn domain-invariant feature representation have yielded suboptimal performance under the la… ▽ More Unsupervised domain adaptation (UDA) refers to a domain adaptation framework in which a learning model is trained based on the labeled samples on the source domain and unlabelled ones in the target domain. The dominant existing methods in the field that rely on the classical covariate shift assumption to learn domain-invariant feature representation have yielded suboptimal performance under the label distribution shift between source and target domains. In this paper, we propose a novel conditional adversarial support alignment (CASA) whose aim is to minimize the conditional symmetric support divergence between the source's and target domain's feature representation distributions, aiming at a more helpful representation for the classification task. We also introduce a novel theoretical target risk bound, which justifies the merits of aligning the supports of conditional feature distributions compared to the existing marginal support alignment approach in the UDA settings. We then provide a complete training process for learning in which the objective optimization functions are precisely based on the proposed target risk bound. Our empirical results demonstrate that CASA outperforms other state-of-the-art methods on different UDA benchmark tasks under label shift conditions. △ Less

Submitted 29 May, 2023; originally announced May 2023.

arXiv:2301.11799 [pdf]

Factors influencing to use of Bluezone

Authors: Vinh T. Nguyen, Anh T. Nguyen, Tan H. Nguyen, Dinh K. Luong

Abstract: This study aims to understand the main factors and their influence on the behavioral intention of users about using Bluezone. Surveys are sent to users through the Google Form tool. Experimental results through analysis of exploratory factors on 224 survey subjects show that there are 4 main factors affecting user behavior. Structural equation modeling indicates that trust, performance expectation… ▽ More This study aims to understand the main factors and their influence on the behavioral intention of users about using Bluezone. Surveys are sent to users through the Google Form tool. Experimental results through analysis of exploratory factors on 224 survey subjects show that there are 4 main factors affecting user behavior. Structural equation modeling indicates that trust, performance expectations, effort expectations, and social influence have a positive impact on behavioral intention of using Bluezone △ Less

Submitted 24 January, 2023; originally announced January 2023.

Comments: in Vietnamese language

arXiv:2209.03148 [pdf, other]

Improving Out-of-Distribution Detection via Epistemic Uncertainty Adversarial Training

Authors: Derek Everett, Andre T. Nguyen, Luke E. Richards, Edward Raff

Abstract: The quantification of uncertainty is important for the adoption of machine learning, especially to reject out-of-distribution (OOD) data back to human experts for review. Yet progress has been slow, as a balance must be struck between computational efficiency and the quality of uncertainty estimates. For this reason many use deep ensembles of neural networks or Monte Carlo dropout for reasonable u… ▽ More The quantification of uncertainty is important for the adoption of machine learning, especially to reject out-of-distribution (OOD) data back to human experts for review. Yet progress has been slow, as a balance must be struck between computational efficiency and the quality of uncertainty estimates. For this reason many use deep ensembles of neural networks or Monte Carlo dropout for reasonable uncertainty estimates at relatively minimal compute and memory. Surprisingly, when we focus on the real-world applicable constraint of $\leq 1\%$ false positive rate (FPR), prior methods fail to reliably detect OOD samples as such. Notably, even Gaussian random noise fails to trigger these popular OOD techniques. We help to alleviate this problem by devising a simple adversarial training scheme that incorporates an attack of the epistemic uncertainty predicted by the dropout ensemble. We demonstrate this method improves OOD detection performance on standard data (i.e., not adversarially crafted), and improves the standardized partial AUC from near-random guessing performance to $\geq 0.75$. △ Less

Submitted 9 September, 2022; v1 submitted 5 September, 2022; originally announced September 2022.

Comments: 8 pages, 5 figures

arXiv:2208.05433 [pdf, other]

doi 10.1371/journal.pone.0277081

Detecting COVID-19 from digitized ECG printouts using 1D convolutional neural networks

Authors: Thao Nguyen, Hieu H. Pham, Huy Khiem Le, Anh Tu Nguyen, Ngoc Tien Thanh, Cuong Do

Abstract: The COVID-19 pandemic has exposed the vulnerability of healthcare services worldwide, raising the need to develop novel tools to provide rapid and cost-effective screening and diagnosis. Clinical reports indicated that COVID-19 infection may cause cardiac injury, and electrocardiograms (ECG) may serve as a diagnostic biomarker for COVID-19. This study aims to utilize ECG signals to detect COVID-19… ▽ More The COVID-19 pandemic has exposed the vulnerability of healthcare services worldwide, raising the need to develop novel tools to provide rapid and cost-effective screening and diagnosis. Clinical reports indicated that COVID-19 infection may cause cardiac injury, and electrocardiograms (ECG) may serve as a diagnostic biomarker for COVID-19. This study aims to utilize ECG signals to detect COVID-19 automatically. We propose a novel method to extract ECG signals from ECG paper records, which are then fed into a one-dimensional convolution neural network (1D-CNN) to learn and diagnose the disease. To evaluate the quality of digitized signals, R peaks in the paper-based ECG images are labeled. Afterward, RR intervals calculated from each image are compared to RR intervals of the corresponding digitized signal. Experiments on the COVID-19 ECG images dataset demonstrate that the proposed digitization method is able to capture correctly the original signals, with a mean absolute error of 28.11 ms. Our proposed 1D-CNN model, which is trained on the digitized ECG signals, allows identifying individuals with COVID-19 and other subjects accurately, with classification accuracies of 98.42%, 95.63%, and 98.50% for classifying COVID-19 vs. Normal, COVID-19 vs. Abnormal Heartbeats, and COVID-19 vs. other classes, respectively. Furthermore, the proposed method also achieves a high-level of performance for the multi-classification task. Our findings indicate that a deep learning system trained on digitized ECG signals can serve as a potential tool for diagnosing COVID-19. △ Less

Submitted 5 October, 2022; v1 submitted 10 August, 2022; originally announced August 2022.

Comments: Accepted with minor revision by Plos One

arXiv:2206.00524 [pdf, other]

Vietnamese Hate and Offensive Detection using PhoBERT-CNN and Social Media Streaming Data

Authors: Khanh Q. Tran, An T. Nguyen, Phu Gia Hoang, Canh Duc Luu, Trong-Hop Do, Kiet Van Nguyen

Abstract: Society needs to develop a system to detect hate and offense to build a healthy and safe environment. However, current research in this field still faces four major shortcomings, including deficient pre-processing techniques, indifference to data imbalance issues, modest performance models, and lacking practical applications. This paper focused on developing an intelligent system capable of addres… ▽ More Society needs to develop a system to detect hate and offense to build a healthy and safe environment. However, current research in this field still faces four major shortcomings, including deficient pre-processing techniques, indifference to data imbalance issues, modest performance models, and lacking practical applications. This paper focused on developing an intelligent system capable of addressing these shortcomings. Firstly, we proposed an efficient pre-processing technique to clean comments collected from Vietnamese social media. Secondly, a novel hate speech detection (HSD) model, which is the combination of a pre-trained PhoBERT model and a Text-CNN model, was proposed for solving tasks in Vietnamese. Thirdly, EDA techniques are applied to deal with imbalanced data to improve the performance of classification models. Besides, various experiments were conducted as baselines to compare and investigate the proposed model's performance against state-of-the-art methods. The experiment results show that the proposed PhoBERT-CNN model outperforms SOTA methods and achieves an F1-score of 67,46% and 98,45% on two benchmark datasets, ViHSD and HSD-VLSP, respectively. Finally, we also built a streaming HSD application to demonstrate the practicality of our proposed system. △ Less

Submitted 1 June, 2022; originally announced June 2022.

arXiv:2205.01039 [pdf]

Big Tech Companies Impact on Research at the Faculty of Information Technology and Electrical Engineering

Authors: Ahmad Hassanpour, An Thi Nguyen, Anshul Rani, Sarang Shaikh, Ying Xu, Haoyu Zhang

Abstract: Artificial intelligence is gaining momentum, ongoing pandemic is fuel to that with more opportunities in every sector specially in health and education sector. But with the growth in technology, challenges associated with ethics also grow (Katharine Schwab, 2021). Whenever a new AI product is developed, companies publicize that their systems are transparent, fair, and are in accordance with the ex… ▽ More Artificial intelligence is gaining momentum, ongoing pandemic is fuel to that with more opportunities in every sector specially in health and education sector. But with the growth in technology, challenges associated with ethics also grow (Katharine Schwab, 2021). Whenever a new AI product is developed, companies publicize that their systems are transparent, fair, and are in accordance with the existing laws and regulations as the methods and procedures followed by a big tech company for ensuring AI ethics, not only affect the trust and perception of public, but it also challenges the capabilities of the companies towards business strategies in different regions, and the kind of brains it can attract for their projects. AI Big Tech companies have influence over AI ethics as many influencing ethical-AI researchers have roots in Big Tech or its associated labs. △ Less

Submitted 10 April, 2022; originally announced May 2022.

arXiv:2203.08648 [pdf, other]

Artificial Intelligence Enables Real-Time and Intuitive Control of Prostheses via Nerve Interface

Authors: Diu Khue Luu, Anh Tuan Nguyen, Ming Jiang, Markus W. Drealan, Jian Xu, Tong Wu, Wing-kin Tam, Wenfeng Zhao, Brian Z. H. Lim, Cynthia K. Overstreet, Qi Zhao, Jonathan Cheng, Edward W. Keefer, Zhi Yang

Abstract: Objective: The next generation prosthetic hand that moves and feels like a real hand requires a robust neural interconnection between the human minds and machines. Methods: Here we present a neuroprosthetic system to demonstrate that principle by employing an artificial intelligence (AI) agent to translate the amputee's movement intent through a peripheral nerve interface. The AI agent is designed… ▽ More Objective: The next generation prosthetic hand that moves and feels like a real hand requires a robust neural interconnection between the human minds and machines. Methods: Here we present a neuroprosthetic system to demonstrate that principle by employing an artificial intelligence (AI) agent to translate the amputee's movement intent through a peripheral nerve interface. The AI agent is designed based on the recurrent neural network (RNN) and could simultaneously decode six degree-of-freedom (DOF) from multichannel nerve data in real-time. The decoder's performance is characterized in motor decoding experiments with three human amputees. Results: First, we show the AI agent enables amputees to intuitively control a prosthetic hand with individual finger and wrist movements up to 97-98% accuracy. Second, we demonstrate the AI agent's real-time performance by measuring the reaction time and information throughput in a hand gesture matching task. Third, we investigate the AI agent's long-term uses and show the decoder's robust predictive performance over a 16-month implant duration. Conclusion & significance: Our study demonstrates the potential of AI-enabled nerve technology, underling the next generation of dexterous and intuitive prosthetic hands. △ Less

Submitted 16 March, 2022; originally announced March 2022.

arXiv:2203.07596 [pdf, other]

Task-Agnostic Robust Representation Learning

Authors: A. Tuan Nguyen, Ser Nam Lim, Philip Torr

Abstract: It has been reported that deep learning models are extremely vulnerable to small but intentionally chosen perturbations of its input. In particular, a deep network, despite its near-optimal accuracy on the clean images, often mis-classifies an image with a worst-case but humanly imperceptible perturbation (so-called adversarial examples). To tackle this problem, a great amount of research has been… ▽ More It has been reported that deep learning models are extremely vulnerable to small but intentionally chosen perturbations of its input. In particular, a deep network, despite its near-optimal accuracy on the clean images, often mis-classifies an image with a worst-case but humanly imperceptible perturbation (so-called adversarial examples). To tackle this problem, a great amount of research has been done to study the training procedure of a network to improve its robustness. However, most of the research so far has focused on the case of supervised learning. With the increasing popularity of self-supervised learning methods, it is also important to study and improve the robustness of their resulting representation on the downstream tasks. In this paper, we study the problem of robust representation learning with unlabeled data in a task-agnostic manner. Specifically, we first derive an upper bound on the adversarial loss of a prediction model (which is based on the learned representation) on any downstream task, using its loss on the clean data and a robustness regularizer. Moreover, the regularizer is task-independent, thus we propose to minimize it directly during the representation learning phase to make the downstream prediction model more robust. Extensive experiments show that our method achieves preferable adversarial performance compared to relevant baselines. △ Less

Submitted 14 March, 2022; originally announced March 2022.

arXiv:2202.08985 [pdf, ps, other]

Out of Distribution Data Detection Using Dropout Bayesian Neural Networks

Authors: Andre T. Nguyen, Fred Lu, Gary Lopez Munoz, Edward Raff, Charles Nicholas, James Holt

Abstract: We explore the utility of information contained within a dropout based Bayesian neural network (BNN) for the task of detecting out of distribution (OOD) data. We first show how previous attempts to leverage the randomized embeddings induced by the intermediate layers of a dropout BNN can fail due to the distance metric used. We introduce an alternative approach to measuring embedding uncertainty,… ▽ More We explore the utility of information contained within a dropout based Bayesian neural network (BNN) for the task of detecting out of distribution (OOD) data. We first show how previous attempts to leverage the randomized embeddings induced by the intermediate layers of a dropout BNN can fail due to the distance metric used. We introduce an alternative approach to measuring embedding uncertainty, justify its use theoretically, and demonstrate how incorporating embedding uncertainty improves OOD data identification across three tasks: image classification, language classification, and malware detection. △ Less

Submitted 17 February, 2022; originally announced February 2022.

arXiv:2111.13807 [pdf, other]

Offline Neural Contextual Bandits: Pessimism, Optimization and Generalization

Authors: Thanh Nguyen-Tang, Sunil Gupta, A. Tuan Nguyen, Svetha Venkatesh

Abstract: Offline policy learning (OPL) leverages existing data collected a priori for policy optimization without any active exploration. Despite the prevalence and recent interest in this problem, its theoretical and algorithmic foundations in function approximation settings remain under-developed. In this paper, we consider this problem on the axes of distributional shift, optimization, and generalizatio… ▽ More Offline policy learning (OPL) leverages existing data collected a priori for policy optimization without any active exploration. Despite the prevalence and recent interest in this problem, its theoretical and algorithmic foundations in function approximation settings remain under-developed. In this paper, we consider this problem on the axes of distributional shift, optimization, and generalization in offline contextual bandits with neural networks. In particular, we propose a provably efficient offline contextual bandit with neural network function approximation that does not require any functional assumption on the reward. We show that our method provably generalizes over unseen contexts under a milder condition for distributional shift than the existing OPL works. Notably, unlike any other OPL method, our method learns from the offline data in an online manner using stochastic gradient descent, allowing us to leverage the benefits of online learning into an offline setting. Moreover, we show that our method is more computationally efficient and has a better dependence on the effective dimension of the neural network than an online counterpart. Finally, we demonstrate the empirical effectiveness of our method in a range of synthetic and real-world OPL problems. △ Less

Submitted 13 March, 2022; v1 submitted 26 November, 2021; originally announced November 2021.

Comments: A full version at ICLR'22; a preliminary version at Offline RL Workshop at NeurIPS'21; code: https://github.com/thanhnguyentang/offline_neural_bandits

Journal ref: ICLR 2022

arXiv:2108.04081 [pdf, other]

Leveraging Uncertainty for Improved Static Malware Detection Under Extreme False Positive Constraints

Authors: Andre T. Nguyen, Edward Raff, Charles Nicholas, James Holt

Abstract: The detection of malware is a critical task for the protection of computing environments. This task often requires extremely low false positive rates (FPR) of 0.01% or even lower, for which modern machine learning has no readily available tools. We introduce the first broad investigation of the use of uncertainty for malware detection across multiple datasets, models, and feature types. We show ho… ▽ More The detection of malware is a critical task for the protection of computing environments. This task often requires extremely low false positive rates (FPR) of 0.01% or even lower, for which modern machine learning has no readily available tools. We introduce the first broad investigation of the use of uncertainty for malware detection across multiple datasets, models, and feature types. We show how ensembling and Bayesian treatments of machine learning methods for static malware detection allow for improved identification of model errors, uncovering of new malware families, and predictive performance under extreme false positive constraints. In particular, we improve the true positive rate (TPR) at an actual realized FPR of 1e-5 from an expected 0.69 for previous methods to 0.80 on the best performing model class on the Sophos industry scale dataset. We additionally demonstrate how previous works have used an evaluation protocol that can lead to misleading results. △ Less

Submitted 9 August, 2021; originally announced August 2021.

Report number: IJCAI-ACD/2021/102

arXiv:2106.07780 [pdf, other]

KL Guided Domain Adaptation

Authors: A. Tuan Nguyen, Toan Tran, Yarin Gal, Philip H. S. Torr, Atılım Güneş Baydin

Abstract: Domain adaptation is an important problem and often needed for real-world applications. In this problem, instead of i.i.d. training and testing datapoints, we assume that the source (training) data and the target (testing) data have different distributions. With that setting, the empirical risk minimization training procedure often does not perform well, since it does not account for the change in… ▽ More Domain adaptation is an important problem and often needed for real-world applications. In this problem, instead of i.i.d. training and testing datapoints, we assume that the source (training) data and the target (testing) data have different distributions. With that setting, the empirical risk minimization training procedure often does not perform well, since it does not account for the change in the distribution. A common approach in the domain adaptation literature is to learn a representation of the input that has the same (marginal) distribution over the source and the target domain. However, these approaches often require additional networks and/or optimizing an adversarial (minimax) objective, which can be very expensive or unstable in practice. To improve upon these marginal alignment techniques, in this paper, we first derive a generalization bound for the target loss based on the training loss and the reverse Kullback-Leibler (KL) divergence between the source and the target representation distributions. Based on this bound, we derive an algorithm that minimizes the KL term to obtain a better generalization to the target domain. We show that with a probabilistic representation network, the KL term can be estimated efficiently via minibatch samples without any additional network or a minimax objective. This leads to a theoretically sound alignment method which is also very efficient and stable in practice. Experimental results also suggest that our method outperforms other representation-alignment approaches. △ Less

Submitted 14 March, 2022; v1 submitted 14 June, 2021; originally announced June 2021.

Comments: Accepted to ICLR2022

arXiv:2103.13452 [pdf, other]

doi 10.1088/1741-2552/ac2a8d

A Portable, Self-Contained Neuroprosthetic Hand with Deep Learning-Based Finger Control

Authors: Anh Tuan Nguyen, Markus W. Drealan, Diu Khue Luu, Ming Jiang, Jian Xu, Jonathan Cheng, Qi Zhao, Edward W. Keefer, Zhi Yang

Abstract: Objective: Deep learning-based neural decoders have emerged as the prominent approach to enable dexterous and intuitive control of neuroprosthetic hands. Yet few studies have materialized the use of deep learning in clinical settings due to its high computational requirements. Methods: Recent advancements of edge computing devices bring the potential to alleviate this problem. Here we present the… ▽ More Objective: Deep learning-based neural decoders have emerged as the prominent approach to enable dexterous and intuitive control of neuroprosthetic hands. Yet few studies have materialized the use of deep learning in clinical settings due to its high computational requirements. Methods: Recent advancements of edge computing devices bring the potential to alleviate this problem. Here we present the implementation of a neuroprosthetic hand with embedded deep learning-based control. The neural decoder is designed based on the recurrent neural network (RNN) architecture and deployed on the NVIDIA Jetson Nano - a compacted yet powerful edge computing platform for deep learning inference. This enables the implementation of the neuroprosthetic hand as a portable and self-contained unit with real-time control of individual finger movements. Results: The proposed system is evaluated on a transradial amputee using peripheral nerve signals (ENG) with implanted intrafascicular microelectrodes. The experiment results demonstrate the system's capabilities of providing robust, high-accuracy (95-99%) and low-latency (50-120 msec) control of individual finger movements in various laboratory and real-world environments. Conclusion: Modern edge computing platforms enable the effective use of deep learning-based neural decoders for neuroprosthesis control as an autonomous system. Significance: This work helps pioneer the deployment of deep neural networks in clinical applications underlying a new class of wearable biomedical devices with embedded artificial intelligence. △ Less

Submitted 24 March, 2021; originally announced March 2021.

Journal ref: Journal of Neural Engineering 18 (2021) 056051

arXiv:2102.05082 [pdf, other]

Domain Invariant Representation Learning with Domain Density Transformations

Authors: A. Tuan Nguyen, Toan Tran, Yarin Gal, Atılım Güneş Baydin

Abstract: Domain generalization refers to the problem where we aim to train a model on data from a set of source domains so that the model can generalize to unseen target domains. Naively training a model on the aggregate set of data (pooled from all source domains) has been shown to perform suboptimally, since the information learned by that model might be domain-specific and generalize imperfectly to targ… ▽ More Domain generalization refers to the problem where we aim to train a model on data from a set of source domains so that the model can generalize to unseen target domains. Naively training a model on the aggregate set of data (pooled from all source domains) has been shown to perform suboptimally, since the information learned by that model might be domain-specific and generalize imperfectly to target domains. To tackle this problem, a predominant approach is to find and learn some domain-invariant information in order to use it for the prediction task. In this paper, we propose a theoretically grounded method to learn a domain-invariant representation by enforcing the representation network to be invariant under all transformation functions among domains. We also show how to use generative adversarial networks to learn such domain transformations to implement our method in practice. We demonstrate the effectiveness of our method on several widely used datasets for the domain generalization problem, on all of which we achieve competitive results with state-of-the-art models. △ Less

Submitted 15 February, 2022; v1 submitted 9 February, 2021; originally announced February 2021.

Comments: NeurIPS 2021

arXiv:2101.11294 [pdf, other]

Improved algorithms for non-adaptive group testing with consecutive positives

Authors: Thach V. Bui, Mahdi Cheraghchi, An T. H. Nguyen, Thuc D. Nguyen

Abstract: The goal of group testing is to efficiently identify a few specific items, called positives, in a large population of items via tests. A test is an action on a subset of items which returns positive if the subset contains at least one positive and negative otherwise. In non-adaptive group testing, all tests are fixed in advance and can be performed in parallel. In this work, we consider non-adapti… ▽ More The goal of group testing is to efficiently identify a few specific items, called positives, in a large population of items via tests. A test is an action on a subset of items which returns positive if the subset contains at least one positive and negative otherwise. In non-adaptive group testing, all tests are fixed in advance and can be performed in parallel. In this work, we consider non-adaptive group testing with consecutive positives in which the items are linearly ordered and the positives are consecutive in that order. We present two contributions here. The first is the direct use of a binary code to construct measurement matrices compared to the use of Gray code in the state-of-the-art work, which is a rearrangement of the binary code, when the maximum number of consecutive positives is known. This leads to a reduction in decoding time in practice. The second one is efficient designs to identify positives when the number of consecutive positives is known. To the best of our knowledge, this setting has not been surveyed yet. Our simulations verify the efficiency of our proposed designs. In particular, it only requires up to $300$ tests to identify up to $100$ positives in a set of $2^{32} \approx 4.3\mathrm{B}$ items in less than $300$ nanoseconds. When the maximum number of consecutive positives is known, the simulations validate the superiority of our proposed design in decoding compared to the state-of-the-art work. Moreover, when the number of consecutive positives is known, the number of tests and the decoding time are almost reduced half. △ Less

Submitted 5 November, 2021; v1 submitted 27 January, 2021; originally announced January 2021.

arXiv:2010.01891 [pdf, other]

A Pilot Study of Text-to-SQL Semantic Parsing for Vietnamese

Authors: Anh Tuan Nguyen, Mai Hoang Dao, Dat Quoc Nguyen

Abstract: Semantic parsing is an important NLP task. However, Vietnamese is a low-resource language in this research area. In this paper, we present the first public large-scale Text-to-SQL semantic parsing dataset for Vietnamese. We extend and evaluate two strong semantic parsing baselines EditSQL (Zhang et al., 2019) and IRNet (Guo et al., 2019) on our dataset. We compare the two baselines with key config… ▽ More Semantic parsing is an important NLP task. However, Vietnamese is a low-resource language in this research area. In this paper, we present the first public large-scale Text-to-SQL semantic parsing dataset for Vietnamese. We extend and evaluate two strong semantic parsing baselines EditSQL (Zhang et al., 2019) and IRNet (Guo et al., 2019) on our dataset. We compare the two baselines with key configurations and find that: automatic Vietnamese word segmentation improves the parsing results of both baselines; the normalized pointwise mutual information (NPMI) score (Bouma, 2009) is useful for schema linking; latent syntactic features extracted from a neural dependency parser for Vietnamese also improve the results; and the monolingual language model PhoBERT for Vietnamese (Nguyen and Nguyen, 2020) helps produce higher performances than the recent best multilingual language model XLM-R (Conneau et al., 2020). △ Less

Submitted 5 October, 2020; originally announced October 2020.

Comments: EMNLP 2020 (Findings)

arXiv:2009.05147 [pdf, other]

Practical Cross-modal Manifold Alignment for Grounded Language

Authors: Andre T. Nguyen, Luke E. Richards, Gaoussou Youssouf Kebe, Edward Raff, Kasra Darvish, Frank Ferraro, Cynthia Matuszek

Abstract: We propose a cross-modality manifold alignment procedure that leverages triplet loss to jointly learn consistent, multi-modal embeddings of language-based concepts of real-world items. Our approach learns these embeddings by sampling triples of anchor, positive, and negative data points from RGB-depth images and their natural language descriptions. We show that our approach can benefit from, but d… ▽ More We propose a cross-modality manifold alignment procedure that leverages triplet loss to jointly learn consistent, multi-modal embeddings of language-based concepts of real-world items. Our approach learns these embeddings by sampling triples of anchor, positive, and negative data points from RGB-depth images and their natural language descriptions. We show that our approach can benefit from, but does not require, post-processing steps such as Procrustes analysis, in contrast to some of our baselines which require it for reasonable performance. We demonstrate the effectiveness of our approach on two datasets commonly used to develop robotic-based grounded language learning systems, where our approach outperforms four baselines, including a state-of-the-art approach, across five evaluation metrics. △ Less

Submitted 1 September, 2020; originally announced September 2020.

arXiv:2008.12854 [pdf, other]

TATL at W-NUT 2020 Task 2: A Transformer-based Baseline System for Identification of Informative COVID-19 English Tweets

Authors: Anh Tuan Nguyen

Abstract: As the COVID-19 outbreak continues to spread throughout the world, more and more information about the pandemic has been shared publicly on social media. For example, there are a huge number of COVID-19 English Tweets daily on Twitter. However, the majority of those Tweets are uninformative, and hence it is important to be able to automatically select only the informative ones for downstream appli… ▽ More As the COVID-19 outbreak continues to spread throughout the world, more and more information about the pandemic has been shared publicly on social media. For example, there are a huge number of COVID-19 English Tweets daily on Twitter. However, the majority of those Tweets are uninformative, and hence it is important to be able to automatically select only the informative ones for downstream applications. In this short paper, we present our participation in the W-NUT 2020 Shared Task 2: Identification of Informative COVID-19 English Tweets. Inspired by the recent advances in pretrained Transformer language models, we propose a simple yet effective baseline for the task. Despite its simplicity, our proposed approach shows very competitive results in the leaderboard as we ranked 8 over 56 teams participated in total. △ Less

Submitted 28 August, 2020; originally announced August 2020.

arXiv:2006.14222 [pdf, other]

Set Based Stochastic Subsampling

Authors: Bruno Andreis, Seanie Lee, A. Tuan Nguyen, Juho Lee, Eunho Yang, Sung Ju Hwang

Abstract: Deep models are designed to operate on huge volumes of high dimensional data such as images. In order to reduce the volume of data these models must process, we propose a set-based two-stage end-to-end neural subsampling model that is jointly optimized with an \textit{arbitrary} downstream task network (e.g. classifier). In the first stage, we efficiently subsample \textit{candidate elements} usin… ▽ More Deep models are designed to operate on huge volumes of high dimensional data such as images. In order to reduce the volume of data these models must process, we propose a set-based two-stage end-to-end neural subsampling model that is jointly optimized with an \textit{arbitrary} downstream task network (e.g. classifier). In the first stage, we efficiently subsample \textit{candidate elements} using conditionally independent Bernoulli random variables by capturing coarse grained global information using set encoding functions, followed by conditionally dependent autoregressive subsampling of the candidate elements using Categorical random variables by modeling pair-wise interactions using set attention networks in the second stage. We apply our method to feature and instance selection and show that it outperforms the relevant baselines under low subsampling rates on a variety of tasks including image classification, image reconstruction, function reconstruction and few-shot classification. Additionally, for nonparametric models such as Neural Processes that require to leverage the whole training data at inference time, we show that our method enhances the scalability of these models. △ Less

Submitted 30 May, 2022; v1 submitted 25 June, 2020; originally announced June 2020.

Comments: 20 pages

arXiv:2006.12777 [pdf, other]

Clinical Risk Prediction with Temporal Probabilistic Asymmetric Multi-Task Learning

Authors: A. Tuan Nguyen, Hyewon Jeong, Eunho Yang, Sung Ju Hwang

Abstract: Although recent multi-task learning methods have shown to be effective in improving the generalization of deep neural networks, they should be used with caution for safety-critical applications, such as clinical risk prediction. This is because even if they achieve improved task-average performance, they may still yield degraded performance on individual tasks, which may be critical (e.g., predict… ▽ More Although recent multi-task learning methods have shown to be effective in improving the generalization of deep neural networks, they should be used with caution for safety-critical applications, such as clinical risk prediction. This is because even if they achieve improved task-average performance, they may still yield degraded performance on individual tasks, which may be critical (e.g., prediction of mortality risk). Existing asymmetric multi-task learning methods tackle this negative transfer problem by performing knowledge transfer from tasks with low loss to tasks with high loss. However, using loss as a measure of reliability is risky since it could be a result of overfitting. In the case of time-series prediction tasks, knowledge learned for one task (e.g., predicting the sepsis onset) at a specific timestep may be useful for learning another task (e.g., prediction of mortality) at a later timestep, but lack of loss at each timestep makes it difficult to measure the reliability at each timestep. To capture such dynamically changing asymmetric relationships between tasks in time-series data, we propose a novel temporal asymmetric multi-task learning model that performs knowledge transfer from certain tasks/timesteps to relevant uncertain tasks, based on feature-level uncertainty. We validate our model on multiple clinical risk prediction tasks against various deep learning models for time-series prediction, which our model significantly outperforms, without any sign of negative transfer. Further qualitative analysis of learned knowledge graphs by clinicians shows that they are helpful in analyzing the predictions of the model. Our final code is available at https://github.com/anhtuan5696/TPAMTL. △ Less

Submitted 18 February, 2021; v1 submitted 23 June, 2020; originally announced June 2020.

Comments: AAAI 2021. The first two authors contributed equally to this work. 10 pages, 4 figures, 4 tables

arXiv:2005.10200 [pdf, other]

BERTweet: A pre-trained language model for English Tweets

Authors: Dat Quoc Nguyen, Thanh Vu, Anh Tuan Nguyen

Abstract: We present BERTweet, the first public large-scale pre-trained language model for English Tweets. Our BERTweet, having the same architecture as BERT-base (Devlin et al., 2019), is trained using the RoBERTa pre-training procedure (Liu et al., 2019). Experiments show that BERTweet outperforms strong baselines RoBERTa-base and XLM-R-base (Conneau et al., 2020), producing better performance results tha… ▽ More We present BERTweet, the first public large-scale pre-trained language model for English Tweets. Our BERTweet, having the same architecture as BERT-base (Devlin et al., 2019), is trained using the RoBERTa pre-training procedure (Liu et al., 2019). Experiments show that BERTweet outperforms strong baselines RoBERTa-base and XLM-R-base (Conneau et al., 2020), producing better performance results than the previous state-of-the-art models on three Tweet NLP tasks: Part-of-speech tagging, Named-entity recognition and text classification. We release BERTweet under the MIT License to facilitate future research and applications on Tweet data. Our BERTweet is available at https://github.com/VinAIResearch/BERTweet △ Less

Submitted 5 October, 2020; v1 submitted 20 May, 2020; originally announced May 2020.

Comments: In Proceedings of EMNLP 2020: System Demonstrations

arXiv:2003.00744 [pdf, other]

PhoBERT: Pre-trained language models for Vietnamese

Authors: Dat Quoc Nguyen, Anh Tuan Nguyen

Abstract: We present PhoBERT with two versions, PhoBERT-base and PhoBERT-large, the first public large-scale monolingual language models pre-trained for Vietnamese. Experimental results show that PhoBERT consistently outperforms the recent best pre-trained multilingual model XLM-R (Conneau et al., 2020) and improves the state-of-the-art in multiple Vietnamese-specific NLP tasks including Part-of-speech tagg… ▽ More We present PhoBERT with two versions, PhoBERT-base and PhoBERT-large, the first public large-scale monolingual language models pre-trained for Vietnamese. Experimental results show that PhoBERT consistently outperforms the recent best pre-trained multilingual model XLM-R (Conneau et al., 2020) and improves the state-of-the-art in multiple Vietnamese-specific NLP tasks including Part-of-speech tagging, Dependency parsing, Named-entity recognition and Natural language inference. We release PhoBERT to facilitate future research and downstream applications for Vietnamese NLP. Our PhoBERT models are available at https://github.com/VinAIResearch/PhoBERT △ Less

Submitted 5 October, 2020; v1 submitted 2 March, 2020; originally announced March 2020.

Comments: EMNLP 2020 (Findings)

arXiv:1911.04636 [pdf, other]

Robust Design of Deep Neural Networks against Adversarial Attacks based on Lyapunov Theory

Authors: Arash Rahnama, Andre T. Nguyen, Edward Raff

Abstract: Deep neural networks (DNNs) are vulnerable to subtle adversarial perturbations applied to the input. These adversarial perturbations, though imperceptible, can easily mislead the DNN. In this work, we take a control theoretic approach to the problem of robustness in DNNs. We treat each individual layer of the DNN as a nonlinear dynamical system and use Lyapunov theory to prove stability and robust… ▽ More Deep neural networks (DNNs) are vulnerable to subtle adversarial perturbations applied to the input. These adversarial perturbations, though imperceptible, can easily mislead the DNN. In this work, we take a control theoretic approach to the problem of robustness in DNNs. We treat each individual layer of the DNN as a nonlinear dynamical system and use Lyapunov theory to prove stability and robustness locally. We then proceed to prove stability and robustness globally for the entire DNN. We develop empirically tight bounds on the response of the output layer, or any hidden layer, to adversarial perturbations added to the input, or the input of hidden layers. Recent works have proposed spectral norm regularization as a solution for improving robustness against l2 adversarial attacks. Our results give new insights into how spectral norm regularization can mitigate the adversarial effects. Finally, we evaluate the power of our approach on a variety of data sets and network architectures and against some of the well-known adversarial attacks. △ Less

Submitted 11 November, 2019; originally announced November 2019.

arXiv:1911.02673 [pdf, other]

Towards the Use of Neural Networks for Influenza Prediction at Multiple Spatial Resolutions

Authors: Emily L. Aiken, Andre T. Nguyen, Mauricio Santillana

Abstract: We introduce the use of a Gated Recurrent Unit (GRU) for influenza prediction at the state- and city-level in the US, and experiment with the inclusion of real-time flu-related Internet search data. We find that a GRU has lower prediction error than current state-of-the-art methods for data-driven influenza prediction at time horizons of over two weeks. In contrast with other machine learning appr… ▽ More We introduce the use of a Gated Recurrent Unit (GRU) for influenza prediction at the state- and city-level in the US, and experiment with the inclusion of real-time flu-related Internet search data. We find that a GRU has lower prediction error than current state-of-the-art methods for data-driven influenza prediction at time horizons of over two weeks. In contrast with other machine learning approaches, the inclusion of real-time Internet search data does not improve GRU predictions. △ Less

Submitted 13 November, 2019; v1 submitted 6 November, 2019; originally announced November 2019.

Comments: Machine Learning for Health (ML4H) at NeurIPS 2019 - Extended Abstract; Added Footer

arXiv:1910.04753 [pdf]

Would a File by Any Other Name Seem as Malicious?

Authors: Andre T. Nguyen, Edward Raff, Aaron Sant-Miller

Abstract: Successful malware attacks on information technology systems can cause millions of dollars in damage, the exposure of sensitive and private information, and the irreversible destruction of data. Anti-virus systems that analyze a file's contents use a combination of static and dynamic analysis to detect and remove/remediate such malware. However, examining a file's entire contents is not always pos… ▽ More Successful malware attacks on information technology systems can cause millions of dollars in damage, the exposure of sensitive and private information, and the irreversible destruction of data. Anti-virus systems that analyze a file's contents use a combination of static and dynamic analysis to detect and remove/remediate such malware. However, examining a file's entire contents is not always possible in practice, as the volume and velocity of incoming data may be too high, or access to the underlying file contents may be restricted or unavailable. If it were possible to obtain estimates of a file's relative likelihood of being malicious without looking at the file contents, we could better prioritize file processing order and aid analysts in situations where a file is unavailable. In this work, we demonstrate that file names can contain information predictive of the presence of malware in a file. In particular, we show the effectiveness of a character-level convolutional neural network at predicting malware status using file names on Endgame's EMBER malware detection benchmark dataset. △ Less

Submitted 9 October, 2019; originally announced October 2019.

arXiv:1908.09219 [pdf, other]

Heterogeneous Relational Kernel Learning

Authors: Andre T. Nguyen, Edward Raff

Abstract: Recent work has developed Bayesian methods for the automatic statistical analysis and description of single time series as well as of homogeneous sets of time series data. We extend prior work to create an interpretable kernel embedding for heterogeneous time series. Our method adds practically no computational cost compared to prior results by leveraging previously discarded intermediate results.… ▽ More Recent work has developed Bayesian methods for the automatic statistical analysis and description of single time series as well as of homogeneous sets of time series data. We extend prior work to create an interpretable kernel embedding for heterogeneous time series. Our method adds practically no computational cost compared to prior results by leveraging previously discarded intermediate results. We show the practical utility of our method by leveraging the learned embeddings for clustering, pattern discovery, and anomaly detection. These applications are beyond the ability of prior relational kernel learning approaches. △ Less

Submitted 24 August, 2019; originally announced August 2019.

Comments: MileTS '19: 5th KDD Workshop on Mining and Learning from Time Series

arXiv:1907.07732 [pdf, other]

Connecting Lyapunov Control Theory to Adversarial Attacks

Authors: Arash Rahnama, Andre T. Nguyen, Edward Raff

Abstract: Significant work is being done to develop the math and tools necessary to build provable defenses, or at least bounds, against adversarial attacks of neural networks. In this work, we argue that tools from control theory could be leveraged to aid in defending against such attacks. We do this by example, building a provable defense against a weaker adversary. This is done so we can focus on the mec… ▽ More Significant work is being done to develop the math and tools necessary to build provable defenses, or at least bounds, against adversarial attacks of neural networks. In this work, we argue that tools from control theory could be leveraged to aid in defending against such attacks. We do this by example, building a provable defense against a weaker adversary. This is done so we can focus on the mechanisms of control theory, and illuminate its intrinsic value. △ Less

Submitted 17 July, 2019; originally announced July 2019.

Comments: 8 pages, 3 figures, AdvML'19: Workshop on Adversarial Learning Methods for Machine Learning and Data Mining at KDD

arXiv:1904.04710 [pdf]

Secure Biometric-based Remote Authentication Protocol using Chebyshev Polynomials and Fuzzy Extractor

Authors: Thi Ai Thao Nguyen, Tran Khanh Dang, Quynh Chi Truong, Dinh Thanh Nguyen

Abstract: In this paper, we have proposed a multi factor biometric-based remote authentication protocol. Our proposal overcomes the vulnerabilities of some previous works. At the same time, the protocol also obtains a low false accept rate (FAR) and false reject rate (FRR). In this paper, we have proposed a multi factor biometric-based remote authentication protocol. Our proposal overcomes the vulnerabilities of some previous works. At the same time, the protocol also obtains a low false accept rate (FAR) and false reject rate (FRR). △ Less

Submitted 9 April, 2019; originally announced April 2019.

Comments: RCCIE17

arXiv:1904.00264 [pdf]

A New Biometric Template Protection using Random Orthonormal Projection and Fuzzy Commitment

Authors: Thi Ai Thao Nguyen, Tran Khanh Dang, Dinh Thanh Nguyen

Abstract: Biometric template protection is one of most essential parts in putting a biometric-based authentication system into practice. There have been many researches proposing different solutions to secure biometric templates of users. They can be categorized into two approaches: feature transformation and biometric cryptosystem. However, no one single template protection approach can satisfy all the req… ▽ More Biometric template protection is one of most essential parts in putting a biometric-based authentication system into practice. There have been many researches proposing different solutions to secure biometric templates of users. They can be categorized into two approaches: feature transformation and biometric cryptosystem. However, no one single template protection approach can satisfy all the requirements of a secure biometric-based authentication system. In this work, we will propose a novel hybrid biometric template protection which takes benefits of both approaches while preventing their limitations. The experiments demonstrate that the performance of the system can be maintained with the support of a new random orthonormal project technique, which reduces the computational complexity while preserving the accuracy. Meanwhile, the security of biometric templates is guaranteed by employing fuzzy commitment protocol. △ Less

Submitted 30 March, 2019; originally announced April 2019.

Comments: 11 pages, 6 figures, accepted for IMCOM 2019

arXiv:1812.02885 [pdf]

Adversarial Attacks, Regression, and Numerical Stability Regularization

Authors: Andre T. Nguyen, Edward Raff

Abstract: Adversarial attacks against neural networks in a regression setting are a critical yet understudied problem. In this work, we advance the state of the art by investigating adversarial attacks against regression networks and by formulating a more effective defense against these attacks. In particular, we take the perspective that adversarial attacks are likely caused by numerical instability in lea… ▽ More Adversarial attacks against neural networks in a regression setting are a critical yet understudied problem. In this work, we advance the state of the art by investigating adversarial attacks against regression networks and by formulating a more effective defense against these attacks. In particular, we take the perspective that adversarial attacks are likely caused by numerical instability in learned functions. We introduce a stability inducing, regularization based defense against adversarial attacks in the regression setting. Our new and easy to implement defense is shown to outperform prior approaches and to improve the numerical stability of learned functions. △ Less

Submitted 6 December, 2018; originally announced December 2018.

Comments: Presented at the AAAI 2019 Workshop on Engineering Dependable and Secure Machine Learning Systems

arXiv:1810.07834 [pdf, other]

Superimposed Frame Synchronization Optimization for Finite Blocklength Regime

Authors: Alex The Phuong Nguyen, Raphaël Le Bidan, Frédéric Guilloud

Abstract: Considering a short frame length, which is typical in Ultra-Reliable Low-Latency and massive Machine Type Communications, a trade-off exists between improving the performance of frame synchronization (FS) and improving the performance of information throughput. In this paper, we consider the case of continuous transmission over AWGN channels where the synchronization sequence is superimposed to th… ▽ More Considering a short frame length, which is typical in Ultra-Reliable Low-Latency and massive Machine Type Communications, a trade-off exists between improving the performance of frame synchronization (FS) and improving the performance of information throughput. In this paper, we consider the case of continuous transmission over AWGN channels where the synchronization sequence is superimposed to the data symbols, as opposed to being added as a frame header. The advantage of this superposition is that the synchronization length is as long as the frame length. On the other hand, its power has to be traded-off not to degrade the code performance. We first provide the analysis of FS error probability using an approximation of the probability distribution of the overall received signal. Numerical evaluations show the tightness of our analytic results. Then we optimize the fraction of power allocated to the superimposed synchronization sequence in order to maximize the probability of receiving a frame without synchronization errors nor decoding errors. Comparison of the theoretical model predictions to a practical setup show very close optimal power allocation policies. △ Less

Submitted 9 March, 2019; v1 submitted 17 October, 2018; originally announced October 2018.

Comments: to appear at 2019 WCNC Workshop on Mathematical Tools and technologies for IoT and mMTC Networks Modeling (MoTION)

arXiv:1805.04168 [pdf, other]

doi 10.1109/TBME.2018.2885523

Achieving Super-Resolution with Redundant Sensing

Authors: Diu Khue Luu, Anh Tuan Nguyen, Zhi Yang

Abstract: Analog-to-digital (quantization) and digital-to-analog (de-quantization) conversion are fundamental operations of many information processing systems. In practice, the precision of these operations is always bounded, first by the random mismatch error (ME) occurred during system implementation, and subsequently by the intrinsic quantization error (QE) determined by the system architecture itself.… ▽ More Analog-to-digital (quantization) and digital-to-analog (de-quantization) conversion are fundamental operations of many information processing systems. In practice, the precision of these operations is always bounded, first by the random mismatch error (ME) occurred during system implementation, and subsequently by the intrinsic quantization error (QE) determined by the system architecture itself. In this manuscript, we present a new mathematical interpretation of the previously proposed redundant sensing (RS) architecture that not only suppresses ME but also allows achieving an effective resolution exceeding the system's intrinsic resolution, i.e. super-resolution (SR). SR is enabled by an endogenous property of redundant structures regarded as "code diffusion" where the references' value spreads into the neighbor sample space as a result of ME. The proposed concept opens the possibility for a wide range of applications in low-power fully-integrated sensors and devices where the cost-accuracy trade-off is inevitable. △ Less

Submitted 10 May, 2018; originally announced May 2018.

Journal ref: IEEE Transactions on Biomedical Engineering (2018)

arXiv:1802.05686 [pdf, other]

A Bio-inspired Redundant Sensing Architecture

Authors: Anh Tuan Nguyen, Jian Xu, Zhi Yang

Abstract: Sensing is the process of deriving signals from the environment that allows artificial systems to interact with the physical world. The Shannon theorem specifies the maximum rate at which information can be acquired. However, this upper bound is hard to achieve in many man-made systems. The biological visual systems, on the other hand, have highly efficient signal representation and processing mec… ▽ More Sensing is the process of deriving signals from the environment that allows artificial systems to interact with the physical world. The Shannon theorem specifies the maximum rate at which information can be acquired. However, this upper bound is hard to achieve in many man-made systems. The biological visual systems, on the other hand, have highly efficient signal representation and processing mechanisms that allow precise sensing. In this work, we argue that redundancy is one of the critical characteristics for such superior performance. We show architectural advantages by utilizing redundant sensing, including correction of mismatch error and significant precision enhancement. For a proof-of-concept demonstration, we have designed a heuristic-based analog-to-digital converter - a zero-dimensional quantizer. Through Monte Carlo simulation with the error probabilistic distribution as a priori, the performance approaching the Shannon limit is feasible. In actual measurements without knowing the error distribution, we observe at least 2-bit extra precision. The results may also help explain biological processes including the dominance of binocular vision, the functional roles of the fixational eye movements, and the structural mechanisms allowing hyperacuity. △ Less

Submitted 15 February, 2018; originally announced February 2018.

Journal ref: (2016) A Bio-inspired Redundant Sensing Architecture. Advances in Neural Information Processing Systems (NIPS), Dec. 2016

arXiv:1802.05324 [pdf, ps, other]

doi 10.1162/neco_a_01166

Advancing System Performance with Redundancy: From Biological to Artificial Designs

Authors: Anh Tuan Nguyen, Jian Xu, Diu Khue Luu, Qi Zhao, Zhi Yang

Abstract: Redundancy is a fundamental characteristic of many biological processes such as those in the genetic, visual, muscular and nervous system; yet its function has not been fully understood. The conventional interpretation of redundancy is that it serves as a fault-tolerance mechanism, which leads to redundancy's de facto application in man-made systems for reliability enhancement. On the contrary, ou… ▽ More Redundancy is a fundamental characteristic of many biological processes such as those in the genetic, visual, muscular and nervous system; yet its function has not been fully understood. The conventional interpretation of redundancy is that it serves as a fault-tolerance mechanism, which leads to redundancy's de facto application in man-made systems for reliability enhancement. On the contrary, our previous works have demonstrated an example where redundancy can be engineered solely for enhancing other aspects of the system, namely accuracy and precision. This design was inspired by the binocular structure of the human vision which we believe may share a similar operation. In this paper, we present a unified theory describing how such utilization of redundancy is feasible through two complementary mechanisms: representational redundancy (RPR) and entangled redundancy (ETR). Besides the previous works, we point out two additional examples where our new understanding of redundancy can be applied to justify a system's superior performance. One is the human musculoskeletal system (HMS) - a biological instance, and one is the deep residual neural network (ResNet) - an artificial counterpart. We envision that our theory would provide a framework for the future development of bio-inspired redundant artificial systems as well as assist the studies of the fundamental mechanisms governing various biological processes. △ Less

Submitted 14 February, 2018; originally announced February 2018.

Journal ref: Neural Computation, MIT Press, 2019

arXiv:1611.06792 [pdf, other]

Neural Information Retrieval: A Literature Review

Authors: Ye Zhang, Md Mustafizur Rahman, Alex Braylan, Brandon Dang, Heng-Lu Chang, Henna Kim, Quinten McNamara, Aaron Angert, Edward Banner, Vivek Khetan, Tyler McDonnell, An Thanh Nguyen, Dan Xu, Byron C. Wallace, Matthew Lease

Abstract: A recent "third wave" of Neural Network (NN) approaches now delivers state-of-the-art performance in many machine learning tasks, spanning speech recognition, computer vision, and natural language processing. Because these modern NNs often comprise multiple interconnected layers, this new NN research is often referred to as deep learning. Stemming from this tide of NN work, a number of researchers… ▽ More A recent "third wave" of Neural Network (NN) approaches now delivers state-of-the-art performance in many machine learning tasks, spanning speech recognition, computer vision, and natural language processing. Because these modern NNs often comprise multiple interconnected layers, this new NN research is often referred to as deep learning. Stemming from this tide of NN work, a number of researchers have recently begun to investigate NN approaches to Information Retrieval (IR). While deep NNs have yet to achieve the same level of success in IR as seen in other areas, the recent surge of interest and work in NNs for IR suggest that this state of affairs may be quickly changing. In this work, we survey the current landscape of Neural IR research, paying special attention to the use of learned representations of queries and documents (i.e., neural embeddings). We highlight the successes of neural IR thus far, catalog obstacles to its wider adoption, and suggest potentially promising directions for future research. △ Less

Submitted 3 March, 2017; v1 submitted 17 November, 2016; originally announced November 2016.

Comments: 44 pages

Showing 1–45 of 45 results for author: Nguyen, A T