subscribe to arXiv mailings

The 2024 Brain Tumor Segmentation (BraTS) Challenge: Glioma Segmentation on Post-treatment MRI

Authors: Maria Correia de Verdier, Rachit Saluja, Louis Gagnon, Dominic LaBella, Ujjwall Baid, Nourel Hoda Tahon, Martha Foltyn-Dumitru, Jikai Zhang, Maram Alafif, Saif Baig, Ken Chang, Gennaro D'Anna, Lisa Deptula, Diviya Gupta, Muhammad Ammar Haider, Ali Hussain, Michael Iv, Marinos Kontzialis, Paul Manning, Farzan Moodi, Teresa Nunes, Aaron Simon, Nico Sollmann, David Vu, Maruf Adewole , et al. (60 additional authors not shown)

Abstract: Gliomas are the most common malignant primary brain tumors in adults and one of the deadliest types of cancer. There are many challenges in treatment and monitoring due to the genetic diversity and high intrinsic heterogeneity in appearance, shape, histology, and treatment response. Treatments include surgery, radiation, and systemic therapies, with magnetic resonance imaging (MRI) playing a key r… ▽ More Gliomas are the most common malignant primary brain tumors in adults and one of the deadliest types of cancer. There are many challenges in treatment and monitoring due to the genetic diversity and high intrinsic heterogeneity in appearance, shape, histology, and treatment response. Treatments include surgery, radiation, and systemic therapies, with magnetic resonance imaging (MRI) playing a key role in treatment planning and post-treatment longitudinal assessment. The 2024 Brain Tumor Segmentation (BraTS) challenge on post-treatment glioma MRI will provide a community standard and benchmark for state-of-the-art automated segmentation models based on the largest expert-annotated post-treatment glioma MRI dataset. Challenge competitors will develop automated segmentation models to predict four distinct tumor sub-regions consisting of enhancing tissue (ET), surrounding non-enhancing T2/fluid-attenuated inversion recovery (FLAIR) hyperintensity (SNFH), non-enhancing tumor core (NETC), and resection cavity (RC). Models will be evaluated on separate validation and test datasets using standardized performance metrics utilized across the BraTS 2024 cluster of challenges, including lesion-wise Dice Similarity Coefficient and Hausdorff Distance. Models developed during this challenge will advance the field of automated MRI segmentation and contribute to their integration into clinical practice, ultimately enhancing patient care. △ Less

Submitted 28 May, 2024; originally announced May 2024.

Comments: 10 pages, 4 figures, 1 table

arXiv:2403.16685 [pdf, other]

ToXCL: A Unified Framework for Toxic Speech Detection and Explanation

Authors: Nhat M. Hoang, Xuan Long Do, Duc Anh Do, Duc Anh Vu, Luu Anh Tuan

Abstract: The proliferation of online toxic speech is a pertinent problem posing threats to demographic groups. While explicit toxic speech contains offensive lexical signals, implicit one consists of coded or indirect language. Therefore, it is crucial for models not only to detect implicit toxic speech but also to explain its toxicity. This draws a unique need for unified frameworks that can effectively d… ▽ More The proliferation of online toxic speech is a pertinent problem posing threats to demographic groups. While explicit toxic speech contains offensive lexical signals, implicit one consists of coded or indirect language. Therefore, it is crucial for models not only to detect implicit toxic speech but also to explain its toxicity. This draws a unique need for unified frameworks that can effectively detect and explain implicit toxic speech. Prior works mainly formulated the task of toxic speech detection and explanation as a text generation problem. Nonetheless, models trained using this strategy can be prone to suffer from the consequent error propagation problem. Moreover, our experiments reveal that the detection results of such models are much lower than those that focus only on the detection task. To bridge these gaps, we introduce ToXCL, a unified framework for the detection and explanation of implicit toxic speech. Our model consists of three modules: a (i) Target Group Generator to generate the targeted demographic group(s) of a given post; an (ii) Encoder-Decoder Model in which the encoder focuses on detecting implicit toxic speech and is boosted by a (iii) Teacher Classifier via knowledge distillation, and the decoder generates the necessary explanation. ToXCL achieves new state-of-the-art effectiveness, and outperforms baselines significantly. △ Less

Submitted 20 May, 2024; v1 submitted 25 March, 2024; originally announced March 2024.

Comments: Accepted at NAACL 2024 (Main Conference)

arXiv:2403.09302 [pdf, other]

StainFuser: Controlling Diffusion for Faster Neural Style Transfer in Multi-Gigapixel Histology Images

Authors: Robert Jewsbury, Ruoyu Wang, Abhir Bhalerao, Nasir Rajpoot, Quoc Dang Vu

Abstract: Stain normalization algorithms aim to transform the color and intensity characteristics of a source multi-gigapixel histology image to match those of a target image, mitigating inconsistencies in the appearance of stains used to highlight cellular components in the images. We propose a new approach, StainFuser, which treats this problem as a style transfer task using a novel Conditional Latent Dif… ▽ More Stain normalization algorithms aim to transform the color and intensity characteristics of a source multi-gigapixel histology image to match those of a target image, mitigating inconsistencies in the appearance of stains used to highlight cellular components in the images. We propose a new approach, StainFuser, which treats this problem as a style transfer task using a novel Conditional Latent Diffusion architecture, eliminating the need for handcrafted color components. With this method, we curate SPI-2M the largest stain normalization dataset to date of over 2 million histology images with neural style transfer for high-quality transformations. Trained on this data, StainFuser outperforms current state-of-the-art deep learning and handcrafted methods in terms of the quality of normalized images and in terms of downstream model performance on the CoNIC dataset. △ Less

Submitted 12 July, 2024; v1 submitted 14 March, 2024; originally announced March 2024.

arXiv:2402.19296 [pdf]

An AI based Digital Score of Tumour-Immune Microenvironment Predicts Benefit to Maintenance Immunotherapy in Advanced Oesophagogastric Adenocarcinoma

Authors: Quoc Dang Vu, Caroline Fong, Anderley Gordon, Tom Lund, Tatiany L Silveira, Daniel Rodrigues, Katharina von Loga, Shan E Ahmed Raza, David Cunningham, Nasir Rajpoot

Abstract: Gastric and oesophageal (OG) cancers are the leading causes of cancer mortality worldwide. In OG cancers, recent studies have showed that PDL1 immune checkpoint inhibitors (ICI) in combination with chemotherapy improves patient survival. However, our understanding of the tumour immune microenvironment in OG cancers remains limited. In this study, we interrogate multiplex immunofluorescence (mIF) i… ▽ More Gastric and oesophageal (OG) cancers are the leading causes of cancer mortality worldwide. In OG cancers, recent studies have showed that PDL1 immune checkpoint inhibitors (ICI) in combination with chemotherapy improves patient survival. However, our understanding of the tumour immune microenvironment in OG cancers remains limited. In this study, we interrogate multiplex immunofluorescence (mIF) images taken from patients with advanced Oesophagogastric Adenocarcinoma (OGA) who received first-line fluoropyrimidine and platinum-based chemotherapy in the PLATFORM trial (NCT02678182) to predict the efficacy of the treatment and to explore the biological basis of patients responding to maintenance durvalumab (PDL1 inhibitor). Our proposed Artificial Intelligence (AI) based marker successfully identified responder from non-responder (p < 0.05) as well as those who could potentially benefit from ICI with statistical significance (p < 0.05) for both progression free and overall survival. Our findings suggest that T cells that express FOXP3 seem to heavily influence the patient treatment response and survival outcome. We also observed that higher levels of CD8+PD1+ cells are consistently linked to poor prognosis for both OS and PFS, regardless of ICI. △ Less

Submitted 29 February, 2024; originally announced February 2024.

arXiv:2401.14268 [pdf, other]

GPTVoiceTasker: LLM-Powered Virtual Assistant for Smartphone

Authors: Minh Duc Vu, Han Wang, Zhuang Li, Jieshan Chen, Shengdong Zhao, Zhenchang Xing, Chunyang Chen

Abstract: Virtual assistants have the potential to play an important role in helping users achieves different tasks. However, these systems face challenges in their real-world usability, characterized by inefficiency and struggles in grasping user intentions. Leveraging recent advances in Large Language Models (LLMs), we introduce GptVoiceTasker, a virtual assistant poised to enhance user experiences and ta… ▽ More Virtual assistants have the potential to play an important role in helping users achieves different tasks. However, these systems face challenges in their real-world usability, characterized by inefficiency and struggles in grasping user intentions. Leveraging recent advances in Large Language Models (LLMs), we introduce GptVoiceTasker, a virtual assistant poised to enhance user experiences and task efficiency on mobile devices. GptVoiceTasker excels at intelligently deciphering user commands and executing relevant device interactions to streamline task completion. The system continually learns from historical user commands to automate subsequent usages, further enhancing execution efficiency. Our experiments affirm GptVoiceTasker's exceptional command interpretation abilities and the precision of its task automation module. In our user study, GptVoiceTasker boosted task efficiency in real-world scenarios by 34.85%, accompanied by positive participant feedback. We made GptVoiceTasker open-source, inviting further research into LLMs utilization for diverse tasks through prompt engineering and leveraging user usage data to improve efficiency. △ Less

Submitted 25 January, 2024; originally announced January 2024.

arXiv:2312.17205 [pdf, other]

EFHQ: Multi-purpose ExtremePose-Face-HQ dataset

Authors: Trung Tuan Dao, Duc Hong Vu, Cuong Pham, Anh Tran

Abstract: The existing facial datasets, while having plentiful images at near frontal views, lack images with extreme head poses, leading to the downgraded performance of deep learning models when dealing with profile or pitched faces. This work aims to address this gap by introducing a novel dataset named Extreme Pose Face High-Quality Dataset (EFHQ), which includes a maximum of 450k high-quality images of… ▽ More The existing facial datasets, while having plentiful images at near frontal views, lack images with extreme head poses, leading to the downgraded performance of deep learning models when dealing with profile or pitched faces. This work aims to address this gap by introducing a novel dataset named Extreme Pose Face High-Quality Dataset (EFHQ), which includes a maximum of 450k high-quality images of faces at extreme poses. To produce such a massive dataset, we utilize a novel and meticulous dataset processing pipeline to curate two publicly available datasets, VFHQ and CelebV-HQ, which contain many high-resolution face videos captured in various settings. Our dataset can complement existing datasets on various facial-related tasks, such as facial synthesis with 2D/3D-aware GAN, diffusion-based text-to-image face generation, and face reenactment. Specifically, training with EFHQ helps models generalize well across diverse poses, significantly improving performance in scenarios involving extreme views, confirmed by extensive experiments. Additionally, we utilize EFHQ to define a challenging cross-view face verification benchmark, in which the performance of SOTA face recognition models drops 5-37% compared to frontal-to-frontal scenarios, aiming to stimulate studies on face recognition under severe pose conditions in the wild. △ Less

Submitted 11 April, 2024; v1 submitted 28 December, 2023; originally announced December 2023.

Comments: Project Page: https://bomcon123456.github.io/efhq/

arXiv:2312.09982 [pdf, other]

ACPO: AI-Enabled Compiler-Driven Program Optimization

Authors: Amir H. Ashouri, Muhammad Asif Manzoor, Duc Minh Vu, Raymond Zhang, Ziwen Wang, Angel Zhang, Bryan Chan, Tomasz S. Czajkowski, Yaoqing Gao

Abstract: The key to performance optimization of a program is to decide correctly when a certain transformation should be applied by a compiler. This is an ideal opportunity to apply machine-learning models to speed up the tuning process; while this realization has been around since the late 90s, only recent advancements in ML enabled a practical application of ML to compilers as an end-to-end framework.… ▽ More The key to performance optimization of a program is to decide correctly when a certain transformation should be applied by a compiler. This is an ideal opportunity to apply machine-learning models to speed up the tuning process; while this realization has been around since the late 90s, only recent advancements in ML enabled a practical application of ML to compilers as an end-to-end framework. This paper presents ACPO: \textbf{\underline{A}}I-Enabled \textbf{\underline{C}}ompiler-driven \textbf{\underline{P}}rogram \textbf{\underline{O}}ptimization; a novel framework to provide LLVM with simple and comprehensive tools to benefit from employing ML models for different optimization passes. We first showcase the high-level view, class hierarchy, and functionalities of ACPO and subsequently, demonstrate a couple of use cases of ACPO by ML-enabling the Loop Unroll and Function Inlining passes and describe how ACPO can be leveraged to optimize other passes. Experimental results reveal that ACPO model for Loop Unroll is able to gain on average 4\% compared to LLVM's O3 optimization when deployed on Polybench. Furthermore, by adding the Inliner model as well, ACPO is able to provide up to 4.5\% and 2.4\% on Polybench and Cbench compared with LLVM's O3 optimization, respectively. △ Less

Submitted 11 March, 2024; v1 submitted 15 December, 2023; originally announced December 2023.

Comments: Preprint version of ACPO (12 pages)

ACM Class: I.2.5; D.3.0; I.2.6

arXiv:2312.02227 [pdf, other]

Improving Multimodal Sentiment Analysis: Supervised Angular Margin-based Contrastive Learning for Enhanced Fusion Representation

Authors: Cong-Duy Nguyen, Thong Nguyen, Duc Anh Vu, Luu Anh Tuan

Abstract: The effectiveness of a model is heavily reliant on the quality of the fusion representation of multiple modalities in multimodal sentiment analysis. Moreover, each modality is extracted from raw input and integrated with the rest to construct a multimodal representation. Although previous methods have proposed multimodal representations and achieved promising results, most of them focus on forming… ▽ More The effectiveness of a model is heavily reliant on the quality of the fusion representation of multiple modalities in multimodal sentiment analysis. Moreover, each modality is extracted from raw input and integrated with the rest to construct a multimodal representation. Although previous methods have proposed multimodal representations and achieved promising results, most of them focus on forming positive and negative pairs, neglecting the variation in sentiment scores within the same class. Additionally, they fail to capture the significance of unimodal representations in the fusion vector. To address these limitations, we introduce a framework called Supervised Angular-based Contrastive Learning for Multimodal Sentiment Analysis. This framework aims to enhance discrimination and generalizability of the multimodal representation and overcome biases in the fusion vector's modality. Our experimental results, along with visualizations on two widely used datasets, demonstrate the effectiveness of our approach. △ Less

Submitted 3 December, 2023; originally announced December 2023.

arXiv:2312.01661 [pdf, other]

ChatGPT as a Math Questioner? Evaluating ChatGPT on Generating Pre-university Math Questions

Authors: Phuoc Pham Van Long, Duc Anh Vu, Nhat M. Hoang, Xuan Long Do, Anh Tuan Luu

Abstract: Mathematical questioning is crucial for assessing students problem-solving skills. Since manually creating such questions requires substantial effort, automatic methods have been explored. Existing state-of-the-art models rely on fine-tuning strategies and struggle to generate questions that heavily involve multiple steps of logical and arithmetic reasoning. Meanwhile, large language models(LLMs)… ▽ More Mathematical questioning is crucial for assessing students problem-solving skills. Since manually creating such questions requires substantial effort, automatic methods have been explored. Existing state-of-the-art models rely on fine-tuning strategies and struggle to generate questions that heavily involve multiple steps of logical and arithmetic reasoning. Meanwhile, large language models(LLMs) such as ChatGPT have excelled in many NLP tasks involving logical and arithmetic reasoning. Nonetheless, their applications in generating educational questions are underutilized, especially in the field of mathematics. To bridge this gap, we take the first step to conduct an in-depth analysis of ChatGPT in generating pre-university math questions. Our analysis is categorized into two main settings: context-aware and context-unaware. In the context-aware setting, we evaluate ChatGPT on existing math question-answering benchmarks covering elementary, secondary, and ternary classes. In the context-unaware setting, we evaluate ChatGPT in generating math questions for each lesson from pre-university math curriculums that we crawl. Our crawling results in TopicMath, a comprehensive and novel collection of pre-university math curriculums collected from 121 math topics and 428 lessons from elementary, secondary, and tertiary classes. Through this analysis, we aim to provide insight into the potential of ChatGPT as a math questioner. △ Less

Submitted 27 February, 2024; v1 submitted 4 December, 2023; originally announced December 2023.

Comments: Accepted at the 39th ACM/SIGAPP Symposium On Applied Computing (SAC 2024), Main Conference

arXiv:2311.14465 [pdf, other]

DP-NMT: Scalable Differentially-Private Machine Translation

Authors: Timour Igamberdiev, Doan Nam Long Vu, Felix Künnecke, Zhuo Yu, Jannik Holmer, Ivan Habernal

Abstract: Neural machine translation (NMT) is a widely popular text generation task, yet there is a considerable research gap in the development of privacy-preserving NMT models, despite significant data privacy concerns for NMT systems. Differentially private stochastic gradient descent (DP-SGD) is a popular method for training machine learning models with concrete privacy guarantees; however, the implemen… ▽ More Neural machine translation (NMT) is a widely popular text generation task, yet there is a considerable research gap in the development of privacy-preserving NMT models, despite significant data privacy concerns for NMT systems. Differentially private stochastic gradient descent (DP-SGD) is a popular method for training machine learning models with concrete privacy guarantees; however, the implementation specifics of training a model with DP-SGD are not always clarified in existing models, with differing software libraries used and code bases not always being public, leading to reproducibility issues. To tackle this, we introduce DP-NMT, an open-source framework for carrying out research on privacy-preserving NMT with DP-SGD, bringing together numerous models, datasets, and evaluation metrics in one systematic software package. Our goal is to provide a platform for researchers to advance the development of privacy-preserving NMT systems, keeping the specific details of the DP-SGD algorithm transparent and intuitive to implement. We run a set of experiments on datasets from both general and privacy-related domains to demonstrate our framework in use. We make our framework publicly available and welcome feedback from the community. △ Less

Submitted 24 April, 2024; v1 submitted 24 November, 2023; originally announced November 2023.

Comments: Accepted at EACL 2024

arXiv:2311.08834 [pdf, ps, other]

A* search algorithm for an optimal investment problem in vehicle-sharing systems

Authors: Ba Luat Le, Layla Martin, Emrah Demir, Duc Minh Vu

Abstract: We study an optimal investment problem that arises in the context of the vehicle-sharing system. Given a set of locations to build stations, we need to determine i) the sequence of stations to be built and the number of vehicles to acquire in order to obtain the target state where all stations are built, and ii) the number of vehicles to acquire and their allocation in order to maximize the total… ▽ More We study an optimal investment problem that arises in the context of the vehicle-sharing system. Given a set of locations to build stations, we need to determine i) the sequence of stations to be built and the number of vehicles to acquire in order to obtain the target state where all stations are built, and ii) the number of vehicles to acquire and their allocation in order to maximize the total profit returned by operating the system when some or all stations are open. The profitability associated with operating open stations, measured over a specific time period, is represented as a linear optimization problem applied to a collection of open stations. With operating capital, the owner of the system can open new stations. This property introduces a set-dependent aspect to the duration required for opening a new station, and the optimal investment problem can be viewed as a variant of the Traveling Salesman Problem (TSP) with set-dependent cost. We propose an A* search algorithm to address this particular variant of the TSP. Computational experiments highlight the benefits of the proposed algorithm in comparison to the widely recognized Dijkstra algorithm and propose future research to explore new possibilities and applications for both exact and approximate A* algorithms. △ Less

Submitted 15 November, 2023; originally announced November 2023.

Comments: Full version of the conference paper which is accepted to be appear in the proceeding of the The 12th International Conference on Computational Data and Social Networks - SCONET2023

arXiv:2311.08111 [pdf, ps, other]

Solving Time-Dependent Traveling Salesman Problem with Time Windows under Generic Time-Dependent Travel Cost

Authors: Duc Minh Vu, Mike Hewitt, Duc Duy Vu

Abstract: In this paper, we present formulations and an exact method to solve the Time Dependent Traveling Salesman Problem with Time Window (TD-TSPTW) under a generic travel cost function where waiting is allowed. A particular case in which the travel cost is a non-decreasing function has been addressed recently. With that assumption, because of both the First-In-First-Out property of the travel time funct… ▽ More In this paper, we present formulations and an exact method to solve the Time Dependent Traveling Salesman Problem with Time Window (TD-TSPTW) under a generic travel cost function where waiting is allowed. A particular case in which the travel cost is a non-decreasing function has been addressed recently. With that assumption, because of both the First-In-First-Out property of the travel time function and the non-decreasing property of the travel cost function, we can ignore the possibility of waiting. However, for generic travel cost functions, waiting after visiting some locations can be part of optimal solutions. To handle the general case, we introduce new lower-bound formulations that allow us to ensure the existence of optimal solutions. We adapt the existing algorithm for TD-TSPTW with non-decreasing travel costs to solve the TD-TSPTW with generic travel costs. In the experiment, we evaluate the strength of the proposed lower bound formulations and algorithm by applying them to solve the TD-TSPTW with the total travel time objective. The results indicate that the proposed algorithm is competitive with and even outperforms the state-of-art solver in various benchmark instances. △ Less

Submitted 14 November, 2023; originally announced November 2023.

Comments: Full version with Appendix - accepted to appear in the proceeding of SCONET2003 conference - https://csonet-conf.github.io/csonet23/index.html

arXiv:2309.01656 [pdf, other]

Building Footprint Extraction in Dense Areas using Super Resolution and Frame Field Learning

Authors: Vuong Nguyen, Anh Ho, Duc-Anh Vu, Nguyen Thi Ngoc Anh, Tran Ngoc Thang

Abstract: Despite notable results on standard aerial datasets, current state-of-the-arts fail to produce accurate building footprints in dense areas due to challenging properties posed by these areas and limited data availability. In this paper, we propose a framework to address such issues in polygonal building extraction. First, super resolution is employed to enhance the spatial resolution of aerial imag… ▽ More Despite notable results on standard aerial datasets, current state-of-the-arts fail to produce accurate building footprints in dense areas due to challenging properties posed by these areas and limited data availability. In this paper, we propose a framework to address such issues in polygonal building extraction. First, super resolution is employed to enhance the spatial resolution of aerial image, allowing for finer details to be captured. This enhanced imagery serves as input to a multitask learning module, which consists of a segmentation head and a frame field learning head to effectively handle the irregular building structures. Our model is supervised by adaptive loss weighting, enabling extraction of sharp edges and fine-grained polygons which is difficult due to overlapping buildings and low data quality. Extensive experiments on a slum area in India that mimics a dense area demonstrate that our proposed approach significantly outperforms the current state-of-the-art methods by a large margin. △ Less

Submitted 4 September, 2023; originally announced September 2023.

Comments: Accepted at The 12th International Conference on Awareness Science and Technology

arXiv:2308.16139 [pdf, other]

MedShapeNet -- A Large-Scale Dataset of 3D Medical Shapes for Computer Vision

Authors: Jianning Li, Zongwei Zhou, Jiancheng Yang, Antonio Pepe, Christina Gsaxner, Gijs Luijten, Chongyu Qu, Tiezheng Zhang, Xiaoxi Chen, Wenxuan Li, Marek Wodzinski, Paul Friedrich, Kangxian Xie, Yuan Jin, Narmada Ambigapathy, Enrico Nasca, Naida Solak, Gian Marco Melito, Viet Duc Vu, Afaque R. Memon, Christopher Schlachta, Sandrine De Ribaupierre, Rajnikant Patel, Roy Eagleson, Xiaojun Chen , et al. (132 additional authors not shown)

Abstract: Prior to the deep learning era, shape was commonly used to describe the objects. Nowadays, state-of-the-art (SOTA) algorithms in medical imaging are predominantly diverging from computer vision, where voxel grids, meshes, point clouds, and implicit surface models are used. This is seen from numerous shape-related publications in premier vision conferences as well as the growing popularity of Shape… ▽ More Prior to the deep learning era, shape was commonly used to describe the objects. Nowadays, state-of-the-art (SOTA) algorithms in medical imaging are predominantly diverging from computer vision, where voxel grids, meshes, point clouds, and implicit surface models are used. This is seen from numerous shape-related publications in premier vision conferences as well as the growing popularity of ShapeNet (about 51,300 models) and Princeton ModelNet (127,915 models). For the medical domain, we present a large collection of anatomical shapes (e.g., bones, organs, vessels) and 3D models of surgical instrument, called MedShapeNet, created to facilitate the translation of data-driven vision algorithms to medical applications and to adapt SOTA vision algorithms to medical problems. As a unique feature, we directly model the majority of shapes on the imaging data of real patients. As of today, MedShapeNet includes 23 dataset with more than 100,000 shapes that are paired with annotations (ground truth). Our data is freely accessible via a web interface and a Python application programming interface (API) and can be used for discriminative, reconstructive, and variational benchmarks as well as various applications in virtual, augmented, or mixed reality, and 3D printing. Exemplary, we present use cases in the fields of classification of brain tumors, facial and skull reconstructions, multi-class anatomy completion, education, and 3D printing. In future, we will extend the data and improve the interfaces. The project pages are: https://medshapenet.ikim.nrw/ and https://github.com/Jianningli/medshapenet-feedback △ Less

Submitted 12 December, 2023; v1 submitted 30 August, 2023; originally announced August 2023.

Comments: 16 pages

MSC Class: 68T01

arXiv:2307.10212 [pdf, other]

doi 10.1587/transfun.2020EAP1101

Capsule network with shortcut routing

Authors: Dang Thanh Vu, Vo Hoang Trong, Yu Gwang-Hyun, Kim Jin-Young

Abstract: This study introduces "shortcut routing," a novel routing mechanism in capsule networks that addresses computational inefficiencies by directly activating global capsules from local capsules, eliminating intermediate layers. An attention-based approach with fuzzy coefficients is also explored for improved efficiency. Experimental results on Mnist, smallnorb, and affNist datasets show comparable cl… ▽ More This study introduces "shortcut routing," a novel routing mechanism in capsule networks that addresses computational inefficiencies by directly activating global capsules from local capsules, eliminating intermediate layers. An attention-based approach with fuzzy coefficients is also explored for improved efficiency. Experimental results on Mnist, smallnorb, and affNist datasets show comparable classification performance, achieving accuracies of 99.52%, 93.91%, and 89.02% respectively. The proposed fuzzy-based and attention-based routing methods significantly reduce the number of calculations by 1.42 and 2.5 times compared to EM routing, highlighting their computational advantages in capsule networks. These findings contribute to the advancement of efficient and accurate hierarchical pattern representation models. △ Less

Submitted 14 July, 2023; originally announced July 2023.

Comments: 8 pages, published at IEICE Transactions on Fundamentals of Electronics Communications and Computer Sciences E104.A(8)

arXiv:2305.08409 [pdf, other]

Validity Constraints for Data Analysis Workflows

Authors: Florian Schintke, Ninon De Mecquenem, David Frantz, Vanessa Emanuela Guarino, Marcus Hilbrich, Fabian Lehmann, Rebecca Sattler, Jan Arne Sparka, Daniel Speckhard, Hermann Stolte, Anh Duc Vu, Ulf Leser

Abstract: Porting a scientific data analysis workflow (DAW) to a cluster infrastructure, a new software stack, or even only a new dataset with some notably different properties is often challenging. Despite the structured definition of the steps (tasks) and their interdependencies during a complex data analysis in the DAW specification, relevant assumptions may remain unspecified and implicit. Such hidden a… ▽ More Porting a scientific data analysis workflow (DAW) to a cluster infrastructure, a new software stack, or even only a new dataset with some notably different properties is often challenging. Despite the structured definition of the steps (tasks) and their interdependencies during a complex data analysis in the DAW specification, relevant assumptions may remain unspecified and implicit. Such hidden assumptions often lead to crashing tasks without a reasonable error message, poor performance in general, non-terminating executions, or silent wrong results of the DAW, to name only a few possible consequences. Searching for the causes of such errors and drawbacks in a distributed compute cluster managed by a complex infrastructure stack, where DAWs for large datasets typically are executed, can be tedious and time-consuming. We propose validity constraints (VCs) as a new concept for DAW languages to alleviate this situation. A VC is a constraint specifying some logical conditions that must be fulfilled at certain times for DAW executions to be valid. When defined together with a DAW, VCs help to improve the portability, adaptability, and reusability of DAWs by making implicit assumptions explicit. Once specified, VC can be controlled automatically by the DAW infrastructure, and violations can lead to meaningful error messages and graceful behaviour (e.g., termination or invocation of repair mechanisms). We provide a broad list of possible VCs, classify them along multiple dimensions, and compare them to similar concepts one can find in related fields. We also provide a first sketch for VCs' implementation into existing DAW infrastructures. △ Less

Submitted 15 May, 2023; originally announced May 2023.

arXiv:2305.05198 [pdf, other]

doi 10.1145/3581998

Voicify Your UI: Towards Android App Control with Voice Commands

Authors: Minh Duc Vu, Han Wang, Zhuang Li, Gholamreza Haffari, Zhenchang Xing, Chunyang Chen

Abstract: Nowadays, voice assistants help users complete tasks on the smartphone with voice commands, replacing traditional touchscreen interactions when such interactions are inhibited. However, the usability of those tools remains moderate due to the problems in understanding rich language variations in human commands, along with efficiency and comprehensibility issues. Therefore, we introduce Voicify, an… ▽ More Nowadays, voice assistants help users complete tasks on the smartphone with voice commands, replacing traditional touchscreen interactions when such interactions are inhibited. However, the usability of those tools remains moderate due to the problems in understanding rich language variations in human commands, along with efficiency and comprehensibility issues. Therefore, we introduce Voicify, an Android virtual assistant that allows users to interact with on-screen elements in mobile apps through voice commands. Using a novel deep learning command parser, Voicify interprets human verbal input and performs matching with UI elements. In addition, the tool can directly open a specific feature from installed applications by fetching application code information to explore the set of in-app components. Our command parser achieved 90\% accuracy on the human command dataset. Furthermore, the direct feature invocation module achieves better feature coverage in comparison to Google Assistant. The user study demonstrates the usefulness of Voicify in real-world scenarios. △ Less

Submitted 9 May, 2023; originally announced May 2023.

Journal ref: Proceedings of the ACM on Interactive, Mobile, Wearable and Ubiquitous Technologies 7, no. 1 (2023): 1-22

arXiv:2303.06274 [pdf]

CoNIC Challenge: Pushing the Frontiers of Nuclear Detection, Segmentation, Classification and Counting

Authors: Simon Graham, Quoc Dang Vu, Mostafa Jahanifar, Martin Weigert, Uwe Schmidt, Wenhua Zhang, Jun Zhang, Sen Yang, Jinxi Xiang, Xiyue Wang, Josef Lorenz Rumberger, Elias Baumann, Peter Hirsch, Lihao Liu, Chenyang Hong, Angelica I. Aviles-Rivero, Ayushi Jain, Heeyoung Ahn, Yiyu Hong, Hussam Azzuni, Min Xu, Mohammad Yaqub, Marie-Claire Blache, Benoît Piégu, Bertrand Vernay , et al. (64 additional authors not shown)

Abstract: Nuclear detection, segmentation and morphometric profiling are essential in helping us further understand the relationship between histology and patient outcome. To drive innovation in this area, we setup a community-wide challenge using the largest available dataset of its kind to assess nuclear segmentation and cellular composition. Our challenge, named CoNIC, stimulated the development of repro… ▽ More Nuclear detection, segmentation and morphometric profiling are essential in helping us further understand the relationship between histology and patient outcome. To drive innovation in this area, we setup a community-wide challenge using the largest available dataset of its kind to assess nuclear segmentation and cellular composition. Our challenge, named CoNIC, stimulated the development of reproducible algorithms for cellular recognition with real-time result inspection on public leaderboards. We conducted an extensive post-challenge analysis based on the top-performing models using 1,658 whole-slide images of colon tissue. With around 700 million detected nuclei per model, associated features were used for dysplasia grading and survival analysis, where we demonstrated that the challenge's improvement over the previous state-of-the-art led to significant boosts in downstream performance. Our findings also suggest that eosinophils and neutrophils play an important role in the tumour microevironment. We release challenge models and WSI-level results to foster the development of further methods for biomarker discovery. △ Less

Submitted 14 March, 2023; v1 submitted 10 March, 2023; originally announced March 2023.

arXiv:2301.03418 [pdf, other]

doi 10.1007/978-3-031-21014-3_26

Nuclear Segmentation and Classification: On Color & Compression Generalization

Authors: Quoc Dang Vu, Robert Jewsbury, Simon Graham, Mostafa Jahanifar, Shan E Ahmed Raza, Fayyaz Minhas, Abhir Bhalerao, Nasir Rajpoot

Abstract: Since the introduction of digital and computational pathology as a field, one of the major problems in the clinical application of algorithms has been the struggle to generalize well to examples outside the distribution of the training data. Existing work to address this in both pathology and natural images has focused almost exclusively on classification tasks. We explore and evaluate the robustn… ▽ More Since the introduction of digital and computational pathology as a field, one of the major problems in the clinical application of algorithms has been the struggle to generalize well to examples outside the distribution of the training data. Existing work to address this in both pathology and natural images has focused almost exclusively on classification tasks. We explore and evaluate the robustness of the 7 best performing nuclear segmentation and classification models from the largest computational pathology challenge for this problem to date, the CoNIC challenge. We demonstrate that existing state-of-the-art (SoTA) models are robust towards compression artifacts but suffer substantial performance reduction when subjected to shifts in the color domain. We find that using stain normalization to address the domain shift problem can be detrimental to the model performance. On the other hand, neural style transfer is more consistent in improving test performance when presented with large color variations in the wild. △ Less

Submitted 9 January, 2023; originally announced January 2023.

Comments: Oral presentation at MICCAI MLMI 2022, 7 pages, 6 figures

arXiv:2211.12744 [pdf, other]

doi 10.1109/BigData55660.2022.10020864

Towards Advanced Monitoring for Scientific Workflows

Authors: Jonathan Bader, Joel Witzke, Soeren Becker, Ansgar Lößer, Fabian Lehmann, Leon Doehler, Anh Duc Vu, Odej Kao

Abstract: Scientific workflows consist of thousands of highly parallelized tasks executed in a distributed environment involving many components. Automatic tracing and investigation of the components' and tasks' performance metrics, traces, and behavior are necessary to support the end user with a level of abstraction since the large amount of data cannot be analyzed manually. The execution and monitoring o… ▽ More Scientific workflows consist of thousands of highly parallelized tasks executed in a distributed environment involving many components. Automatic tracing and investigation of the components' and tasks' performance metrics, traces, and behavior are necessary to support the end user with a level of abstraction since the large amount of data cannot be analyzed manually. The execution and monitoring of scientific workflows involves many components, the cluster infrastructure, its resource manager, the workflow, and the workflow tasks. All components in such an execution environment access different monitoring metrics and provide metrics on different abstraction levels. The combination and analysis of observed metrics from different components and their interdependencies are still widely unregarded. We specify four different monitoring layers that can serve as an architectural blueprint for the monitoring responsibilities and the interactions of components in the scientific workflow execution context. We describe the different monitoring metrics subject to the four layers and how the layers interact. Finally, we examine five state-of-the-art scientific workflow management systems (SWMS) in order to assess which steps are needed to enable our four-layer-based approach. △ Less

Submitted 18 July, 2023; v1 submitted 23 November, 2022; originally announced November 2022.

Comments: Paper accepted in 2022 IEEE International Conference on Big Data Workshop SCDM 2022

arXiv:2209.13288 [pdf, other]

doi 10.1109/ICSE48619.2023.00052

A Benchmark Comparison of Python Malware Detection Approaches

Authors: Duc-Ly Vu, Zachary Newman, John Speed Meyers

Abstract: While attackers often distribute malware to victims via open-source, community-driven package repositories, these repositories do not currently run automated malware detection systems. In this work, we explore the security goals of the repository administrators and the requirements for deployments of such malware scanners via a case study of the Python ecosystem and PyPI repository, which includes… ▽ More While attackers often distribute malware to victims via open-source, community-driven package repositories, these repositories do not currently run automated malware detection systems. In this work, we explore the security goals of the repository administrators and the requirements for deployments of such malware scanners via a case study of the Python ecosystem and PyPI repository, which includes interviews with administrators and maintainers. Further, we evaluate existing malware detection techniques for deployment in this setting by creating a benchmark dataset and comparing several existing tools, including the malware checks implemented in PyPI, Bandit4Mal, and OSSGadget's OSS Detect Backdoor. We find that repository administrators have exacting technical demands for such malware detection tools. Specifically, they consider a false positive rate of even 0.01% to be unacceptably high, given the large number of package releases that might trigger false alerts. Measured tools have false positive rates between 15% and 97%; increasing thresholds for detection rules to reduce this rate renders the true positive rate useless. In some cases, these checks emitted alerts more often for benign packages than malicious ones. However, we also find a successful socio-technical malware detection system: external security researchers also perform repository malware scans and report the results to repository administrators. These parties face different incentives and constraints on their time and tooling. We conclude with recommendations for improving detection capabilities and strengthening the collaboration between security researchers and software repository administrators. △ Less

Submitted 27 September, 2022; originally announced September 2022.

Comments: 12 pages, 3 figures, 3 tables

arXiv:2209.02317 [pdf, other]

Layer or Representation Space: What makes BERT-based Evaluation Metrics Robust?

Authors: Doan Nam Long Vu, Nafise Sadat Moosavi, Steffen Eger

Abstract: The evaluation of recent embedding-based evaluation metrics for text generation is primarily based on measuring their correlation with human evaluations on standard benchmarks. However, these benchmarks are mostly from similar domains to those used for pretraining word embeddings. This raises concerns about the (lack of) generalization of embedding-based metrics to new and noisy domains that conta… ▽ More The evaluation of recent embedding-based evaluation metrics for text generation is primarily based on measuring their correlation with human evaluations on standard benchmarks. However, these benchmarks are mostly from similar domains to those used for pretraining word embeddings. This raises concerns about the (lack of) generalization of embedding-based metrics to new and noisy domains that contain a different vocabulary than the pretraining data. In this paper, we examine the robustness of BERTScore, one of the most popular embedding-based metrics for text generation. We show that (a) an embedding-based metric that has the highest correlation with human evaluations on a standard benchmark can have the lowest correlation if the amount of input noise or unknown tokens increases, (b) taking embeddings from the first layer of pretrained models improves the robustness of all metrics, and (c) the highest robustness is achieved when using character-level embeddings, instead of token-based embeddings, from the first layer of the pretrained model. △ Less

Submitted 7 September, 2022; v1 submitted 6 September, 2022; originally announced September 2022.

Comments: COLING 2022 camera-ready version

arXiv:2209.01311 [pdf, other]

A Novel Self-Knowledge Distillation Approach with Siamese Representation Learning for Action Recognition

Authors: Duc-Quang Vu, Trang Phung, Jia-Ching Wang

Abstract: Knowledge distillation is an effective transfer of knowledge from a heavy network (teacher) to a small network (student) to boost students' performance. Self-knowledge distillation, the special case of knowledge distillation, has been proposed to remove the large teacher network training process while preserving the student's performance. This paper introduces a novel Self-knowledge distillation a… ▽ More Knowledge distillation is an effective transfer of knowledge from a heavy network (teacher) to a small network (student) to boost students' performance. Self-knowledge distillation, the special case of knowledge distillation, has been proposed to remove the large teacher network training process while preserving the student's performance. This paper introduces a novel Self-knowledge distillation approach via Siamese representation learning, which minimizes the difference between two representation vectors of the two different views from a given sample. Our proposed method, SKD-SRL, utilizes both soft label distillation and the similarity of representation vectors. Therefore, SKD-SRL can generate more consistent predictions and representations in various views of the same data point. Our benchmark has been evaluated on various standard datasets. The experimental results have shown that SKD-SRL significantly improves the accuracy compared to existing supervised learning and knowledge distillation methods regardless of the networks. △ Less

Submitted 2 September, 2022; originally announced September 2022.

Comments: VCIP 2021

arXiv:2208.11052 [pdf, other]

IMPaSh: A Novel Domain-shift Resistant Representation for Colorectal Cancer Tissue Classification

Authors: Trinh Thi Le Vuong, Quoc Dang Vu, Mostafa Jahanifar, Simon Graham, Jin Tae Kwak, Nasir Rajpoot

Abstract: The appearance of histopathology images depends on tissue type, staining and digitization procedure. These vary from source to source and are the potential causes for domain-shift problems. Owing to this problem, despite the great success of deep learning models in computational pathology, a model trained on a specific domain may still perform sub-optimally when we apply them to another domain. To… ▽ More The appearance of histopathology images depends on tissue type, staining and digitization procedure. These vary from source to source and are the potential causes for domain-shift problems. Owing to this problem, despite the great success of deep learning models in computational pathology, a model trained on a specific domain may still perform sub-optimally when we apply them to another domain. To overcome this, we propose a new augmentation called PatchShuffling and a novel self-supervised contrastive learning framework named IMPaSh for pre-training deep learning models. Using these, we obtained a ResNet50 encoder that can extract image representation resistant to domain-shift. We compared our derived representation against those acquired based on other domain-generalization techniques by using them for the cross-domain classification of colorectal tissue images. We show that the proposed method outperforms other traditional histology domain-adaptation and state-of-the-art self-supervised learning methods. Code is available at: https://github.com/trinhvg/IMPash . △ Less

Submitted 23 August, 2022; originally announced August 2022.

Comments: Accepted in ECCV2022 MCV Workshop

arXiv:2204.10717 [pdf, ps, other]

doi 10.1109/kse53942.2021.9648724

A Summary of the ALQAC 2021 Competition

Authors: Nguyen Ha Thanh, Bui Minh Quan, Chau Nguyen, Tung Le, Nguyen Minh Phuong, Dang Tran Binh, Vuong Thi Hai Yen, Teeradaj Racharak, Nguyen Le Minh, Tran Duc Vu, Phan Viet Anh, Nguyen Truong Son, Huy Tien Nguyen, Bhumindr Butr-indr, Peerapon Vateekul, Prachya Boonkwan

Abstract: We summarize the evaluation of the first Automated Legal Question Answering Competition (ALQAC 2021). The competition this year contains three tasks, which aims at processing the statute law document, which are Legal Text Information Retrieval (Task 1), Legal Text Entailment Prediction (Task 2), and Legal Text Question Answering (Task 3). The final goal of these tasks is to build a system that can… ▽ More We summarize the evaluation of the first Automated Legal Question Answering Competition (ALQAC 2021). The competition this year contains three tasks, which aims at processing the statute law document, which are Legal Text Information Retrieval (Task 1), Legal Text Entailment Prediction (Task 2), and Legal Text Question Answering (Task 3). The final goal of these tasks is to build a system that can automatically determine whether a particular statement is lawful. There is no limit to the approaches of the participating teams. This year, there are 5 teams participating in Task 1, 6 teams participating in Task 2, and 5 teams participating in Task 3. There are in total 36 runs submitted to the organizer. In this paper, we summarize each team's approaches, official results, and some discussion about the competition. Only results of the teams who successfully submit their approach description paper are reported in this paper. △ Less

Submitted 24 April, 2022; v1 submitted 22 April, 2022; originally announced April 2022.

arXiv:2203.09035 [pdf]

HybridNets: End-to-End Perception Network

Authors: Dat Vu, Bao Ngo, Hung Phan

Abstract: End-to-end Network has become increasingly important in multi-tasking. One prominent example of this is the growing significance of a driving perception system in autonomous driving. This paper systematically studies an end-to-end perception network for multi-tasking and proposes several key optimizations to improve accuracy. First, the paper proposes efficient segmentation head and box/class pred… ▽ More End-to-end Network has become increasingly important in multi-tasking. One prominent example of this is the growing significance of a driving perception system in autonomous driving. This paper systematically studies an end-to-end perception network for multi-tasking and proposes several key optimizations to improve accuracy. First, the paper proposes efficient segmentation head and box/class prediction networks based on weighted bidirectional feature network. Second, the paper proposes automatically customized anchor for each level in the weighted bidirectional feature network. Third, the paper proposes an efficient training loss function and training strategy to balance and optimize network. Based on these optimizations, we have developed an end-to-end perception network to perform multi-tasking, including traffic object detection, drivable area segmentation and lane detection simultaneously, called HybridNets, which achieves better accuracy than prior art. In particular, HybridNets achieves 77.3 mean Average Precision on Berkeley DeepDrive Dataset, outperforms lane detection with 31.6 mean Intersection Over Union with 12.83 million parameters and 15.6 billion floating-point operations. In addition, it can perform visual perception tasks in real-time and thus is a practical and accurate solution to the multi-tasking problem. Code is available at https://github.com/datvuthanh/HybridNets. △ Less

Submitted 16 March, 2022; originally announced March 2022.

arXiv:2203.03724 [pdf, other]

A New Era: Intelligent Tutoring Systems Will Transform Online Learning for Millions

Authors: Francois St-Hilaire, Dung Do Vu, Antoine Frau, Nathan Burns, Farid Faraji, Joseph Potochny, Stephane Robert, Arnaud Roussel, Selene Zheng, Taylor Glazier, Junfel Vincent Romano, Robert Belfer, Muhammad Shayan, Ariella Smofsky, Tommy Delarosbil, Seulmin Ahn, Simon Eden-Walker, Kritika Sony, Ansona Onyi Ching, Sabina Elkins, Anush Stepanyan, Adela Matajova, Victor Chen, Hossein Sahraei, Robert Larson , et al. (6 additional authors not shown)

Abstract: Despite artificial intelligence (AI) having transformed major aspects of our society, less than a fraction of its potential has been explored, let alone deployed, for education. AI-powered learning can provide millions of learners with a highly personalized, active and practical learning experience, which is key to successful learning. This is especially relevant in the context of online learning… ▽ More Despite artificial intelligence (AI) having transformed major aspects of our society, less than a fraction of its potential has been explored, let alone deployed, for education. AI-powered learning can provide millions of learners with a highly personalized, active and practical learning experience, which is key to successful learning. This is especially relevant in the context of online learning platforms. In this paper, we present the results of a comparative head-to-head study on learning outcomes for two popular online learning platforms (n=199 participants): A MOOC platform following a traditional model delivering content using lecture videos and multiple-choice quizzes, and the Korbit learning platform providing a highly personalized, active and practical learning experience. We observe a huge and statistically significant increase in the learning outcomes, with students on the Korbit platform providing full feedback resulting in higher course completion rates and achieving learning gains 2 to 2.5 times higher than both students on the MOOC platform and students in a control group who don't receive personalized feedback on the Korbit platform. The results demonstrate the tremendous impact that can be achieved with a personalized, active learning AI-powered system. Making this technology and learning experience available to millions of learners around the world will represent a significant leap forward towards the democratization of education. △ Less

Submitted 3 March, 2022; originally announced March 2022.

Comments: 9 pages, 6 figures

ACM Class: I.2.0; K.3.1; K.4.0

arXiv:2203.00077 [pdf, other]

One Model is All You Need: Multi-Task Learning Enables Simultaneous Histology Image Segmentation and Classification

Authors: Simon Graham, Quoc Dang Vu, Mostafa Jahanifar, Shan E Ahmed Raza, Fayyaz Minhas, David Snead, Nasir Rajpoot

Abstract: The recent surge in performance for image analysis of digitised pathology slides can largely be attributed to the advances in deep learning. Deep models can be used to initially localise various structures in the tissue and hence facilitate the extraction of interpretable features for biomarker discovery. However, these models are typically trained for a single task and therefore scale poorly as w… ▽ More The recent surge in performance for image analysis of digitised pathology slides can largely be attributed to the advances in deep learning. Deep models can be used to initially localise various structures in the tissue and hence facilitate the extraction of interpretable features for biomarker discovery. However, these models are typically trained for a single task and therefore scale poorly as we wish to adapt the model for an increasing number of different tasks. Also, supervised deep learning models are very data hungry and therefore rely on large amounts of training data to perform well. In this paper, we present a multi-task learning approach for segmentation and classification of nuclei, glands, lumina and different tissue regions that leverages data from multiple independent data sources. While ensuring that our tasks are aligned by the same tissue type and resolution, we enable meaningful simultaneous prediction with a single network. As a result of feature sharing, we also show that the learned representation can be used to improve the performance of additional tasks via transfer learning, including nuclear classification and signet ring cell detection. As part of this work, we train our developed Cerberus model on a huge amount of data, consisting of over 600K objects for segmentation and 440K patches for classification. We use our approach to process 599 colorectal whole-slide images from TCGA, where we localise 377 million, 900K and 2.1 million nuclei, glands and lumina, respectively and make the results available to the community for downstream analysis. △ Less

Submitted 14 November, 2022; v1 submitted 28 February, 2022; originally announced March 2022.

arXiv:2202.07001 [pdf, other]

Handcrafted Histological Transformer (H2T): Unsupervised Representation of Whole Slide Images

Authors: Quoc Dang Vu, Kashif Rajpoot, Shan E Ahmed Raza, Nasir Rajpoot

Abstract: Diagnostic, prognostic and therapeutic decision-making of cancer in pathology clinics can now be carried out based on analysis of multi-gigapixel tissue images, also known as whole-slide images (WSIs). Recently, deep convolutional neural networks (CNNs) have been proposed to derive unsupervised WSI representations; these are attractive as they rely less on expert annotation which is cumbersome. Ho… ▽ More Diagnostic, prognostic and therapeutic decision-making of cancer in pathology clinics can now be carried out based on analysis of multi-gigapixel tissue images, also known as whole-slide images (WSIs). Recently, deep convolutional neural networks (CNNs) have been proposed to derive unsupervised WSI representations; these are attractive as they rely less on expert annotation which is cumbersome. However, a major trade-off is that higher predictive power generally comes at the cost of interpretability, posing a challenge to their clinical use where transparency in decision-making is generally expected. To address this challenge, we present a handcrafted framework based on deep CNN for constructing holistic WSI-level representations. Building on recent findings about the internal working of the Transformer in the domain of natural language processing, we break down its processes and handcraft them into a more transparent framework that we term as the Handcrafted Histological Transformer or H2T. Based on our experiments involving various datasets consisting of a total of 5,306 WSIs, the results demonstrate that H2T based holistic WSI-level representations offer competitive performance compared to recent state-of-the-art methods and can be readily utilized for various downstream analysis tasks. Finally, our results demonstrate that the H2T framework can be up to 14 times faster than the Transformer models. △ Less

Submitted 6 September, 2022; v1 submitted 14 February, 2022; originally announced February 2022.

arXiv:2201.02985 [pdf, other]

Routing in an Uncertain World: Adaptivity, Efficiency, and Equilibrium

Authors: Dong Quan Vu, Kimon Antonakopoulos, Panayotis Mertikopoulos

Abstract: We consider the traffic assignment problem in nonatomic routing games where the players' cost functions may be subject to random fluctuations (e.g., weather disturbances, perturbations in the underlying network, etc.). We tackle this problem from the viewpoint of a control interface that makes routing recommendations based solely on observed costs and without any further knowledge of the system's… ▽ More We consider the traffic assignment problem in nonatomic routing games where the players' cost functions may be subject to random fluctuations (e.g., weather disturbances, perturbations in the underlying network, etc.). We tackle this problem from the viewpoint of a control interface that makes routing recommendations based solely on observed costs and without any further knowledge of the system's governing dynamics -- such as the network's cost functions, the distribution of any random events affecting the network, etc. In this online setting, learning methods based on the popular exponential weights algorithm converge to equilibrium at an $\mathcal{O}({1/\sqrt{T}})$ rate: this rate is known to be order-optimal in stochastic networks, but it is otherwise suboptimal in static networks. In the latter case, it is possible to achieve an $\mathcal{O}({1/T^{2}})$ equilibrium convergence rate via the use of finely tuned accelerated algorithms; on the other hand, these accelerated algorithms fail to converge altogether in the presence of persistent randomness, so it is not clear how to achieve the "best of both worlds" in terms of convergence speed. Our paper seeks to fill this gap by proposing an adaptive routing algortihm with the following desirable properties: $(i)$ it seamlessly interpolates between the $\mathcal{O}({1/T^{2}})$ and $\mathcal{O}({1/\sqrt{T}})$ rates for static and stochastic environments respectively; $(ii)$ its convergence speed is polylogarithmic in the number of paths in the network; ${(iii)}$ the method's per-iteration complexity and memory requirements are both linear in the number of nodes and edges in the network; and ${(iv)}$ it does not require any prior knowledge of the problem's parameters. △ Less

Submitted 9 January, 2022; originally announced January 2022.

arXiv:2111.14485 [pdf, other]

CoNIC: Colon Nuclei Identification and Counting Challenge 2022

Authors: Simon Graham, Mostafa Jahanifar, Quoc Dang Vu, Giorgos Hadjigeorghiou, Thomas Leech, David Snead, Shan E Ahmed Raza, Fayyaz Minhas, Nasir Rajpoot

Abstract: Nuclear segmentation, classification and quantification within Haematoxylin & Eosin stained histology images enables the extraction of interpretable cell-based features that can be used in downstream explainable models in computational pathology (CPath). However, automatic recognition of different nuclei is faced with a major challenge in that there are several different types of nuclei, some of t… ▽ More Nuclear segmentation, classification and quantification within Haematoxylin & Eosin stained histology images enables the extraction of interpretable cell-based features that can be used in downstream explainable models in computational pathology (CPath). However, automatic recognition of different nuclei is faced with a major challenge in that there are several different types of nuclei, some of them exhibiting large intra-class variability. To help drive forward research and innovation for automatic nuclei recognition in CPath, we organise the Colon Nuclei Identification and Counting (CoNIC) Challenge. The challenge encourages researchers to develop algorithms that perform segmentation, classification and counting of nuclei within the current largest known publicly available nuclei-level dataset in CPath, containing around half a million labelled nuclei. Therefore, the CoNIC challenge utilises over 10 times the number of nuclei as the previous largest challenge dataset for nuclei recognition. It is important for algorithms to be robust to input variation if we wish to deploy them in a clinical setting. Therefore, as part of this challenge we will also test the sensitivity of each submitted algorithm to certain input variations. △ Less

Submitted 29 November, 2021; originally announced November 2021.

arXiv:2110.05153 [pdf, other]

Decentralized sliding-mode control laws for the bearing-based formation tracking problem

Authors: Dung Van Vu, Minh Hoang Trinh

Abstract: This paper studies the time-varying bearing-based tracking of leader-follower formations. The desired constraints between agents are specified by bearing vectors, and several leaders are moving with a bounded reference velocity. Each followers can measure the relative positions of its neighbors, its own velocities, and receive information from their neighbors. Under the assumptions that the desire… ▽ More This paper studies the time-varying bearing-based tracking of leader-follower formations. The desired constraints between agents are specified by bearing vectors, and several leaders are moving with a bounded reference velocity. Each followers can measure the relative positions of its neighbors, its own velocities, and receive information from their neighbors. Under the assumptions that the desired formation is infinitesimally bearing rigid and the local reference frames of followers are aligned with each other, two control laws are presented in this paper based on sliding mode control approach. Stability analyses are given based on Lyapunov stability theory and supported by numerical simulations. △ Less

Submitted 11 October, 2021; originally announced October 2021.

Comments: Accepted to ICCAIS 2021

arXiv:2106.00617 [pdf, other]

Colonel Blotto Games with Favoritism: Competitions with Pre-allocations and Asymmetric Effectiveness

Authors: Dong Quan Vu, Patrick Loiseau

Abstract: We introduce the Colonel Blotto game with favoritism, an extension of the famous Colonel Blotto game where the winner-determination rule is generalized to include pre-allocations and asymmetry of the players' resources effectiveness on each battlefield. Such favoritism is found in many classical applications of the Colonel Blotto game. We focus on the Nash equilibrium. First, we consider the close… ▽ More We introduce the Colonel Blotto game with favoritism, an extension of the famous Colonel Blotto game where the winner-determination rule is generalized to include pre-allocations and asymmetry of the players' resources effectiveness on each battlefield. Such favoritism is found in many classical applications of the Colonel Blotto game. We focus on the Nash equilibrium. First, we consider the closely related model of all-pay auctions with favoritism and completely characterize its equilibrium. Based on this result, we prove the existence of a set of optimal univariate distributions -- which serve as candidate marginals for an equilibrium -- of the Colonel Blotto game with favoritism and show an explicit construction thereof. In several particular cases, this directly leads to an equilibrium of the Colonel Blotto game with favoritism. In other cases, we use these optimal univariate distributions to derive an approximate equilibrium with well-controlled approximation error. Finally, we propose an algorithm -- based on the notion of winding number in parametric curves -- to efficiently compute an approximation of the proposed optimal univariate distributions with arbitrarily small error. △ Less

Submitted 1 June, 2021; originally announced June 2021.

Comments: Full paper version of ACM EC conference paper: https://doi.org/10.1145/3465456.3467527

arXiv:2104.08126 [pdf, other]

Exploiting Global and Local Attentions for Heavy Rain Removal on Single Images

Authors: Dac Tung Vu, Juan Luis Gonzalez, Munchurl Kim

Abstract: Heavy rain removal from a single image is the task of simultaneously eliminating rain streaks and fog, which can dramatically degrade the quality of captured images. Most existing rain removal methods do not generalize well for the heavy rain case. In this work, we propose a novel network architecture consisting of three sub-networks to remove heavy rain from a single image without estimating rain… ▽ More Heavy rain removal from a single image is the task of simultaneously eliminating rain streaks and fog, which can dramatically degrade the quality of captured images. Most existing rain removal methods do not generalize well for the heavy rain case. In this work, we propose a novel network architecture consisting of three sub-networks to remove heavy rain from a single image without estimating rain streaks and fog separately. The first sub-net, a U-net-based architecture that incorporates our Spatial Channel Attention (SCA) blocks, extracts global features that provide sufficient contextual information needed to remove atmospheric distortions caused by rain and fog. The second sub-net learns the additive residues information, which is useful in removing rain streak artifacts via our proposed Residual Inception Modules (RIM). The third sub-net, the multiplicative sub-net, adopts our Channel-attentive Inception Modules (CIM) and learns the essential brighter local features which are not effectively extracted in the SCA and additive sub-nets by modulating the local pixel intensities in the derained images. Our three clean image results are then combined via an attentive blending block to generate the final clean image. Our method with SCA, RIM, and CIM significantly outperforms the previous state-of-the-art single-image deraining methods on the synthetic datasets, shows considerably cleaner and sharper derained estimates on the real image datasets. We present extensive experiments and ablation studies supporting each of our method's contributions on both synthetic and real image datasets. △ Less

Submitted 16 April, 2021; originally announced April 2021.

arXiv:2104.07763 [pdf, other]

Comparative Study of Learning Outcomes for Online Learning Platforms

Authors: Francois St-Hilaire, Nathan Burns, Robert Belfer, Muhammad Shayan, Ariella Smofsky, Dung Do Vu, Antoine Frau, Joseph Potochny, Farid Faraji, Vincent Pavero, Neroli Ko, Ansona Onyi Ching, Sabina Elkins, Anush Stepanyan, Adela Matajova, Laurent Charlin, Yoshua Bengio, Iulian Vlad Serban, Ekaterina Kochmar

Abstract: Personalization and active learning are key aspects to successful learning. These aspects are important to address in intelligent educational applications, as they help systems to adapt and close the gap between students with varying abilities, which becomes increasingly important in the context of online and distance learning. We run a comparative head-to-head study of learning outcomes for two p… ▽ More Personalization and active learning are key aspects to successful learning. These aspects are important to address in intelligent educational applications, as they help systems to adapt and close the gap between students with varying abilities, which becomes increasingly important in the context of online and distance learning. We run a comparative head-to-head study of learning outcomes for two popular online learning platforms: Platform A, which follows a traditional model delivering content over a series of lecture videos and multiple-choice quizzes, and Platform B, which creates a personalized learning environment and provides problem-solving exercises and personalized feedback. We report on the results of our study using pre- and post-assessment quizzes with participants taking courses on an introductory data science topic on two platforms. We observe a statistically significant increase in the learning outcomes on Platform B, highlighting the impact of well-designed and well-engineered technology supporting active learning and problem-based learning in online education. Moreover, the results of the self-assessment questionnaire, where participants reported on perceived learning gains, suggest that participants using Platform B improve their metacognition. △ Less

Submitted 15 April, 2021; originally announced April 2021.

Comments: 14 pages, 3 figures, 2 tables, accepted at AIED 2021 (2021 Conference on Artificial Intelligence in Education)

ACM Class: I.2.0; I.2.1; I.2.7; K.3.1; G.4

arXiv:2102.10378 [pdf, other]

Self-Supervised Learning via multi-Transformation Classification for Action Recognition

Authors: Duc Quang Vu, Ngan T. H. Le, Jia-Ching Wang

Abstract: Self-supervised tasks have been utilized to build useful representations that can be used in downstream tasks when the annotation is unavailable. In this paper, we introduce a self-supervised video representation learning method based on the multi-transformation classification to efficiently classify human actions. Self-supervised learning on various transformations not only provides richer contex… ▽ More Self-supervised tasks have been utilized to build useful representations that can be used in downstream tasks when the annotation is unavailable. In this paper, we introduce a self-supervised video representation learning method based on the multi-transformation classification to efficiently classify human actions. Self-supervised learning on various transformations not only provides richer contextual information but also enables the visual representation more robust to the transforms. The spatio-temporal representation of the video is learned in a self-supervised manner by classifying seven different transformations i.e. rotation, clip inversion, permutation, split, join transformation, color switch, frame replacement, noise addition. First, seven different video transformations are applied to video clips. Then the 3D convolutional neural networks are utilized to extract features for clips and these features are processed to classify the pseudo-labels. We use the learned models in pretext tasks as the pre-trained models and fine-tune them to recognize human actions in the downstream task. We have conducted the experiments on UCF101 and HMDB51 datasets together with C3D and 3D Resnet-18 as backbone networks. The experimental results have shown that our proposed framework is outperformed other SOTA self-supervised action recognition approaches. The code will be made publicly available. △ Less

Submitted 20 February, 2021; originally announced February 2021.

arXiv:2012.09990 [pdf, other]

An Improved Approach for Estimating Social POI Boundaries With Textual Attributes on Social Media

Authors: Cong Tran, Dung D. Vu, Won-Yong Shin

Abstract: It has been insufficiently explored how to perform density-based clustering by exploiting textual attributes on social media. In this paper, we aim at discovering a social point-of-interest (POI) boundary, formed as a convex polygon. More specifically, we present a new approach and algorithm, built upon our earlier work on social POI boundary estimation (SoBEst). This SoBEst approach takes into ac… ▽ More It has been insufficiently explored how to perform density-based clustering by exploiting textual attributes on social media. In this paper, we aim at discovering a social point-of-interest (POI) boundary, formed as a convex polygon. More specifically, we present a new approach and algorithm, built upon our earlier work on social POI boundary estimation (SoBEst). This SoBEst approach takes into account both relevant and irrelevant records within a geographic area, where relevant records contain a POI name or its variations in their text field. Our study is motivated by the following empirical observation: a fixed representative coordinate of each POI that SoBEst basically assumes may be far away from the centroid of the estimated social POI boundary for certain POIs. Thus, using SoBEst in such cases may possibly result in unsatisfactory performance on the boundary estimation quality (BEQ), which is expressed as a function of the $F$-measure. To solve this problem, we formulate a joint optimization problem of simultaneously finding the radius of a circle and the POI's representative coordinate $c$ by allowing to update $c$. Subsequently, we design an iterative SoBEst (I-SoBEst) algorithm, which enables us to achieve a higher degree of BEQ for some POIs. The computational complexity of the proposed I-SoBEst algorithm is shown to scale linearly with the number of records. We demonstrate the superiority of our algorithm over competing clustering methods including the original SoBEst. △ Less

Submitted 17 December, 2020; originally announced December 2020.

Comments: 13 pages, 6 figures, 5 tables; to appear in the Knowledge-Based Systems (Please cite our journal version that will appear in an upcoming issue.)

arXiv:2006.12218 [pdf, other]

Risk Communication in Asian Countries: COVID-19 Discourse on Twitter

Authors: Sungkyu Park, Sungwon Han, Jeongwook Kim, Mir Majid Molaie, Hoang Dieu Vu, Karandeep Singh, Jiyoung Han, Wonjae Lee, Meeyoung Cha

Abstract: COVID-19 has become one of the most widely talked about topics on social media. This research characterizes risk communication patterns by analyzing the public discourse on the novel coronavirus from four Asian countries: South Korea, Iran, Vietnam, and India, which suffered the outbreak to different degrees. The temporal analysis shows that the official epidemic phases issued by governments do no… ▽ More COVID-19 has become one of the most widely talked about topics on social media. This research characterizes risk communication patterns by analyzing the public discourse on the novel coronavirus from four Asian countries: South Korea, Iran, Vietnam, and India, which suffered the outbreak to different degrees. The temporal analysis shows that the official epidemic phases issued by governments do not match well with the online attention on COVID-19. This finding calls for a need to analyze the public discourse by new measures, such as topical dynamics. Here, we propose an automatic method to detect topical phase transitions and compare similarities in major topics across these countries over time. We examine the time lag difference between social media attention and confirmed patient counts. For dynamics, we find an inverse relationship between the tweet count and topical diversity. △ Less

Submitted 14 August, 2020; v1 submitted 22 June, 2020; originally announced June 2020.

Comments: 10 pages (+ 12 pages of Appendices), 18 figures, 2 tables

arXiv:2005.06616 [pdf, other]

A Large-Scale, Open-Domain, Mixed-Interface Dialogue-Based ITS for STEM

Authors: Iulian Vlad Serban, Varun Gupta, Ekaterina Kochmar, Dung D. Vu, Robert Belfer, Joelle Pineau, Aaron Courville, Laurent Charlin, Yoshua Bengio

Abstract: We present Korbit, a large-scale, open-domain, mixed-interface, dialogue-based intelligent tutoring system (ITS). Korbit uses machine learning, natural language processing and reinforcement learning to provide interactive, personalized learning online. Korbit has been designed to easily scale to thousands of subjects, by automating, standardizing and simplifying the content creation process. Unlik… ▽ More We present Korbit, a large-scale, open-domain, mixed-interface, dialogue-based intelligent tutoring system (ITS). Korbit uses machine learning, natural language processing and reinforcement learning to provide interactive, personalized learning online. Korbit has been designed to easily scale to thousands of subjects, by automating, standardizing and simplifying the content creation process. Unlike other ITS, a teacher can develop new learning modules for Korbit in a matter of hours. To facilitate learning across a widerange of STEM subjects, Korbit uses a mixed-interface, which includes videos, interactive dialogue-based exercises, question-answering, conceptual diagrams, mathematical exercises and gamification elements. Korbit has been built to scale to millions of students, by utilizing a state-of-the-art cloud-based micro-service architecture. Korbit launched its first course in 2019 on machine learning, and since then over 7,000 students have enrolled. Although Korbit was designed to be open-domain and highly scalable, A/B testing experiments with real-world students demonstrate that both student learning outcomes and student motivation are substantially improved compared to typical online courses. △ Less

Submitted 5 May, 2020; originally announced May 2020.

Comments: 6 pages, 1 figure, 1 table, accepted for publication in the 21st International Conference on Artificial Intelligence in Education (AIED 2020)

ACM Class: I.2.0; I.2.1; I.2.7; K.3.1; G.4

arXiv:2005.02431 [pdf, other]

Automated Personalized Feedback Improves Learning Gains in an Intelligent Tutoring System

Authors: Ekaterina Kochmar, Dung Do Vu, Robert Belfer, Varun Gupta, Iulian Vlad Serban, Joelle Pineau

Abstract: We investigate how automated, data-driven, personalized feedback in a large-scale intelligent tutoring system (ITS) improves student learning outcomes. We propose a machine learning approach to generate personalized feedback, which takes individual needs of students into account. We utilize state-of-the-art machine learning and natural language processing techniques to provide the students with pe… ▽ More We investigate how automated, data-driven, personalized feedback in a large-scale intelligent tutoring system (ITS) improves student learning outcomes. We propose a machine learning approach to generate personalized feedback, which takes individual needs of students into account. We utilize state-of-the-art machine learning and natural language processing techniques to provide the students with personalized hints, Wikipedia-based explanations, and mathematical hints. Our model is used in Korbit, a large-scale dialogue-based ITS with thousands of students launched in 2019, and we demonstrate that the personalized feedback leads to considerable improvement in student learning outcomes and in the subjective evaluation of the feedback. △ Less

Submitted 7 May, 2020; v1 submitted 5 May, 2020; originally announced May 2020.

Comments: To be published in Proceedings of the the 21st International Conference on Artificial Intelligence in Education (AIED 2020)

arXiv:2004.02275 [pdf, other]

The two-echelon routing problem with truck and drones

Authors: Minh Hoàng Hà, Lam Vu, Duy Manh Vu

Abstract: In this paper, we study novel variants of the well-known two-echelon vehicle routing problem in which a truck works on the first echelon to transport parcels and a fleet of drones to intermediate depots while in the second echelon, the drones are used to deliver parcels from intermediate depots to customers. The objective is to minimize the completion time instead of the transportation cost as in… ▽ More In this paper, we study novel variants of the well-known two-echelon vehicle routing problem in which a truck works on the first echelon to transport parcels and a fleet of drones to intermediate depots while in the second echelon, the drones are used to deliver parcels from intermediate depots to customers. The objective is to minimize the completion time instead of the transportation cost as in classical 2-echelon vehicle routing problems. Depending on the context, a drone can be launched from the truck at an intermediate depot once (single trip drone) or several times (multiple trip drone). Mixed Integer Linear Programming (MILP) models are first proposed to formulate mathematically the problems and solve to optimality small-size instances. To handle larger instances, a metaheuristic based on the idea of Greedy Randomized Adaptive Search Procedure (GRASP) is introduced. Experimental results obtained on instances of different contexts are reported and analyzed. △ Less

Submitted 5 April, 2020; originally announced April 2020.

Comments: 29 pages, 4 figures

arXiv:2001.09735 [pdf]

An AI model for Rapid and Accurate Identification of Chemical Agents in Mass Casualty Incidents

Authors: Nicholas Boltin, Daniel Vu, Bethany Janos, Alyssa Shofner, Joan Culley, Homayoun Valafar

Abstract: In this report we examine the effectiveness of WISER in identification of a chemical culprit during a chemical based Mass Casualty Incident (MCI). We also evaluate and compare Binary Decision Tree (BDT) and Artificial Neural Networks (ANN) using the same experimental conditions as WISER. The reverse engineered set of Signs/Symptoms from the WISER application was used as the training set and 31,100… ▽ More In this report we examine the effectiveness of WISER in identification of a chemical culprit during a chemical based Mass Casualty Incident (MCI). We also evaluate and compare Binary Decision Tree (BDT) and Artificial Neural Networks (ANN) using the same experimental conditions as WISER. The reverse engineered set of Signs/Symptoms from the WISER application was used as the training set and 31,100 simulated patient records were used as the testing set. Three sets of simulated patient records were generated by 5%, 10% and 15% perturbation of the Signs/Symptoms of each chemical record. While all three methods achieved a 100% training accuracy, WISER, BDT and ANN produced performances in the range of: 1.8%-0%, 65%-26%, 67%-21% respectively. A preliminary investigation of dimensional reduction using ANN illustrated a dimensional collapse from 79 variables to 40 with little loss of classification performance. △ Less

Submitted 12 December, 2019; originally announced January 2020.

Comments: 7 pages, published in HIMS 2016

arXiv:1912.02945 [pdf]

doi 10.1145/3388176.3388187

A pedestrian path-planning model in accordance with obstacle's danger with reinforcement learning

Authors: Thanh-Trung Trinh, Dinh-Minh Vu, Masaomi Kimura

Abstract: Most microscopic pedestrian navigation models use the concept of "forces" applied to the pedestrian agents to replicate the navigation environment. While the approach could provide believable results in regular situations, it does not always resemble natural pedestrian navigation behaviour in many typical settings. In our research, we proposed a novel approach using reinforcement learning for simu… ▽ More Most microscopic pedestrian navigation models use the concept of "forces" applied to the pedestrian agents to replicate the navigation environment. While the approach could provide believable results in regular situations, it does not always resemble natural pedestrian navigation behaviour in many typical settings. In our research, we proposed a novel approach using reinforcement learning for simulation of pedestrian agent path planning and collision avoidance problem. The primary focus of this approach is using human perception of the environment and danger awareness of interferences. The implementation of our model has shown that the path planned by the agent shares many similarities with a human pedestrian in several aspects such as following common walking conventions and human behaviours. △ Less

Submitted 5 December, 2019; originally announced December 2019.

arXiv:1911.09023

Path Planning Problems with Side Observations-When Colonels Play Hide-and-Seek

Authors: Dong Quan Vu, Patrick Loiseau, Alonso Silva, Long Tran-Thanh

Abstract: Resource allocation games such as the famous Colonel Blotto (CB) and Hide-and-Seek (HS) games are often used to model a large variety of practical problems, but only in their one-shot versions. Indeed, due to their extremely large strategy space, it remains an open question how one can efficiently learn in these games. In this work, we show that the online CB and HS games can be cast as path plann… ▽ More Resource allocation games such as the famous Colonel Blotto (CB) and Hide-and-Seek (HS) games are often used to model a large variety of practical problems, but only in their one-shot versions. Indeed, due to their extremely large strategy space, it remains an open question how one can efficiently learn in these games. In this work, we show that the online CB and HS games can be cast as path planning problems with side-observations (SOPPP): at each stage, a learner chooses a path on a directed acyclic graph and suffers the sum of losses that are adversarially assigned to the corresponding edges; and she then receives semi-bandit feedback with side-observations (i.e., she observes the losses on the chosen edges plus some others). We propose a novel algorithm, EXP3-OE, the first-of-its-kind with guaranteed efficient running time for SOPPP without requiring any auxiliary oracle. We provide an expected-regret bound of EXP3-OE in SOPPP matching the order of the best benchmark in the literature. Moreover, we introduce additional assumptions on the observability model under which we can further improve the regret bounds of EXP3-OE. We illustrate the benefit of using EXP3-OE in SOPPP by applying it to the online CB and HS games. △ Less

Submitted 21 November, 2019; v1 submitted 19 November, 2019; originally announced November 2019.

Comments: This work was as replacement for arXiv:1905.11151 but was mistakenly submitted here as a new article

arXiv:1910.06761 [pdf, other]

Causal Mechanism Transfer Network for Time Series Domain Adaptation in Mechanical Systems

Authors: Zijian Li, Ruichu Cai, Kok Soon Chai, Hong Wei Ng, Hoang Dung Vu, Marianne Winslett, Tom Z. J. Fu, Boyan Xu, Xiaoyan Yang, Zhenjie Zhang

Abstract: Data-driven models are becoming essential parts in modern mechanical systems, commonly used to capture the behavior of various equipment and varying environmental characteristics. Despite the advantages of these data-driven models on excellent adaptivity to high dynamics and aging equipment, they are usually hungry to massive labels over historical data, mostly contributed by human engineers at an… ▽ More Data-driven models are becoming essential parts in modern mechanical systems, commonly used to capture the behavior of various equipment and varying environmental characteristics. Despite the advantages of these data-driven models on excellent adaptivity to high dynamics and aging equipment, they are usually hungry to massive labels over historical data, mostly contributed by human engineers at an extremely high cost. The label demand is now the major limiting factor to modeling accuracy, hindering the fulfillment of visions for applications. Fortunately, domain adaptation enhances the model generalization by utilizing the labelled source data as well as the unlabelled target data and then we can reuse the model on different domains. However, the mainstream domain adaptation methods cannot achieve ideal performance on time series data, because most of them focus on static samples and even the existing time series domain adaptation methods ignore the properties of time series data, such as temporal causal mechanism. In this paper, we assume that causal mechanism is invariant and present our Causal Mechanism Transfer Network(CMTN) for time series domain adaptation. By capturing and transferring the dynamic and temporal causal mechanism of multivariate time series data and alleviating the time lags and different value ranges among different machines, CMTN allows the data-driven models to exploit existing data and labels from similar systems, such that the resulting model on a new system is highly reliable even with very limited data. We report our empirical results and lessons learned from two real-world case studies, on chiller plant energy optimization and boiler fault detection, which outperforms the existing state-of-the-art method. △ Less

Submitted 13 October, 2019; originally announced October 2019.

arXiv:1910.06559 [pdf, ps, other]

Approximate Equilibria in Generalized Colonel Blotto and Generalized Lottery Blotto Games

Authors: Dong Quan Vu, Patrick Loiseau, Alonso Silva

Abstract: In the Colonel Blotto game, two players with a fixed budget simultaneously allocate their resources across n battlefields to maximize the aggregate value gained from the battlefields where they have the higher allocation. Despite its long-standing history and important applications, the Colonel Blotto game still lacks a complete Nash equilibrium characterization in its most general form where play… ▽ More In the Colonel Blotto game, two players with a fixed budget simultaneously allocate their resources across n battlefields to maximize the aggregate value gained from the battlefields where they have the higher allocation. Despite its long-standing history and important applications, the Colonel Blotto game still lacks a complete Nash equilibrium characterization in its most general form where players are asymmetric and battlefields' values are heterogeneous across battlefields and different between the two players---this is called the Generalized Colonel Blotto game. In this work, we propose a simply-constructed class of strategies---the independently uniform strategies---and we prove that they are approximate equilibria of the Generalized Colonel Blotto game; moreover, we characterize the approximation error according to the game's parameters. We also consider an extension called the Generalized Lottery Blotto game, with stochastic winner-determination rules allowing more flexibility in modeling practical contests. We prove that the proposed strategies are also approximate equilibria of the Generalized Lottery Blotto game. △ Less

Submitted 3 November, 2020; v1 submitted 15 October, 2019; originally announced October 2019.

arXiv:1909.07227 [pdf, other]

A Convolutional Transformation Network for Malware Classification

Authors: Duc-Ly Vu, Trong-Kha Nguyen, Tam V. Nguyen, Tu N. Nguyen, Fabio Massacci, Phu H. Phung

Abstract: Modern malware evolves various detection avoidance techniques to bypass the state-of-the-art detection methods. An emerging trend to deal with this issue is the combination of image transformation and machine learning techniques to classify and detect malware. However, existing works in this field only perform simple image transformation methods that limit the accuracy of the detection. In this pa… ▽ More Modern malware evolves various detection avoidance techniques to bypass the state-of-the-art detection methods. An emerging trend to deal with this issue is the combination of image transformation and machine learning techniques to classify and detect malware. However, existing works in this field only perform simple image transformation methods that limit the accuracy of the detection. In this paper, we introduce a novel approach to classify malware by using a deep network on images transformed from binary samples. In particular, we first develop a novel hybrid image transformation method to convert binaries into color images that convey the binary semantics. The images are trained by a deep convolutional neural network that later classifies the test inputs into benign or malicious categories. Through the extensive experiments, our proposed method surpasses all baselines and achieves 99.14% in terms of accuracy on the testing set. △ Less

Submitted 16 September, 2019; originally announced September 2019.

Comments: 6 pages, 4 figures

arXiv:1909.04912 [pdf, ps, other]

Combinatorial Bandits for Sequential Learning in Colonel Blotto Games

Authors: Dong Quan Vu, Patrick Loiseau, Alonso Silva

Abstract: The Colonel Blotto game is a renowned resource allocation problem with a long-standing literature in game theory (almost 100 years). However, its scope of application is still restricted by the lack of studies on the incomplete-information situations where a learning model is needed. In this work, we propose and study a regret-minimization model where a learner repeatedly plays the Colonel Blotto… ▽ More The Colonel Blotto game is a renowned resource allocation problem with a long-standing literature in game theory (almost 100 years). However, its scope of application is still restricted by the lack of studies on the incomplete-information situations where a learning model is needed. In this work, we propose and study a regret-minimization model where a learner repeatedly plays the Colonel Blotto game against several adversaries. At each stage, the learner distributes her budget of resources on a fixed number of battlefields to maximize the aggregate value of battlefields she wins; each battlefield being won if there is no adversary that has higher allocation. We focus on the bandit feedback setting. We first show that it can be modeled as a path planning problem. It is then possible to use the classical COMBAND algorithm to guarantee a sub-linear regret in terms of time horizon, but this entails two fundamental challenges: (i) the computation is inefficient due to the huge size of the action set, and (ii) the standard exploration distribution leads to a loose guarantee in practice. To address the first, we construct a modified algorithm that can be efficiently implemented by applying a dynamic programming technique called weight pushing; for the second, we propose methods optimizing the exploration distribution to improve the regret bound. Finally, we implement our proposed algorithm and perform numerical experiments that show the regret improvement in practice. △ Less

Submitted 11 September, 2019; originally announced September 2019.

Journal ref: 58th IEEE Conference on Decision and Control (CDC), Dec 2019, Nice, France

arXiv:1905.11151 [pdf, ps, other]

Path Planning Problems with Side Observations-When Colonels Play Hide-and-Seek

Authors: Dong Quan Vu, Patrick Loiseau, Alonso Silva, Long Tran-Thanh

Abstract: Resource allocation games such as the famous Colonel Blotto (CB) and Hide-and-Seek (HS) games are often used to model a large variety of practical problems, but only in their one-shot versions. Indeed, due to their extremely large strategy space, it remains an open question how one can efficiently learn in these games. In this work, we show that the online CB and HS games can be cast as path plann… ▽ More Resource allocation games such as the famous Colonel Blotto (CB) and Hide-and-Seek (HS) games are often used to model a large variety of practical problems, but only in their one-shot versions. Indeed, due to their extremely large strategy space, it remains an open question how one can efficiently learn in these games. In this work, we show that the online CB and HS games can be cast as path planning problems with side-observations (SOPPP): at each stage, a learner chooses a path on a directed acyclic graph and suffers the sum of losses that are adversarially assigned to the corresponding edges; and she then receives semi-bandit feedback with side-observations (i.e., she observes the losses on the chosen edges plus some others). We propose a novel algorithm, EXP3-OE, the first-of-its-kind with guaranteed efficient running time for SOPPP without requiring any auxiliary oracle. We provide an expected-regret bound of EXP3-OE in SOPPP matching the order of the best benchmark in the literature. Moreover, we introduce additional assumptions on the observability model under which we can further improve the regret bounds of EXP3-OE. We illustrate the benefit of using EXP3-OE in SOPPP by applying it to the online CB and HS games. △ Less

Submitted 21 November, 2019; v1 submitted 27 May, 2019; originally announced May 2019.

Comments: Previously, this work appeared as arXiv:1911.09023 which was mistakenly submitted as a new article (has been submitted to be withdrawn). This is a preprint of the work published in Proceedings of the 34th AAAI Conference on Artificial Intelligence (AAAI)

arXiv:1905.06112 [pdf]

Detecting Vietnamese Opinion Spam

Authors: T. H. H Duong, T. D. Vu, V. M. Ngo

Abstract: Recently, Vietnamese Natural Language Processing has been researched by experts in academic and business. However, the existing papers have been focused only on information classification or extraction from documents. Nowadays, with quickly development of the e-commerce websites, forums and social networks, the products, people, organizations or wonders are targeted of comments or reviews of the n… ▽ More Recently, Vietnamese Natural Language Processing has been researched by experts in academic and business. However, the existing papers have been focused only on information classification or extraction from documents. Nowadays, with quickly development of the e-commerce websites, forums and social networks, the products, people, organizations or wonders are targeted of comments or reviews of the network communities. Many people often use that reviews to make their decision on something. Whereas, there are many people or organizations use the reviews to mislead readers. Therefore, it is so necessary to detect those bad behaviors in reviews. In this paper, we research this problem and propose an appropriate method for detecting Vietnamese reviews being spam or non-spam. The accuracy of our method is up to 90%. △ Less

Submitted 9 May, 2019; originally announced May 2019.

Comments: 6 pages, in Vietnamese

Journal ref: ICTFIT 2012

Showing 1–50 of 61 results for author: Vu, D