Showing 1–50 of 632 results for author: Wu, D

  1. arXiv:2407.02208  [pdf, other]

    cs.CL cs.AI

    How to Learn in a Noisy World? Self-Correcting the Real-World Data Noise on Machine Translation

    Authors: Yan Meng, Di Wu, Christof Monz

    Abstract: The massive amounts of web-mined parallel data contain large amounts of noise. Semantic misalignment, as the primary source of the noise, poses a challenge for training machine translation systems. In this paper, we first study the impact of real-world hard-to-detect misalignment noise by proposing a process to simulate the realistic misalignment controlled by semantic similarity. After quantitati…

    Submitted 2 July, 2024; originally announced July 2024.

  2. arXiv:2407.01511  [pdf, other]

    cs.AI

    CRAB: Cross-environment Agent Benchmark for Multimodal Language Model Agents

    Authors: Tianqi Xu, Linyao Chen, Dai-Jie Wu, Yanjun Chen, Zecheng Zhang, Xiang Yao, Zhiqiang Xie, Yongchao Chen, Shilong Liu, Bochen Qian, Philip Torr, Bernard Ghanem, Guohao Li

    Abstract: The development of autonomous agents increasingly relies on Multimodal Language Models (MLMs) to perform tasks described in natural language with GUI environments, such as websites, desktop computers, or mobile phones. Existing benchmarks for MLM agents in interactive environments are limited by their focus on a single environment, lack of detailed and generalized evaluation methods, and the compl…

    Submitted 1 July, 2024; originally announced July 2024.

  3. arXiv:2407.00610  [pdf, other]

    cs.LG

    Diff-BBO: Diffusion-Based Inverse Modeling for Black-Box Optimization

    Authors: Dongxia Wu, Nikki Lijing Kuang, Ruijia Niu, Yi-An Ma, Rose Yu

    Abstract: Black-box optimization (BBO) aims to optimize an objective function by iteratively querying a black-box oracle. This process demands sample-efficient optimization due to the high computational cost of function evaluations. While prior studies focus on forward approaches to learn surrogates for the unknown objective function, they struggle with high-dimensional inputs where valid inputs form a smal…

    Submitted 30 June, 2024; originally announced July 2024.

  4. arXiv:2407.00377  [pdf, other]

    cs.CL cs.AI cs.CV cs.CY

    The Factuality Tax of Diversity-Intervened Text-to-Image Generation: Benchmark and Fact-Augmented Intervention

    Authors: Yixin Wan, Di Wu, Haoran Wang, Kai-Wei Chang

    Abstract: Prompt-based "diversity interventions" are commonly adopted to improve the diversity of Text-to-Image (T2I) models depicting individuals with various racial or gender traits. However, will this strategy result in nonfactual demographic distribution, especially when generating real historical figures? In this work, we propose DemOgraphic FActualIty Representation (DoFaiR), a benchmark to systematic…

    Submitted 29 June, 2024; originally announced July 2024.

  5. arXiv:2407.00191  [pdf, other]

    cs.CL

    MetaKP: On-Demand Keyphrase Generation

    Authors: Di Wu, Xiaoxian Shen, Kai-Wei Chang

    Abstract: Traditional keyphrase prediction methods predict a single set of keyphrases per document, failing to cater to the diverse needs of users and downstream applications. To bridge the gap, we introduce on-demand keyphrase generation, a novel paradigm that requires keyphrases that conform to specific high-level goals or intents. For this task, we present MetaKP, a large-scale benchmark comprising four…

    Submitted 28 June, 2024; originally announced July 2024.

  6. arXiv:2407.00167  [pdf, other]

    cs.CL cs.AI cs.ET cs.HC cs.SI

    Can GPT-4 Help Detect Quit Vaping Intentions? An Exploration of Automatic Data Annotation Approach

    Authors: Sai Krishna Revanth Vuruma, Dezhi Wu, Saborny Sen Gupta, Lucas Aust, Valerie Lookingbill, Wyatt Bellamy, Yang Ren, Erin Kasson, Li-Shiun Chen, Patricia Cavazos-Rehg, Dian Hu, Ming Huang

    Abstract: In recent years, the United States has witnessed a significant surge in the popularity of vaping or e-cigarette use, leading to a notable rise in cases of e-cigarette and vaping use-associated lung injury (EVALI) that caused hospitalizations and fatalities during the EVALI outbreak in 2019, highlighting the urgency to comprehend vaping behaviors and develop effective strategies for cessation. Due…

    Submitted 28 June, 2024; originally announced July 2024.

    Comments: Accepted for the AI Applications in Public Health and Social Services workshop at the 22nd International Conference on Artificial Intelligence in Medicine (AIME 2024)

  7. arXiv:2406.18137  [pdf, ps, other]

    stat.ML cs.LG

    Sparse deep neural networks for nonparametric estimation in high-dimensional sparse regression

    Authors: Dongya Wu, Xin Li

    Abstract: Generalization theory has been established for sparse deep neural networks under high-dimensional regime. Beyond generalization, parameter estimation is also important since it is crucial for variable selection and interpretability of deep neural networks. Current theoretical studies concerning parameter estimation mainly focus on two-layer neural networks, which is due to the fact that the conver…

    Submitted 26 June, 2024; originally announced June 2024.

  8. arXiv:2406.17456  [pdf, other]

    cs.CL cs.AI

    Improving Grammatical Error Correction via Contextual Data Augmentation

    Authors: Yixuan Wang, Baoxin Wang, Yijun Liu, Qingfu Zhu, Dayong Wu, Wanxiang Che

    Abstract: Nowadays, data augmentation through synthetic data has been widely used in the field of Grammatical Error Correction (GEC) to alleviate the problem of data scarcity. However, these synthetic data are mainly used in the pre-training phase rather than the data-limited fine-tuning phase due to inconsistent error distribution and noisy labels. In this paper, we propose a synthetic data construction me…

    Submitted 25 June, 2024; originally announced June 2024.

    Comments: Accepted as Findings of ACL 2024

  9. arXiv:2406.13692  [pdf, other]

    cs.CL

    Synchronous Faithfulness Monitoring for Trustworthy Retrieval-Augmented Generation

    Authors: Di Wu, Jia-Chen Gu, Fan Yin, Nanyun Peng, Kai-Wei Chang

    Abstract: Retrieval-augmented language models (RALMs) have shown strong performance and wide applicability in knowledge-intensive tasks. However, there are significant trustworthiness concerns as RALMs are prone to generating unfaithful outputs, including baseless information or contradictions with the retrieved context. This paper proposes SynCheck, a lightweight monitor that leverages fine-grained decodin…

    Submitted 19 June, 2024; originally announced June 2024.

  10. arXiv:2406.12783  [pdf, ps, other]

    cs.NE cs.DC eess.SY math.NA

    Zeroing neural dynamics solving time-variant complex conjugate matrix equation

    Authors: Jiakuang He, Dongqing Wu

    Abstract: Complex conjugate matrix equations (CCME) have aroused the interest of many researchers because of computations and antilinear systems. Existing research is dominated by its time-invariant solving methods, but lacks proposed theories for solving its time-variant version. Moreover, artificial neural networks are rarely studied for solving CCME. In this paper, starting with the earliest CCME, zeroin…

    Submitted 18 June, 2024; originally announced June 2024.

  11. arXiv:2406.11828  [pdf, other]

    cs.LG stat.ML

    Learning sum of diverse features: computational hardness and efficient gradient-based training for ridge combinations

    Authors: Kazusato Oko, Yujin Song, Taiji Suzuki, Denny Wu

    Abstract: We study the computational and sample complexity of learning a target function $f_*:\mathbb{R}^d\to\mathbb{R}$ with additive structure, that is, $f_*(x) = \frac{1}{\sqrt{M}}\sum_{m=1}^M f_m(\langle x, v_m\rangle)$, where $f_1,f_2,...,f_M:\mathbb{R}\to\mathbb{R}$ are nonlinear link functions of single-index models (ridge functions) with diverse and near-orthogonal index features $\{v_m\}_{m=1}^M$,…

    Submitted 17 June, 2024; originally announced June 2024.

    Comments: COLT 2024

  12. arXiv:2406.11551  [pdf, other]

    cs.CV

    Simple Yet Efficient: Towards Self-Supervised FG-SBIR with Unified Sample Feature Alignment

    Authors: Jianan Jiang, Di Wu, Zhilin Jiang, Weiren Yu

    Abstract: Fine-Grained Sketch-Based Image Retrieval (FG-SBIR) aims to minimize the distance between sketches and corresponding images in the embedding space. However, scalability is hindered by the growing complexity of solutions, mainly due to the abstract nature of fine-grained sketches. In this paper, we propose a simple yet efficient approach to narrow the gap between the two modes. It mainly facilitate…

    Submitted 22 June, 2024; v1 submitted 17 June, 2024; originally announced June 2024.

    Comments: 10 pages,8 figures, 4 tables

  13. arXiv:2406.09829  [pdf, other]

    cs.CV

    Open-Vocabulary Semantic Segmentation with Image Embedding Balancing

    Authors: Xiangheng Shan, Dongyue Wu, Guilin Zhu, Yuanjie Shao, Nong Sang, Changxin Gao

    Abstract: Open-vocabulary semantic segmentation is a challenging task, which requires the model to output semantic masks of an image beyond a close-set vocabulary. Although many efforts have been made to utilize powerful CLIP models to accomplish this task, they are still easily overfitting to training classes due to the natural gaps in semantic information between training and new classes. To overcome this…

    Submitted 14 June, 2024; originally announced June 2024.

    Comments: CVPR2024

  14. arXiv:2406.07880  [pdf, other]

    cs.CV eess.IV

    A Comprehensive Survey on Machine Learning Driven Material Defect Detection: Challenges, Solutions, and Future Prospects

    Authors: Jun Bai, Di Wu, Tristan Shelley, Peter Schubel, David Twine, John Russell, Xuesen Zeng, Ji Zhang

    Abstract: Material defects (MD) represent a primary challenge affecting product performance and giving rise to safety issues in related products. The rapid and accurate identification and localization of MD constitute crucial research endeavours in addressing contemporary challenges associated with MD. Although conventional non-destructive testing methods such as ultrasonic and X-ray approaches have mitigat…

    Submitted 12 June, 2024; originally announced June 2024.

  15. arXiv:2406.05498  [pdf, other]

    cs.CR cs.AI

    SelfDefend: LLMs Can Defend Themselves against Jailbreaking in a Practical Manner

    Authors: Xunguang Wang, Daoyuan Wu, Zhenlan Ji, Zongjie Li, Pingchuan Ma, Shuai Wang, Yingjiu Li, Yang Liu, Ning Liu, Juergen Rahmel

    Abstract: Jailbreaking is an emerging adversarial attack that bypasses the safety alignment deployed in off-the-shelf large language models (LLMs) and has evolved into four major categories: optimization-based attacks such as Greedy Coordinate Gradient (GCG), jailbreak template-based attacks such as "Do-Anything-Now", advanced indirect attacks like DrAttack, and multilingual jailbreaks. However, delivering…

    Submitted 8 June, 2024; originally announced June 2024.

    Comments: This paper completes its earlier vision paper, available at arXiv:2402.15727

  16. arXiv:2406.05039  [pdf, other]

    cs.CV cs.CL

    Bootstrapping Referring Multi-Object Tracking

    Authors: Yani Zhang, Dongming Wu, Wencheng Han, Xingping Dong

    Abstract: Referring multi-object tracking (RMOT) aims at detecting and tracking multiple objects following human instruction represented by a natural language expression. Existing RMOT benchmarks are usually formulated through manual annotations, integrated with static regulations. This approach results in a dearth of notable diversity and a constrained scope of implementation. In this work, our key idea is…

    Submitted 7 June, 2024; originally announced June 2024.

  17. arXiv:2406.02059  [pdf, other]

    cs.LG

    Graph Adversarial Diffusion Convolution

    Authors: Songtao Liu, Jinghui Chen, Tianfan Fu, Lu Lin, Marinka Zitnik, Dinghao Wu

    Abstract: This paper introduces a min-max optimization formulation for the Graph Signal Denoising (GSD) problem. In this formulation, we first maximize the second term of GSD by introducing perturbations to the graph structure based on Laplacian distance and then minimize the overall loss of the GSD. By solving the min-max optimization problem, we derive a new variant of the Graph Diffusion Convolution (GDC…

    Submitted 4 June, 2024; originally announced June 2024.

    Comments: Accepted by ICML 2024

  18. arXiv:2406.01581  [pdf, other]

    cs.LG stat.ML

    Neural network learns low-dimensional polynomials with SGD near the information-theoretic limit

    Authors: Jason D. Lee, Kazusato Oko, Taiji Suzuki, Denny Wu

    Abstract: We study the problem of gradient descent learning of a single-index target function $f_*(\boldsymbol{x}) = \textstyle\sigma_*\left(\langle\boldsymbol{x},\boldsymbol{\theta}\rangle\right)$ under isotropic Gaussian data in $\mathbb{R}^d$, where the link function $\sigma_*:\mathbb{R}\to\mathbb{R}$ is an unknown degree $q$ polynomial with information exponent $p$ (defined as the lowest degree in the Hermite expansion)…

    Submitted 3 June, 2024; originally announced June 2024.

    Comments: 34 pages

  19. arXiv:2406.00714  [pdf, other]

    cs.CV

    A Survey of Deep Learning Based Radar and Vision Fusion for 3D Object Detection in Autonomous Driving

    Authors: Di Wu, Feng Yang, Benlian Xu, Pan Liao, Bo Liu

    Abstract: With the rapid advancement of autonomous driving technology, there is a growing need for enhanced safety and efficiency in the automatic environmental perception of vehicles during their operation. In modern vehicle setups, cameras and mmWave radar (radar), being the most extensively employed sensors, demonstrate complementary characteristics, inherently rendering them conducive to fusion and faci…

    Submitted 2 June, 2024; originally announced June 2024.

  20. arXiv:2406.00645  [pdf, other]

    cs.LG cs.AI cs.CV

    FuRL: Visual-Language Models as Fuzzy Rewards for Reinforcement Learning

    Authors: Yuwei Fu, Haichao Zhang, Di Wu, Wei Xu, Benoit Boulet

    Abstract: In this work, we investigate how to leverage pre-trained visual-language models (VLM) for online Reinforcement Learning (RL). In particular, we focus on sparse reward tasks with pre-defined textual task descriptions. We first identify the problem of reward misalignment when applying VLM as a reward in RL tasks. To address this issue, we introduce a lightweight fine-tuning method, named Fuzzy VLM r…

    Submitted 4 June, 2024; v1 submitted 2 June, 2024; originally announced June 2024.

    Comments: ICML 2024

  21. arXiv:2406.00262  [pdf, other]

    cs.LG cs.AI

    Contrastive Learning Via Equivariant Representation

    Authors: Sifan Song, Jinfeng Wang, Qiaochu Zhao, Xiang Li, Dufan Wu, Angelos Stefanidis, Jionglong Su, S. Kevin Zhou, Quanzheng Li

    Abstract: Invariant-based Contrastive Learning (ICL) methods have achieved impressive performance across various domains. However, the absence of latent space representation for distortion (augmentation)-related information in the latent space makes ICL sub-optimal regarding training efficiency and robustness in downstream tasks. Recent studies suggest that introducing equivariance into Contrastive Learning…

    Submitted 31 May, 2024; originally announced June 2024.

    Comments: Preprint. Under review

  22. arXiv:2405.20849  [pdf, ps, other]

    cs.DS math.PR

    Locally Stationary Distributions: A Framework for Analyzing Slow-Mixing Markov Chains

    Authors: Kuikui Liu, Sidhanth Mohanty, Prasad Raghavendra, Amit Rajaraman, David X. Wu

    Abstract: Many natural Markov chains fail to mix to their stationary distribution in polynomially many steps. Often, this slow mixing is inevitable since it is computationally intractable to sample from their stationary measure. Nevertheless, Markov chains can be shown to always converge quickly to measures that are *locally stationary*, i.e., measures that don't change over a small number of steps. These…

    Submitted 31 May, 2024; originally announced May 2024.

    Comments: 34 pages

  23. arXiv:2405.20614  [pdf, other]

    cs.CV

    EPIDetect: Video-based convulsive seizure detection in chronic epilepsy mouse model for anti-epilepsy drug screening

    Authors: Junming Ren, Zhoujian Xiao, Yujia Zhang, Yujie Yang, Ling He, Ezra Yoon, Stephen Temitayo Bello, Xi Chen, Dapeng Wu, Micky Tortorella, Jufang He

    Abstract: In the preclinical translational studies, drug candidates with remarkable anti-epileptic efficacy demonstrate long-term suppression of spontaneous recurrent seizures (SRSs), particularly convulsive seizures (CSs), in mouse models of chronic epilepsy. However, the current methods for monitoring CSs have limitations in terms of invasiveness, specific laboratory settings, high cost, and complex opera…

    Submitted 31 May, 2024; originally announced May 2024.

  24. arXiv:2405.20584  [pdf, other]

    cs.CV cs.AI

    Disrupting Diffusion: Token-Level Attention Erasure Attack against Diffusion-based Customization

    Authors: Yisu Liu, Jinyang An, Wanqian Zhang, Dayan Wu, Jingzi Gu, Zheng Lin, Weiping Wang

    Abstract: With the development of diffusion-based customization methods like DreamBooth, individuals now have access to train the models that can generate their personalized images. Despite the convenience, malicious users have misused these techniques to create fake images, thereby triggering a privacy security crisis. In light of this, proactive adversarial attacks are proposed to protect users against cu…

    Submitted 30 May, 2024; originally announced May 2024.

    Comments: Under review

    ACM Class: I.2.10

  25. arXiv:2405.19630  [pdf]

    cs.RO

    The use of a humanoid robot for older people with dementia in aged care facilities

    Authors: Dongjun Wu, Lihui Pu, Jun Jo, Rene Hexel, Wendy Moyle

    Abstract: This paper presents an interdisciplinary PhD project using a humanoid robot to encourage interactive activities for people with dementia living in two aged care facilities. The aim of the project was to develop software and use technologies to achieve successful robot-led engagement with older people with dementia. This paper outlines the qualitative findings from the project's feasibility stage.…

    Submitted 29 May, 2024; originally announced May 2024.

    Comments: Accepted for the Second Workshop on Care Robots for Older Adults (CROA), RO-MAN 2023, Busan, Korea

  26. arXiv:2405.18361  [pdf, other]

    cs.CV

    Is a 3D-Tokenized LLM the Key to Reliable Autonomous Driving?

    Authors: Yifan Bai, Dongming Wu, Yingfei Liu, Fan Jia, Weixin Mao, Ziheng Zhang, Yucheng Zhao, Jianbing Shen, Xing Wei, Tiancai Wang, Xiangyu Zhang

    Abstract: Rapid advancements in Autonomous Driving (AD) tasks turned a significant shift toward end-to-end fashion, particularly in the utilization of vision-language models (VLMs) that integrate robust logical reasoning and cognitive abilities to enable comprehensive end-to-end planning. However, these VLM-based approaches tend to integrate 2D vision tokenizers and a large language model (LLM) for ego-car…

    Submitted 28 May, 2024; originally announced May 2024.

  27. arXiv:2405.16789  [pdf, other]

    cs.IR

    NoteLLM-2: Multimodal Large Representation Models for Recommendation

    Authors: Chao Zhang, Haoxin Zhang, Shiwei Wu, Di Wu, Tong Xu, Yan Gao, Yao Hu, Enhong Chen

    Abstract: Large Language Models (LLMs) have demonstrated exceptional text understanding. Existing works explore their application in text embedding tasks. However, there are few works utilizing LLMs to assist multimodal representation tasks. In this work, we investigate the potential of LLMs to enhance multimodal representation in multimodal item-to-item (I2I) recommendations. One feasible method is the tra…

    Submitted 26 May, 2024; originally announced May 2024.

    Comments: 19 pages, 5 figures

  28. arXiv:2405.15176  [pdf, other]

    cs.CV

    MonoDETRNext: Next-generation Accurate and Efficient Monocular 3D Object Detection Method

    Authors: Pan Liao, Feng Yang, Di Wu, Liu Bo

    Abstract: Monocular vision-based 3D object detection is crucial in various sectors, yet existing methods face significant challenges in terms of accuracy and computational efficiency. Building on the successful strategies in 2D detection and depth estimation, we propose MonoDETRNext, which seeks to optimally balance precision and processing speed. Our methodology includes the development of an efficient hyb…

    Submitted 23 May, 2024; originally announced May 2024.

  29. arXiv:2405.14691  [pdf, other]

    cs.AI cs.MA

    CityGPT: Towards Urban IoT Learning, Analysis and Interaction with Multi-Agent System

    Authors: Qinghua Guan, Jinhui Ouyang, Di Wu, Weiren Yu

    Abstract: The spatiotemporal data generated by massive sensors in the Internet of Things (IoT) is extremely dynamic, heterogeneous, large scale and time-dependent. It poses great challenges (e.g. accuracy, reliability, and stability) in real-time analysis and decision making for different IoT applications. The complexity of IoT data prevents the common people from gaining a deeper understanding of it. Agent…

    Submitted 23 May, 2024; originally announced May 2024.

  30. arXiv:2405.10825  [pdf, other]

    eess.SY cs.LG

    Large Language Model (LLM) for Telecommunications: A Comprehensive Survey on Principles, Key Techniques, and Opportunities

    Authors: Hao Zhou, Chengming Hu, Ye Yuan, Yufei Cui, Yili Jin, Can Chen, Haolun Wu, Dun Yuan, Li Jiang, Di Wu, Xue Liu, Charlie Zhang, Xianbin Wang, Jiangchuan Liu

    Abstract: Large language models (LLMs) have received considerable attention recently due to their outstanding comprehension and reasoning capabilities, leading to great progress in many fields. The advancement of LLM techniques also offers promising opportunities to automate many tasks in the telecommunication (telecom) field. After pre-training and fine-tuning, LLMs can perform diverse downstream tasks bas…

    Submitted 17 May, 2024; originally announced May 2024.

  31. arXiv:2405.10812  [pdf, other]

    q-bio.GN cs.AI

    VQDNA: Unleashing the Power of Vector Quantization for Multi-Species Genomic Sequence Modeling

    Authors: Siyuan Li, Zedong Wang, Zicheng Liu, Di Wu, Cheng Tan, Jiangbin Zheng, Yufei Huang, Stan Z. Li

    Abstract: Similar to natural language models, pre-trained genome language models are proposed to capture the underlying intricacies within genomes with unsupervised sequence modeling. They have become essential tools for researchers and practitioners in biology. However, the hand-crafted tokenization policies used in these models may not encode the most discriminative patterns from the limited vocabulary of…

    Submitted 2 June, 2024; v1 submitted 13 May, 2024; originally announced May 2024.

    Comments: ICML 2024. Preprint V2 with 17 pages and 5 figures

  32. arXiv:2405.07744  [pdf, other]

    cs.SE

    MoCo: Fuzzing Deep Learning Libraries via Assembling Code

    Authors: Pin Ji, Yang Feng, Duo Wu, Lingyue Yan, Pengling Chen, Jia Liu, Zhihong Zhao

    Abstract: The rapidly developing deep learning (DL) techniques have been applied in software systems with various application scenarios. However, they could also pose new safety threats with potentially serious consequences, especially in safety-critical domains. DL libraries serve as the underlying foundation for DL systems, and bugs in them can have unpredictable impacts that directly affect the behaviors…

    Submitted 13 May, 2024; originally announced May 2024.

  33. arXiv:2405.06616  [pdf, ps, other]

    math.PR cs.DS math.CO

    Fast Mixing in Sparse Random Ising Models

    Authors: Kuikui Liu, Sidhanth Mohanty, Amit Rajaraman, David X. Wu

    Abstract: Motivated by the community detection problem in Bayesian inference, as well as the recent explosion of interest in spin glasses from statistical physics, we study the classical Glauber dynamics for sampling from Ising models with sparse random interactions. It is now well-known that when the interaction matrix has spectral diameter less than $1$, Glauber dynamics mixes in $O(n\log n)$ steps. Unfor…

    Submitted 10 May, 2024; originally announced May 2024.

    Comments: 66 pages, 4 figures

  34. arXiv:2405.05985  [pdf, other]

    cs.LG cs.AI

    TrafficGPT: Towards Multi-Scale Traffic Analysis and Generation with Spatial-Temporal Agent Framework

    Authors: Jinhui Ouyang, Yijie Zhu, Xiang Yuan, Di Wu

    Abstract: The precise prediction of multi-scale traffic is a ubiquitous challenge in the urbanization process for car owners, road administrators, and governments. In the case of complex road networks, current and past traffic information from both upstream and downstream roads are crucial since various road networks have different semantic information about traffic. Rationalizing the utilization of semanti…

    Submitted 8 May, 2024; originally announced May 2024.

  35. arXiv:2405.02580  [pdf, other]

    cs.SE cs.AI

    PropertyGPT: LLM-driven Formal Verification of Smart Contracts through Retrieval-Augmented Property Generation

    Authors: Ye Liu, Yue Xue, Daoyuan Wu, Yuqiang Sun, Yi Li, Miaolei Shi, Yang Liu

    Abstract: With recent advances in large language models (LLMs), this paper explores the potential of leveraging state-of-the-art LLMs, such as GPT-4, to transfer existing human-written properties (e.g., those from Certora auditing reports) and automatically generate customized properties for unknown code. To this end, we embed existing properties into a vector database and retrieve a reference property for…

    Submitted 4 May, 2024; originally announced May 2024.

  36. arXiv:2405.01844  [pdf, other]

    cs.NI cs.CR cs.DC

    A Survey on Privacy-Preserving Caching at Network Edge: Classification, Solutions, and Challenges

    Authors: Xianzhi Zhang, Yipeng Zhou, Di Wu, Shazia Riaz, Quan Z. Sheng, Miao Hu, Linchang Xiao

    Abstract: Caching content at the network edge is a popular and effective technique widely deployed to alleviate the burden of network backhaul, shorten service delay and improve service quality. However, there has been some controversy over privacy violations in caching content at the network edge. On the one hand, the multi-access open edge network provides an ideal surface for external attackers to obtain…

    Submitted 3 May, 2024; originally announced May 2024.

  37. arXiv:2405.00699  [pdf, other]

    cs.NE cs.AI cs.LG

    Direct Training Needs Regularisation: Anytime Optimal Inference Spiking Neural Network

    Authors: Dengyu Wu, Yi Qi, Kaiwen Cai, Gaojie Jin, Xinping Yi, Xiaowei Huang

    Abstract: Spiking Neural Network (SNN) is acknowledged as the next generation of Artificial Neural Network (ANN) and hold great promise in effectively processing spatial-temporal information. However, the choice of timestep becomes crucial as it significantly impacts the accuracy of the neural network training. Specifically, a smaller timestep indicates better performance in efficient computing, resulting i…

    Submitted 15 April, 2024; originally announced May 2024.

  38. arXiv:2404.17900  [pdf, other]

    cs.CV

    Unsupervised Anomaly Detection via Masked Diffusion Posterior Sampling

    Authors: Di Wu, Shicai Fan, Xue Zhou, Li Yu, Yuzhong Deng, Jianxiao Zou, Baihong Lin

    Abstract: Reconstruction-based methods have been commonly used for unsupervised anomaly detection, in which a normal image is reconstructed and compared with the given test image to detect and locate anomalies. Recently, diffusion models have shown promising applications for anomaly detection due to their powerful generative ability. However, these models lack strict mathematical support for normal image re…

    Submitted 27 April, 2024; originally announced April 2024.

    Journal ref: International Joint Conference on Artificial Intelligence 2024

  39. arXiv:2404.17833  [pdf, other]

    cs.AI cs.PL

    Testing and Understanding Erroneous Planning in LLM Agents through Synthesized User Inputs

    Authors: Zhenlan Ji, Daoyuan Wu, Pingchuan Ma, Zongjie Li, Shuai Wang

    Abstract: Agents based on large language models (LLMs) have demonstrated effectiveness in solving a wide range of tasks by integrating LLMs with key modules such as planning, memory, and tool usage. Increasingly, customers are adopting LLM agents across a variety of commercial applications critical to reliability, including support for mental well-being, chemical synthesis, and software development. Neverth…

    Submitted 27 April, 2024; originally announced April 2024.

  40. arXiv:2404.17607  [pdf, other]

    cs.IR cs.AI cs.CL cs.LG cs.SI

    Utilizing Large Language Models to Identify Reddit Users Considering Vaping Cessation for Digital Interventions

    Authors: Sai Krishna Revanth Vuruma, Dezhi Wu, Saborny Sen Gupta, Lucas Aust, Valerie Lookingbill, Caleb Henry, Yang Ren, Erin Kasson, Li-Shiun Chen, Patricia Cavazos-Rehg, Dian Hu, Ming Huang

    Abstract: The widespread adoption of social media platforms globally not only enhances users' connectivity and communication but also emerges as a vital channel for the dissemination of health-related information, thereby establishing social media data as an invaluable organic data resource for public health research. The surge in popularity of vaping or e-cigarette use in the United States and other countr…

    Submitted 25 April, 2024; originally announced April 2024.

  41. arXiv:2404.16407  [pdf, other]

    cs.CL eess.AS

    U2++ MoE: Scaling 4.7x parameters with minimal impact on RTF

    Authors: Xingchen Song, Di Wu, Binbin Zhang, Dinghao Zhou, Zhendong Peng, Bo Dang, Fuping Pan, Chao Yang

    Abstract: Scale has opened new frontiers in natural language processing, but at a high cost. In response, by learning to only activate a subset of parameters in training and inference, Mixture-of-Experts (MoE) have been proposed as an energy efficient path to even larger and more capable language models and this shift towards a new generation of foundation models is gaining momentum, particularly within the…

    Submitted 25 April, 2024; originally announced April 2024.

    ACM Class: I.2.7

  42. arXiv:2404.14061  [pdf, other]

    cs.LG cs.AI cs.DB cs.SI

    FedTAD: Topology-aware Data-free Knowledge Distillation for Subgraph Federated Learning

    Authors: Yinlin Zhu, Xunkai Li, Zhengyu Wu, Di Wu, Miao Hu, Rong-Hua Li

    Abstract: Subgraph federated learning (subgraph-FL) is a new distributed paradigm that facilitates the collaborative training of graph neural networks (GNNs) by multi-client subgraphs. Unfortunately, a significant challenge of subgraph-FL arises from subgraph heterogeneity, which stems from node and topology variation, causing the impaired performance of the global GNN. Despite various studies, they have no…

    Submitted 25 April, 2024; v1 submitted 22 April, 2024; originally announced April 2024.

    Comments: Accepted by IJCAI 2024

  43. arXiv:2404.13945  [pdf, other]

    cs.SE

    How do LLMs Support Deep Learning Testing? A Comprehensive Study Through the Lens of Image Mutation

    Authors: Liwen Wang, Yuanyuan Yuan, Ao Sun, Zongjie Li, Pingchuan Ma, Daoyuan Wu, Shuai Wang

    Abstract: Visual deep learning (VDL) systems have shown significant success in real-world applications like image recognition, object detection, and autonomous driving. To evaluate the reliability of VDL, a mainstream approach is software testing, which requires diverse and controllable mutations over image semantics. The rapid development of multi-modal large language models (MLLMs) has introduced revoluti…

    Submitted 5 May, 2024; v1 submitted 22 April, 2024; originally announced April 2024.

  44. arXiv:2404.12256  [pdf, other]

    cs.RO cs.AI cs.LG

    An Online Spatial-Temporal Graph Trajectory Planner for Autonomous Vehicles

    Authors: Jilan Samiuddin, Benoit Boulet, Di Wu

    Abstract: The autonomous driving industry is expected to grow more than 20-fold in the coming decade, motivating researchers to delve into it. The primary focus of their research is to ensure safety, comfort, and efficiency. An autonomous vehicle has several modules responsible for one or more of the aforementioned items. Among these modules, the trajectory planner plays a pivotal role in the safety…

    Submitted 18 April, 2024; originally announced April 2024.

    Comments: This is the accepted version and published in the "Early Access" area of IEEE Xplore for the IEEE Transactions on Intelligent Vehicles on 16 April 2024. Article statistics: 11 pages, 9 figures, 2 tables

  45. arXiv:2404.11201  [pdf, other]

    cs.CL

    Neuron Specialization: Leveraging intrinsic task modularity for multilingual machine translation

    Authors: Shaomu Tan, Di Wu, Christof Monz

    Abstract: Training a unified multilingual model promotes knowledge transfer but inevitably introduces negative interference. Language-specific modeling methods show promise in reducing interference. However, they often rely on heuristics to distribute capacity and struggle to foster cross-lingual transfer via isolated modules. In this paper, we explore intrinsic task modularity within multilingual networks…

    Submitted 17 April, 2024; originally announced April 2024.

  46. arXiv:2404.03900  [pdf, other]

    stat.ML cs.AI cs.LG cs.NE

    Nonparametric Modern Hopfield Models

    Authors: Jerry Yao-Chieh Hu, Bo-Yu Chen, Dennis Wu, Feng Ruan, Han Liu

    Abstract: We present a nonparametric construction for deep learning compatible modern Hopfield models and utilize this framework to debut an efficient variant. Our key contribution stems from interpreting the memory storage and retrieval processes in modern Hopfield models as a nonparametric regression problem subject to a set of query-memory pairs. Crucially, our framework not only recovers the known resul…

    Submitted 5 April, 2024; originally announced April 2024.

    Comments: 59 pages; Code available at https://github.com/MAGICS-LAB/NonparametricHopfield

  47. arXiv:2404.03827  [pdf, other]

    cs.LG cs.AI stat.ML

    Uniform Memory Retrieval with Larger Capacity for Modern Hopfield Models

    Authors: Dennis Wu, Jerry Yao-Chieh Hu, Teng-Yun Hsiao, Han Liu

    Abstract: We propose a two-stage memory retrieval dynamics for modern Hopfield models, termed $\mathtt{U\text{-}Hop}$, with enhanced memory capacity. Our key contribution is a learnable feature map $\Phi$ which transforms the Hopfield energy function into kernel space. This transformation ensures convergence between the local minima of energy and the fixed points of retrieval dynamics within the kernel space.…

    Submitted 12 June, 2024; v1 submitted 4 April, 2024; originally announced April 2024.

    Comments: Accepted at ICML 2024; v2 updated to camera-ready version; Code available at https://github.com/MAGICS-LAB/UHop

  48. arXiv:2403.17413  [pdf, other]

    cs.CL

    LM-Combiner: A Contextual Rewriting Model for Chinese Grammatical Error Correction

    Authors: Yixuan Wang, Baoxin Wang, Yijun Liu, Dayong Wu, Wanxiang Che

    Abstract: Over-correction is a critical problem in the Chinese grammatical error correction (CGEC) task. Recent work using model ensemble methods based on voting can effectively mitigate over-correction and improve the precision of the GEC system. However, these methods still require the output of several GEC systems and inevitably lead to reduced error recall. In this light, we propose the LM-Combiner, a rewri…

    Submitted 26 March, 2024; originally announced March 2024.

    Comments: Accepted to COLING 2024

  49. arXiv:2403.17312  [pdf, other]

    cs.AI cs.LG cs.PF

    ALISA: Accelerating Large Language Model Inference via Sparsity-Aware KV Caching

    Authors: Youpeng Zhao, Di Wu, Jun Wang

    Abstract: The Transformer architecture has significantly advanced natural language processing (NLP) and has been foundational in developing large language models (LLMs) such as LLaMA and OPT, which have come to dominate a broad range of NLP tasks. Despite their superior accuracy, LLMs present unique challenges in practical inference, concerning the compute and memory-intensive nature. Thanks to the autoregr…

    Submitted 25 March, 2024; originally announced March 2024.

    Comments: ISCA 2024

  50. arXiv:2403.16073  [pdf, other]

    cs.SE

    Combining Fine-Tuning and LLM-based Agents for Intuitive Smart Contract Auditing with Justifications

    Authors: Wei Ma, Daoyuan Wu, Yuqiang Sun, Tianwen Wang, Shangqing Liu, Jian Zhang, Yue Xue, Yang Liu

    Abstract: Smart contracts are decentralized applications built atop blockchains like Ethereum. Recent research has shown that large language models (LLMs) have potential in auditing smart contracts, but the state-of-the-art indicates that even GPT-4 can achieve only 30% precision (when both decision and justification are correct). This is likely because off-the-shelf LLMs were primarily pre-trained on a gen…

    Submitted 24 March, 2024; originally announced March 2024.