Showing 1–50 of 320 results for author: Pan, L

  1. arXiv:2407.02777  [pdf, other]

    cs.RO

    Hierarchical Large Scale Multirobot Path (Re)Planning

    Authors: Lishuo Pan, Kevin Hsu, Nora Ayanian

    Abstract: We consider a large-scale multi-robot path planning problem in a cluttered environment. Our approach achieves real-time replanning by dividing the workspace into cells and utilizing a hierarchical planner. Specifically, multi-commodity flow-based high-level planners route robots through the cells to reduce congestion, while an anytime low-level planner computes collision-free paths for robots with…

    Submitted 2 July, 2024; originally announced July 2024.

    Comments: 8 pages, 7 figures, 1 table. Accepted by IROS2024

  2. arXiv:2407.01523  [pdf, other]

    cs.CV cs.CL

    MMLongBench-Doc: Benchmarking Long-context Document Understanding with Visualizations

    Authors: Yubo Ma, Yuhang Zang, Liangyu Chen, Meiqi Chen, Yizhu Jiao, Xinze Li, Xinyuan Lu, Ziyu Liu, Yan Ma, Xiaoyi Dong, Pan Zhang, Liangming Pan, Yu-Gang Jiang, Jiaqi Wang, Yixin Cao, Aixin Sun

    Abstract: Understanding documents with rich layouts and multi-modal components is a long-standing and practical task. Recent Large Vision-Language Models (LVLMs) have made remarkable strides in various tasks, particularly in single-page document understanding (DU). However, their abilities on long-context DU remain an open problem. This work presents MMLongBench-Doc, a long-context, multi-modal benchmark co…

    Submitted 1 July, 2024; originally announced July 2024.

  3. arXiv:2406.19215  [pdf, other]

    cs.CL

    SeaKR: Self-aware Knowledge Retrieval for Adaptive Retrieval Augmented Generation

    Authors: Zijun Yao, Weijian Qi, Liangming Pan, Shulin Cao, Linmei Hu, Weichuan Liu, Lei Hou, Juanzi Li

    Abstract: This paper introduces Self-aware Knowledge Retrieval (SeaKR), a novel adaptive RAG model that extracts self-aware uncertainty of LLMs from their internal states. SeaKR activates retrieval when the LLMs present high self-aware uncertainty for generation. To effectively integrate retrieved knowledge snippets, SeaKR re-ranks them based on LLM's self-aware uncertainty to preserve the snippet that redu…

    Submitted 27 June, 2024; originally announced June 2024.

  4. arXiv:2406.14867  [pdf, other]

    cs.LG cs.AI cs.CL

    DistiLRR: Transferring Code Repair for Low-Resource Programming Languages

    Authors: Kyle Wong, Alfonso Amayuelas, Liangming Pan, William Yang Wang

    Abstract: Large language models (LLMs) have shown remarkable performance on code generation tasks. A recent application of LLMs for code generation is iterative code repair, where a model fixes an incorrect program by rationalizing about errors and generating a new program. However, code repair is primarily studied on high-resource languages like Python, and the framework's efficacy is under-explored on low…

    Submitted 21 June, 2024; originally announced June 2024.

  5. arXiv:2406.14711  [pdf, other]

    cs.CL cs.AI cs.MA

    MultiAgent Collaboration Attack: Investigating Adversarial Attacks in Large Language Model Collaborations via Debate

    Authors: Alfonso Amayuelas, Xianjun Yang, Antonis Antoniades, Wenyue Hua, Liangming Pan, William Wang

    Abstract: Large Language Models (LLMs) have shown exceptional results on current benchmarks when working individually. The advancement in their capabilities, along with a reduction in parameter size and inference times, has facilitated the use of these models as agents, enabling interactions among multiple models to execute complex tasks. Such collaborations offer several advantages, including the use of sp…

    Submitted 26 June, 2024; v1 submitted 20 June, 2024; originally announced June 2024.

  6. arXiv:2406.12376  [pdf, other]

    cs.CR

    DCS Chain: A Flexible Private Blockchain System

    Authors: Jianwu Zheng, Siyuan Zhao, Zheng Wang, Li Pan, Jianhua Li

    Abstract: Blockchain technology has seen tremendous development over the past few years. Despite the emergence of numerous blockchain systems, they all suffer from various limitations, which can all be attributed to the fundamental issue posed by the DCS trilemma. In light of this, this work introduces a novel private blockchain system named DCS Chain. The core idea is to quantify the DCS metrics and dynami…

    Submitted 19 June, 2024; v1 submitted 18 June, 2024; originally announced June 2024.

  7. arXiv:2406.11389  [pdf, other]

    cs.LG

    SEFraud: Graph-based Self-Explainable Fraud Detection via Interpretative Mask Learning

    Authors: Kaidi Li, Tianmeng Yang, Min Zhou, Jiahao Meng, Shendi Wang, Yihui Wu, Boshuai Tan, Hu Song, Lujia Pan, Fan Yu, Zhenli Sheng, Yunhai Tong

    Abstract: Graph-based fraud detection has widespread application in modern industry scenarios, such as spam review and malicious account detection. While considerable efforts have been devoted to designing adequate fraud detectors, the interpretability of their results has often been overlooked. Previous works have attempted to generate explanations for specific instances using post-hoc explaining methods s…

    Submitted 17 June, 2024; originally announced June 2024.

    Comments: Accepted by KDD 2024

  8. arXiv:2406.10870  [pdf, other]

    cs.CL

    COOL: Comprehensive Knowledge Enhanced Prompt Learning for Domain Adaptive Few-shot Fake News Detection

    Authors: Yi Ouyang, Peng Wu, Li Pan

    Abstract: Most Fake News Detection (FND) methods often struggle with data scarcity for emerging news domain. Recently, prompt learning based on Pre-trained Language Models (PLM) has emerged as a promising approach in domain adaptive few-shot learning, since it greatly reduces the need for labeled data by bridging the gap between pre-training and downstream task. Furthermore, external knowledge is also helpf…

    Submitted 16 June, 2024; originally announced June 2024.

  9. arXiv:2406.02213  [pdf, other]

    cs.LG

    Rectifying Reinforcement Learning for Reward Matching

    Authors: Haoran He, Emmanuel Bengio, Qingpeng Cai, Ling Pan

    Abstract: The Generative Flow Network (GFlowNet) is a probabilistic framework in which an agent learns a stochastic policy and flow functions to sample objects with probability proportional to an unnormalized reward function. GFlowNets share a strong resemblance to reinforcement learning (RL), that typically aims to maximize reward, due to their sequential decision-making processes. Recent works have studie…

    Submitted 4 June, 2024; originally announced June 2024.

  10. arXiv:2406.01901  [pdf, other]

    cs.LG

    Bifurcated Generative Flow Networks

    Authors: Chunhui Li, Cheng-Hao Liu, Dianbo Liu, Qingpeng Cai, Ling Pan

    Abstract: Generative Flow Networks (GFlowNets), a new family of probabilistic samplers, have recently emerged as a promising framework for learning stochastic policies that generate high-quality and diverse objects proportionally to their rewards. However, existing GFlowNets often suffer from low data efficiency due to the direct parameterization of edge flows or reliance on backward policies that may strug…

    Submitted 3 June, 2024; originally announced June 2024.

  11. arXiv:2406.01150  [pdf, other]

    cs.LG

    Looking Backward: Retrospective Backward Synthesis for Goal-Conditioned GFlowNets

    Authors: Haoran He, Can Chang, Huazhe Xu, Ling Pan

    Abstract: Generative Flow Networks (GFlowNets) are amortized sampling methods for learning a stochastic policy to sequentially generate compositional objects with probabilities proportional to their rewards. GFlowNets exhibit a remarkable ability to generate diverse sets of high-reward objects, in contrast to standard return maximization reinforcement learning approaches, which often converge to a single op…

    Submitted 3 June, 2024; originally announced June 2024.

  12. arXiv:2405.18357  [pdf, other]

    cs.CL

    Faithful Logical Reasoning via Symbolic Chain-of-Thought

    Authors: Jundong Xu, Hao Fei, Liangming Pan, Qian Liu, Mong-Li Lee, Wynne Hsu

    Abstract: While the recent Chain-of-Thought (CoT) technique enhances the reasoning ability of large language models (LLMs) with the theory of mind, it might still struggle in handling logical reasoning that relies much on symbolic expressions and rigid deducing rules. To strengthen the logical reasoning capability of LLMs, we propose a novel Symbolic Chain-of-Thought, namely SymbCoT, a fully LLM-based frame…

    Submitted 11 June, 2024; v1 submitted 28 May, 2024; originally announced May 2024.

    Comments: Accepted by ACL 2024 (main proceeding)

  13. arXiv:2405.17957  [pdf, other]

    cs.CL cs.AI

    Modeling Dynamic Topics in Chain-Free Fashion by Evolution-Tracking Contrastive Learning and Unassociated Word Exclusion

    Authors: Xiaobao Wu, Xinshuai Dong, Liangming Pan, Thong Nguyen, Anh Tuan Luu

    Abstract: Dynamic topic models track the evolution of topics in sequential documents, which have derived various applications like trend analysis and opinion mining. However, existing models suffer from repetitive topic and unassociated topic issues, failing to reveal the evolution and hindering further applications. To address these issues, we break the tradition of simply chaining topics in existing work…

    Submitted 28 May, 2024; originally announced May 2024.

    Comments: Accepted to ACL 2024 Findings

  14. arXiv:2405.17478  [pdf, other]

    cs.LG stat.ML

    ROSE: Register Assisted General Time Series Forecasting with Decomposed Frequency Learning

    Authors: Yihang Wang, Yuying Qiu, Peng Chen, Kai Zhao, Yang Shu, Zhongwen Rao, Lujia Pan, Bin Yang, Chenjuan Guo

    Abstract: With the increasing collection of time series data from various domains, there arises a strong demand for general time series forecasting models pre-trained on a large number of time-series datasets to support a variety of downstream prediction tasks. Enabling general time series forecasting faces two challenges: how to obtain unified representations from multi-domain time series data, and how to…

    Submitted 24 May, 2024; originally announced May 2024.

  15. arXiv:2405.17426  [pdf, other]

    cs.CV cs.RO

    Benchmarking and Improving Bird's Eye View Perception Robustness in Autonomous Driving

    Authors: Shaoyuan Xie, Lingdong Kong, Wenwei Zhang, Jiawei Ren, Liang Pan, Kai Chen, Ziwei Liu

    Abstract: Recent advancements in bird's eye view (BEV) representations have shown remarkable promise for in-vehicle 3D perception. However, while these methods have achieved impressive results on standard benchmarks, their robustness in varied conditions remains insufficiently assessed. In this study, we present RoboBEV, an extensive benchmark suite designed to evaluate the resilience of BEV algorithms. Thi…

    Submitted 27 May, 2024; originally announced May 2024.

    Comments: Preprint; 17 pages, 13 figures, 11 tables; Code at https://github.com/Daniel-xsy/RoboBEV

  16. arXiv:2405.15273  [pdf, other]

    cs.LG

    Towards a General Time Series Anomaly Detector with Adaptive Bottlenecks and Dual Adversarial Decoders

    Authors: Qichao Shentu, Beibu Li, Kai Zhao, Yang Shu, Zhongwen Rao, Lujia Pan, Bin Yang, Chenjuan Guo

    Abstract: Time series anomaly detection plays a vital role in a wide range of applications. Existing methods require training one specific model for each dataset, which exhibits limited generalization capability across different target datasets, hindering anomaly detection performance in various scenarios with scarce training data. Aiming at this problem, we propose constructing a general time series anomal…

    Submitted 2 June, 2024; v1 submitted 24 May, 2024; originally announced May 2024.

  17. arXiv:2405.10051  [pdf, other]

    cs.CR cs.CL

    MarkLLM: An Open-Source Toolkit for LLM Watermarking

    Authors: Leyi Pan, Aiwei Liu, Zhiwei He, Zitian Gao, Xuandong Zhao, Yijian Lu, Binglin Zhou, Shuliang Liu, Xuming Hu, Lijie Wen, Irwin King

    Abstract: LLM watermarking, which embeds imperceptible yet algorithmically detectable signals in model outputs to identify LLM-generated text, has become crucial in mitigating the potential misuse of large language models. However, the abundance of LLM watermarking algorithms, their intricate mechanisms, and the complex evaluation procedures and perspectives pose challenges for researchers and the community…

    Submitted 24 May, 2024; v1 submitted 16 May, 2024; originally announced May 2024.

    Comments: 16 pages, 5 figures, 6 tables

    MSC Class: 68T50; ACM Class: I.2.7

  18. arXiv:2405.08816  [pdf, other]

    cs.CV cs.RO

    The RoboDrive Challenge: Drive Anytime Anywhere in Any Condition

    Authors: Lingdong Kong, Shaoyuan Xie, Hanjiang Hu, Yaru Niu, Wei Tsang Ooi, Benoit R. Cottereau, Lai Xing Ng, Yuexin Ma, Wenwei Zhang, Liang Pan, Kai Chen, Ziwei Liu, Weichao Qiu, Wei Zhang, Xu Cao, Hao Lu, Ying-Cong Chen, Caixin Kang, Xinning Zhou, Chengyang Ying, Wentao Shang, Xingxing Wei, Yinpeng Dong, Bo Yang, Shengyin Jiang, et al. (66 additional authors not shown)

    Abstract: In the realm of autonomous driving, robust perception under out-of-distribution conditions is paramount for the safe deployment of vehicles. Challenges such as adverse weather, sensor malfunctions, and environmental unpredictability can severely impact the performance of autonomous systems. The 2024 RoboDrive Challenge was crafted to propel the development of driving perception technologies that c…

    Submitted 29 May, 2024; v1 submitted 14 May, 2024; originally announced May 2024.

    Comments: ICRA 2024; 32 pages, 24 figures, 5 tables; Code at https://robodrive-24.github.io/

  19. arXiv:2405.08055  [pdf, other]

    cs.CV

    DiffTF++: 3D-aware Diffusion Transformer for Large-Vocabulary 3D Generation

    Authors: Ziang Cao, Fangzhou Hong, Tong Wu, Liang Pan, Ziwei Liu

    Abstract: Generating diverse and high-quality 3D assets automatically poses a fundamental yet challenging task in 3D computer vision. Despite extensive efforts in 3D generation, existing optimization-based approaches struggle to produce large-scale 3D assets efficiently. Meanwhile, feed-forward methods often focus on generating only a single category or a few categories, limiting their generalizability. The…

    Submitted 13 May, 2024; originally announced May 2024.

    Comments: arXiv admin note: substantial text overlap with arXiv:2309.07920

  20. arXiv:2405.07702  [pdf, other]

    cs.CV cs.LG

    FORESEE: Multimodal and Multi-view Representation Learning for Robust Prediction of Cancer Survival

    Authors: Liangrui Pan, Yijun Peng, Yan Li, Yiyi Liang, Liwen Xu, Qingchun Liang, Shaoliang Peng

    Abstract: Integrating the different data modalities of cancer patients can significantly improve the predictive performance of patient survival. However, most existing methods ignore the simultaneous utilization of rich semantic features at different scales in pathology images. When collecting multimodal data and extracting features, there is a likelihood of encountering intra-modality missing data, introdu…

    Submitted 13 May, 2024; originally announced May 2024.

  21. arXiv:2405.07673  [pdf, other]

    cs.CL

    An Empirical Study on the Robustness of Massively Multilingual Neural Machine Translation

    Authors: Supryadi, Leiyu Pan, Deyi Xiong

    Abstract: Massively multilingual neural machine translation (MMNMT) has been proven to enhance the translation quality of low-resource languages. In this paper, we empirically investigate the translation robustness of Indonesian-Chinese translation in the face of various naturally occurring noise. To assess this, we create a robustness evaluation benchmark dataset for Indonesian-Chinese translation. This da…

    Submitted 13 May, 2024; originally announced May 2024.

    Comments: 12 pages, 6 figures

  22. arXiv:2405.05258  [pdf, other]

    cs.CV cs.LG cs.RO

    Multi-Modal Data-Efficient 3D Scene Understanding for Autonomous Driving

    Authors: Lingdong Kong, Xiang Xu, Jiawei Ren, Wenwei Zhang, Liang Pan, Kai Chen, Wei Tsang Ooi, Ziwei Liu

    Abstract: Efficient data utilization is crucial for advancing 3D scene understanding in autonomous driving, where reliance on heavily human-annotated LiDAR point clouds challenges fully supervised methods. Addressing this, our study extends into semi-supervised learning for LiDAR semantic segmentation, leveraging the intrinsic spatial priors of driving scenes and multi-sensor complements to augment the effi…

    Submitted 8 May, 2024; originally announced May 2024.

    Comments: Preprint; 17 pages, 6 figures, 8 tables; Code at https://github.com/ldkong1205/LaserMix

  23. arXiv:2405.01538  [pdf, other]

    cs.CV cs.LG cs.RO

    Multi-Space Alignments Towards Universal LiDAR Segmentation

    Authors: Youquan Liu, Lingdong Kong, Xiaoyang Wu, Runnan Chen, Xin Li, Liang Pan, Ziwei Liu, Yuexin Ma

    Abstract: A unified and versatile LiDAR segmentation model with strong robustness and generalizability is desirable for safe autonomous driving perception. This work presents M3Net, a one-of-a-kind framework for fulfilling multi-task, multi-dataset, multi-modality LiDAR segmentation in a universal manner using just a single set of parameters. To better exploit data volume and diversity, we first combine lar…

    Submitted 2 May, 2024; originally announced May 2024.

    Comments: CVPR 2024; 33 pages, 14 figures, 14 tables; Code at https://github.com/youquanl/M3Net

  24. arXiv:2405.01333  [pdf, other]

    cs.RO cs.CV

    NeRF in Robotics: A Survey

    Authors: Guangming Wang, Lei Pan, Songyou Peng, Shaohui Liu, Chenfeng Xu, Yanzi Miao, Wei Zhan, Masayoshi Tomizuka, Marc Pollefeys, Hesheng Wang

    Abstract: Meticulous 3D environment representations have been a longstanding goal in computer vision and robotics fields. The recent emergence of neural implicit representations has introduced radical innovation to this field as implicit representations enable numerous capabilities. Among these, the Neural Radiance Field (NeRF) has sparked a trend because of the huge representational advantages, such as sim…

    Submitted 2 May, 2024; originally announced May 2024.

    Comments: 21 pages, 19 figures

  25. arXiv:2405.00909  [pdf, other]

    cs.LG cs.ET quant-ph

    Quantum Federated Learning Experiments in the Cloud with Data Encoding

    Authors: Shiva Raj Pokhrel, Naman Yash, Jonathan Kua, Gang Li, Lei Pan

    Abstract: Quantum Federated Learning (QFL) is an emerging concept that aims to unfold federated learning (FL) over quantum networks, enabling collaborative quantum model training along with local data privacy. We explore the challenges of deploying QFL on cloud platforms, emphasizing quantum intricacies and platform limitations. The proposed data-encoding-driven QFL, with a proof of concept (GitHub Open Sou…

    Submitted 1 May, 2024; originally announced May 2024.

    Comments: SIGCOMM 2024, Quantum Computing, Federated Learning, Qiskit

  26. arXiv:2404.16164  [pdf, other]

    cs.CL cs.AI cs.LG

    Towards a Holistic Evaluation of LLMs on Factual Knowledge Recall

    Authors: Jiaqing Yuan, Lin Pan, Chung-Wei Hang, Jiang Guo, Jiarong Jiang, Bonan Min, Patrick Ng, Zhiguo Wang

    Abstract: Large language models (LLMs) have shown remarkable performance on a variety of NLP tasks, and are being rapidly adopted in a wide range of use cases. It is therefore of vital importance to holistically evaluate the factuality of their generated outputs, as hallucinations remain a challenging issue. In this work, we focus on assessing LLMs' ability to recall factual knowledge learned from pretrai…

    Submitted 24 April, 2024; originally announced April 2024.

  27. arXiv:2404.15974  [pdf, other]

    cs.HC

    A Human-Computer Collaborative Tool for Training a Single Large Language Model Agent into a Network through Few Examples

    Authors: Lihang Pan, Yuxuan Li, Chun Yu, Yuanchun Shi

    Abstract: The capabilities of a single large language model (LLM) agent for solving a complex task are limited. Connecting multiple LLM agents to a network can effectively improve overall performance. However, building an LLM agent network (LAN) requires a substantial amount of time and effort. In this paper, we introduce EasyLAN, a human-computer collaborative tool that helps developers construct LANs. Eas…

    Submitted 24 April, 2024; originally announced April 2024.

  28. arXiv:2404.11291  [pdf, other]

    cs.CV

    Closely Interactive Human Reconstruction with Proxemics and Physics-Guided Adaption

    Authors: Buzhen Huang, Chen Li, Chongyang Xu, Liang Pan, Yangang Wang, Gim Hee Lee

    Abstract: Existing multi-person human reconstruction approaches mainly focus on recovering accurate poses or avoiding penetration, but overlook the modeling of close interactions. In this work, we tackle the task of reconstructing closely interactive humans from a monocular video. The main challenge of this task comes from insufficient visual information caused by depth ambiguity and severe inter-person occ…

    Submitted 17 April, 2024; originally announced April 2024.

    Comments: CVPR2024

  29. Unified Multi-modal Diagnostic Framework with Reconstruction Pre-training and Heterogeneity-combat Tuning

    Authors: Yupei Zhang, Li Pan, Qiushi Yang, Tan Li, Zhen Chen

    Abstract: Medical multi-modal pre-training has revealed promise in computer-aided diagnosis by leveraging large-scale unlabeled datasets. However, existing methods based on masked autoencoders mainly rely on data-level reconstruction tasks, but lack high-level semantic information. Furthermore, two significant heterogeneity challenges hinder the transfer of pre-trained knowledge to downstream tasks, …

    Submitted 9 April, 2024; originally announced April 2024.

    Comments: to be published in IEEE JBHI; Code available at https://github.com/helenypzhang/UMD

  30. arXiv:2403.17010  [pdf, other]

    cs.CV cs.LG cs.RO

    Calib3D: Calibrating Model Preferences for Reliable 3D Scene Understanding

    Authors: Lingdong Kong, Xiang Xu, Jun Cen, Wenwei Zhang, Liang Pan, Kai Chen, Ziwei Liu

    Abstract: Safety-critical 3D scene understanding tasks necessitate not only accurate but also confident predictions from 3D perception models. This study introduces Calib3D, a pioneering effort to benchmark and scrutinize the reliability of 3D scene understanding models from an uncertainty estimation viewpoint. We comprehensively evaluate 28 state-of-the-art models across 10 diverse 3D datasets, uncovering…

    Submitted 25 March, 2024; originally announced March 2024.

    Comments: Preprint; 37 pages, 8 figures, 11 tables; Code at https://github.com/ldkong1205/Calib3D

  31. arXiv:2403.16112  [pdf, other]

    cs.CV cs.AI cs.LG

    Opportunities and challenges in the application of large artificial intelligence models in radiology

    Authors: Liangrui Pan, Zhenyu Zhao, Ying Lu, Kewei Tang, Liyong Fu, Qingchun Liang, Shaoliang Peng

    Abstract: Influenced by ChatGPT, artificial intelligence (AI) large models have witnessed a global upsurge in large model research and development. As people enjoy the convenience by this AI large model, more and more large models in subdivided fields are gradually being proposed, especially large models in radiology imaging field. This article first introduces the development history of large models, techn…

    Submitted 24 March, 2024; originally announced March 2024.

  32. arXiv:2403.14733  [pdf]

    cs.AI cs.CL cs.LG

    Open Knowledge Base Canonicalization with Multi-task Learning

    Authors: Bingchen Liu, Huang Peng, Weixin Zeng, Xiang Zhao, Shijun Liu, Li Pan

    Abstract: The construction of large open knowledge bases (OKBs) is integral to many knowledge-driven applications on the world wide web such as web search. However, noun phrases and relational phrases in OKBs often suffer from redundancy and ambiguity, which calls for the investigation on OKB canonicalization. Current solutions address OKB canonicalization by devising advanced clustering algorithms and usin…

    Submitted 21 March, 2024; originally announced March 2024.

    Comments: arXiv admin note: substantial text overlap with arXiv:2310.16419

  33. arXiv:2403.09290  [pdf, other]

    cs.CV cs.AI cs.LG

    SELECTOR: Heterogeneous graph network with convolutional masked autoencoder for multimodal robust prediction of cancer survival

    Authors: Liangrui Pan, Yijun Peng, Yan Li, Xiang Wang, Wenjuan Liu, Liwen Xu, Qingchun Liang, Shaoliang Peng

    Abstract: Accurately predicting the survival rate of cancer patients is crucial for aiding clinicians in planning appropriate treatment, reducing cancer-related medical expenses, and significantly enhancing patients' quality of life. Multimodal prediction of cancer patient survival offers a more comprehensive and precise approach. However, existing methods still grapple with challenges related to missing mu…

    Submitted 14 March, 2024; originally announced March 2024.

    Comments: Accepted on Computers in Biology and Medicine

  34. arXiv:2403.07923  [pdf]

    cs.NI cs.AI cs.LG eess.IV eess.SY

    The Fusion of Deep Reinforcement Learning and Edge Computing for Real-time Monitoring and Control Optimization in IoT Environments

    Authors: Jingyu Xu, Weixiang Wan, Linying Pan, Wenjian Sun, Yuxiang Liu

    Abstract: In response to the demand for real-time performance and control quality in industrial Internet of Things (IoT) environments, this paper proposes an optimization control system based on deep reinforcement learning and edge computing. The system leverages cloud-edge collaboration, deploys lightweight policy networks at the edge, predicts system states, and outputs controls at a high frequency, enabl…

    Submitted 28 February, 2024; originally announced March 2024.

  35. arXiv:2403.06993  [pdf]

    cs.RO cs.AI cs.LG eess.IV eess.SY

    Automatic driving lane change safety prediction model based on LSTM

    Authors: Wenjian Sun, Linying Pan, Jingyu Xu, Weixiang Wan, Yong Wang

    Abstract: Autonomous driving technology can improve traffic safety and reduce traffic accidents. In addition, it improves traffic flow, reduces congestion, saves energy and increases travel efficiency. In the relatively mature automatic driving technology, the automatic driving function is divided into several modules: perception, decision-making, planning and control, and a reasonable division of labor can…

    Submitted 28 February, 2024; originally announced March 2024.

  36. arXiv:2403.02234  [pdf, other]

    cs.CV

    3DTopia: Large Text-to-3D Generation Model with Hybrid Diffusion Priors

    Authors: Fangzhou Hong, Jiaxiang Tang, Ziang Cao, Min Shi, Tong Wu, Zhaoxi Chen, Shuai Yang, Tengfei Wang, Liang Pan, Dahua Lin, Ziwei Liu

    Abstract: We present a two-stage text-to-3D generation system, namely 3DTopia, which generates high-quality general 3D assets within 5 minutes using hybrid diffusion priors. The first stage samples from a 3D diffusion prior directly learned from 3D data. Specifically, it is powered by a text-conditioned tri-plane latent diffusion model, which quickly generates coarse 3D samples for fast prototyping. The sec…

    Submitted 6 May, 2024; v1 submitted 4 March, 2024; originally announced March 2024.

    Comments: Code available at https://github.com/3DTopia/3DTopia

  37. arXiv:2403.00869  [pdf, other]

    cs.LG stat.ML

    Enhancing Multivariate Time Series Forecasting with Mutual Information-driven Cross-Variable and Temporal Modeling

    Authors: Shiyi Qi, Liangjian Wen, Yiduo Li, Yuanhang Yang, Zhe Li, Zhongwen Rao, Lujia Pan, Zenglin Xu

    Abstract: Recent advancements have underscored the impact of deep learning techniques on multivariate time series forecasting (MTSF). Generally, these techniques are bifurcated into two categories: Channel-independence and Channel-mixing approaches. Although Channel-independence methods typically yield better results, Channel-mixing could theoretically offer improvements by leveraging inter-variable correla…

    Submitted 29 February, 2024; originally announced March 2024.

  38. arXiv:2402.18909  [pdf, other]

    cs.CL cs.AI

    Updating Language Models with Unstructured Facts: Towards Practical Knowledge Editing

    Authors: Xiaobao Wu, Liangming Pan, William Yang Wang, Anh Tuan Luu

    Abstract: Knowledge editing aims to inject knowledge updates into language models to keep them correct and up-to-date. However, its current evaluation strategies are notably impractical: they solely update with well-curated structured facts (triplets with subjects, relations, and objects), whereas real-world knowledge updates commonly emerge in unstructured texts like news articles. In this paper, we propos…

    Submitted 29 February, 2024; originally announced February 2024.

  39. arXiv:2402.16827  [pdf, other]

    cs.CL cs.LG

    A Survey on Data Selection for Language Models

    Authors: Alon Albalak, Yanai Elazar, Sang Michael Xie, Shayne Longpre, Nathan Lambert, Xinyi Wang, Niklas Muennighoff, Bairu Hou, Liangming Pan, Haewon Jeong, Colin Raffel, Shiyu Chang, Tatsunori Hashimoto, William Yang Wang

    Abstract: A major factor in the recent success of large language models is the use of enormous and ever-growing text datasets for unsupervised pre-training. However, naively training a model on all available data may not be optimal (or feasible), as the quality of available text data can vary. Filtering out data can also decrease the carbon footprint and financial costs of training models by reducing the am…

    Submitted 8 March, 2024; v1 submitted 26 February, 2024; originally announced February 2024.

    Comments: Paper list available at https://github.com/alon-albalak/data-selection-survey

  40. arXiv:2402.14407  [pdf, other]

    cs.LG cs.CV cs.RO

    Large-Scale Actionless Video Pre-Training via Discrete Diffusion for Efficient Policy Learning

    Authors: Haoran He, Chenjia Bai, Ling Pan, Weinan Zhang, Bin Zhao, Xuelong Li

    Abstract: Learning a generalist embodied agent capable of completing multiple tasks poses challenges, primarily stemming from the scarcity of action-labeled robotic datasets. In contrast, a vast amount of human videos exist, capturing intricate tasks and interactions with the physical world. Promising prospects arise for utilizing actionless human videos for pre-training and transferring the knowledge to fa…

    Submitted 22 February, 2024; originally announced February 2024.

    Comments: 21 pages

  41. arXiv:2402.13647  [pdf, other]

    cs.CL cs.AI

    Unsupervised Text Style Transfer via LLMs and Attention Masking with Multi-way Interactions

    Authors: Lei Pan, Yunshi Lan, Yang Li, Weining Qian

    Abstract: Unsupervised Text Style Transfer (UTST) has emerged as a critical task within the domain of Natural Language Processing (NLP), aiming to transfer one stylistic aspect of a sentence into another style without changing its semantics, syntax, or other attributes. This task is especially challenging given the intrinsic lack of parallel text pairings. Among existing methods for UTST tasks, attention ma…

    Submitted 21 February, 2024; originally announced February 2024.

  42. arXiv:2402.11451  [pdf, other]

    cs.CL cs.AI

    SciAgent: Tool-augmented Language Models for Scientific Reasoning

    Authors: Yubo Ma, Zhibin Gou, Junheng Hao, Ruochen Xu, Shuohang Wang, Liangming Pan, Yujiu Yang, Yixin Cao, Aixin Sun, Hany Awadalla, Weizhu Chen

    Abstract: Scientific reasoning poses an excessive challenge for even the most advanced Large Language Models (LLMs). To make this task more practical and solvable for LLMs, we introduce a new task setting named tool-augmented scientific reasoning. This setting supplements LLMs with scalable toolsets, and shifts the focus from pursuing an omniscient problem solver to a proficient tool-user. To facilitate the…

    Submitted 20 February, 2024; v1 submitted 17 February, 2024; originally announced February 2024.

  43. arXiv:2402.11436  [pdf, other]

    cs.CL cs.AI

    Pride and Prejudice: LLM Amplifies Self-Bias in Self-Refinement

    Authors: Wenda Xu, Guanglei Zhu, Xuandong Zhao, Liangming Pan, Lei Li, William Yang Wang

    Abstract: Recent studies show that large language models (LLMs) improve their performance through self-feedback on certain tasks while degrade on others. We discovered that such a contrary is due to LLM's bias in evaluating their own output. In this paper, we formally define LLM's self-bias - the tendency to favor its own generation - using two statistics. We analyze six LLMs (GPT-4, GPT-3.5, Gemini, LLaMA2…

    Submitted 18 June, 2024; v1 submitted 17 February, 2024; originally announced February 2024.

  44. arXiv:2402.09442  [pdf]

    eess.SP cs.AI

    Progress in artificial intelligence applications based on the combination of self-driven sensors and deep learning

    Authors: Weixiang Wan, Wenjian Sun, Qiang Zeng, Linying Pan, Jingyu Xu, Bo Liu

    Abstract: In the era of Internet of Things, how to develop a smart sensor system with sustainable power supply, easy deployment and flexible use has become a difficult problem to be solved. The traditional power supply has problems such as frequent replacement or charging when in use, which limits the development of wearable devices. The contact-to-separate friction nanogenerator (TENG) was prepared by usin…

    Submitted 12 March, 2024; v1 submitted 30 January, 2024; originally announced February 2024.

    Comments: This article was accepted by an IEEE conference

  45. arXiv:2402.05234  [pdf, other]

    cs.LG

    QGFN: Controllable Greediness with Action Values

    Authors: Elaine Lau, Stephen Zhewen Lu, Ling Pan, Doina Precup, Emmanuel Bengio

    Abstract: Generative Flow Networks (GFlowNets; GFNs) are a family of reward/energy-based generative methods for combinatorial objects, capable of generating diverse and high-utility samples. However, biasing GFNs towards producing high-utility samples is non-trivial. In this work, we leverage connections between GFNs and reinforcement learning (RL) and propose to combine the GFN policy with an action-value…

    Submitted 23 May, 2024; v1 submitted 7 February, 2024; originally announced February 2024.

    Comments: Under review

  46. arXiv:2402.03268  [pdf, other]

    cs.LG cs.AI cs.CL

    Understanding Reasoning Ability of Language Models From the Perspective of Reasoning Paths Aggregation

    Authors: Xinyi Wang, Alfonso Amayuelas, Kexun Zhang, Liangming Pan, Wenhu Chen, William Yang Wang

    Abstract: Pre-trained language models (LMs) are able to perform complex reasoning without explicit fine-tuning. To understand how pre-training with a next-token prediction objective contributes to the emergence of such reasoning capability, we propose that we can view an LM as deriving new conclusions by aggregating indirect reasoning paths seen at pre-training time. We found this perspective effective in t…

    Submitted 20 June, 2024; v1 submitted 5 February, 2024; originally announced February 2024.

    Comments: Accepted to ICML 2024

  47. arXiv:2402.02399  [pdf, other]

    cs.LG cs.AI stat.AP stat.ML

    FreDF: Learning to Forecast in Frequency Domain

    Authors: Hao Wang, Licheng Pan, Zhichao Chen, Degui Yang, Sen Zhang, Yifei Yang, Xinggao Liu, Haoxuan Li, Dacheng Tao

    Abstract: Time series modeling is uniquely challenged by the presence of autocorrelation in both historical and label sequences. Current research predominantly focuses on handling autocorrelation within the historical sequence but often neglects its presence in the label sequence. Specifically, emerging forecast models mainly conform to the direct forecast (DF) paradigm, generating multi-step forecasts unde…

    Submitted 4 February, 2024; originally announced February 2024.

  48. arXiv:2402.02186  [pdf, other]

    cs.LG cs.AI

    Evolution Guided Generative Flow Networks

    Authors: Zarif Ikram, Ling Pan, Dianbo Liu

    Abstract: Generative Flow Networks (GFlowNets) are a family of probabilistic generative models that learn to sample compositional objects proportional to their rewards. One big challenge of GFlowNets is training them effectively when dealing with long time horizons and sparse rewards. To address this, we propose Evolution guided generative flow networks (EGFN), a simple but powerful augmentation to the GFlo…

    Submitted 3 February, 2024; originally announced February 2024.

    Comments: 16 pages, 16 figures

  49. arXiv:2401.13782  [pdf, other]

    cs.DL cs.AI cs.CL cs.CV cs.LG cs.SI

    Tweets to Citations: Unveiling the Impact of Social Media Influencers on AI Research Visibility

    Authors: Iain Xie Weissburg, Mehir Arora, Xinyi Wang, Liangming Pan, William Yang Wang

    Abstract: As the number of accepted papers at AI and ML conferences reaches into the thousands, it has become unclear how researchers access and read research publications. In this paper, we investigate the role of social media influencers in enhancing the visibility of machine learning research, particularly the citation counts of papers they share. We have compiled a comprehensive dataset of over 8,000 pa…

    Submitted 3 March, 2024; v1 submitted 24 January, 2024; originally announced January 2024.

    Comments: 10 Pages, 14 Figures

  50. arXiv:2401.10144  [pdf, other]

    q-bio.BM cs.LG

    Exploiting Hierarchical Interactions for Protein Surface Learning

    Authors: Yiqun Lin, Liang Pan, Yi Li, Ziwei Liu, Xiaomeng Li

    Abstract: Predicting interactions between proteins is one of the most important yet challenging problems in structural bioinformatics. Intrinsically, potential function sites in protein surfaces are determined by both geometric and chemical features. However, existing works only consider handcrafted or individually learned chemical features from the atom type and extract geometric features independently. He…

    Submitted 17 January, 2024; originally announced January 2024.

    Comments: Accepted to J-BHI