Showing 1–50 of 160 results for author: Cheng, R

  1. arXiv:2406.17245  [pdf, other]

    cs.LG cs.AI cs.CL

    Unlocking Continual Learning Abilities in Language Models

    Authors: Wenyu Du, Shuang Cheng, Tongxu Luo, Zihan Qiu, Zeyu Huang, Ka Chun Cheung, Reynold Cheng, Jie Fu

    Abstract: Language models (LMs) exhibit impressive performance and generalization capabilities. However, LMs struggle with the persistent challenge of catastrophic forgetting, which undermines their long-term sustainability in continual learning (CL). Existing approaches usually address the issue by incorporating old task data or task-wise inductive bias into LMs. However, old data and accurate task informa…

    Submitted 24 June, 2024; originally announced June 2024.

    Comments: preprint, 19 pages

  2. arXiv:2406.10802  [pdf, other]

    cs.CL cs.AI

    KGPA: Robustness Evaluation for Large Language Models via Cross-Domain Knowledge Graphs

    Authors: Aihua Pei, Zehua Yang, Shunan Zhu, Ruoxi Cheng, Ju Jia, Lina Wang

    Abstract: Existing frameworks for assessing robustness of large language models (LLMs) overly depend on specific benchmarks, increasing costs and failing to evaluate performance of LLMs in professional domains due to dataset limitations. This paper proposes a framework that systematically evaluates the robustness of LLMs under adversarial attack scenarios by leveraging knowledge graphs (KGs). Our framework…

    Submitted 16 June, 2024; originally announced June 2024.

  3. arXiv:2406.07365  [pdf, other]

    cs.CL cs.AI

    BvSP: Broad-view Soft Prompting for Few-Shot Aspect Sentiment Quad Prediction

    Authors: Yinhao Bai, Yalan Xie, Xiaoyi Liu, Yuhua Zhao, Zhixin Han, Mengting Hu, Hang Gao, Renhong Cheng

    Abstract: Aspect sentiment quad prediction (ASQP) aims to predict four aspect-based elements, including aspect term, opinion term, aspect category, and sentiment polarity. In practice, unseen aspects, due to distinct data distribution, impose many challenges for a trained neural model. Motivated by this, this work formulates ASQP into the few-shot scenario, which aims for fast adaptation in real application…

    Submitted 11 June, 2024; originally announced June 2024.

    Comments: Accepted to ACL 2024 Main Conference

  4. arXiv:2406.06626  [pdf, other]

    cs.LG cs.AI cs.HC eess.SP

    Benchmarking Neural Decoding Backbones towards Enhanced On-edge iBCI Applications

    Authors: Zhou Zhou, Guohang He, Zheng Zhang, Luziwei Leng, Qinghai Guo, Jianxing Liao, Xuan Song, Ran Cheng

    Abstract: Traditional invasive Brain-Computer Interfaces (iBCIs) typically depend on neural decoding processes conducted on workstations within laboratory settings, which prevents their everyday usage. Implementing these decoding processes on edge devices, such as the wearables, introduces considerable challenges related to computational demands, processing speed, and maintaining accuracy. This study seeks…

    Submitted 7 June, 2024; originally announced June 2024.

  5. arXiv:2405.15319  [pdf, other]

    cs.CL cs.AI

    Stacking Your Transformers: A Closer Look at Model Growth for Efficient LLM Pre-Training

    Authors: Wenyu Du, Tongxu Luo, Zihan Qiu, Zeyu Huang, Yikang Shen, Reynold Cheng, Yike Guo, Jie Fu

    Abstract: LLMs are computationally expensive to pre-train due to their large scale. Model growth emerges as a promising approach by leveraging smaller models to accelerate the training of larger ones. However, the viability of these model growth methods in efficient LLM pre-training remains underexplored. This work identifies three critical Obstacles: (O1) lack of comprehen…

    Submitted 24 May, 2024; originally announced May 2024.

    Comments: Preprint; The project link: https://llm-stacking.github.io/

  6. arXiv:2405.15307  [pdf, other]

    cs.CL

    Before Generation, Align it! A Novel and Effective Strategy for Mitigating Hallucinations in Text-to-SQL Generation

    Authors: Ge Qu, Jinyang Li, Bowen Li, Bowen Qin, Nan Huo, Chenhao Ma, Reynold Cheng

    Abstract: Large Language Models (LLMs) driven by In-Context Learning (ICL) have significantly improved the performance of text-to-SQL. Previous methods generally employ a two-stage reasoning framework, namely 1) schema linking and 2) logical synthesis, making the framework not only effective but also interpretable. Despite these advancements, the inherent bad nature of the generalization of LLMs often resul…

    Submitted 24 May, 2024; originally announced May 2024.

    Comments: Accepted to ACL Findings 2024

  7. arXiv:2405.14517  [pdf, other]

    cs.LG cs.CR

    Identity Inference from CLIP Models using Only Textual Data

    Authors: Songze Li, Ruoxi Cheng, Xiaojun Jia

    Abstract: The widespread usage of large-scale multimodal models like CLIP has heightened concerns about the leakage of personally identifiable information (PII). Existing methods for identity inference in CLIP models, i.e., to detect the presence of a person's PII used for training a CLIP model, require querying the model with full PII, including textual descriptions of the person and corresponding images (…

    Submitted 23 May, 2024; originally announced May 2024.

  8. arXiv:2405.12183  [pdf, other]

    cs.LG cs.AI

    Multi-order Graph Clustering with Adaptive Node-level Weight Learning

    Authors: Ye Liu, Xuelei Lin, Yejia Chen, Reynold Cheng

    Abstract: Current graph clustering methods emphasize individual node and edge connections, while ignoring higher-order organization at the level of motif. Recently, higher-order graph clustering approaches have been designed by motif-based hypergraphs. However, these approaches often suffer from hypergraph fragmentation issue seriously, which degrades the clustering performance greatly. Moreover, real-wor…

    Submitted 20 May, 2024; originally announced May 2024.

  9. arXiv:2405.10422  [pdf, other]

    cs.NI

    A First Look at Immersive Telepresence on Apple Vision Pro

    Authors: Ruizhi Cheng, Nan Wu, Matteo Varvello, Eugene Chai, Songqing Chen, Bo Han

    Abstract: Due to the widespread adoption of "work-from-home" policies, videoconferencing applications (e.g., Zoom) have become indispensable for remote communication. However, these systems lack immersiveness, leading to the so-called "Zoom fatigue" and degrading communication efficiency. The recent debut of Apple Vision Pro, a mixed reality headset that supports "spatial persona", aims to offer an immersiv…

    Submitted 16 May, 2024; originally announced May 2024.

  10. arXiv:2405.03267  [pdf, other]

    cs.DC cs.DB cs.IR

    Characterizing the Dilemma of Performance and Index Size in Billion-Scale Vector Search and Breaking It with Second-Tier Memory

    Authors: Rongxin Cheng, Yifan Peng, Xingda Wei, Hongrui Xie, Rong Chen, Sijie Shen, Haibo Chen

    Abstract: Vector searches on large-scale datasets are critical to modern online services like web search and RAG, which necessitate storing the datasets and their index on the secondary storage like SSD. In this paper, we are the first to characterize the trade-off of performance and index size in existing SSD-based graph and cluster indexes: to improve throughput by 5.7× and 1.7×, these indexes…

    Submitted 7 May, 2024; v1 submitted 6 May, 2024; originally announced May 2024.

  11. A Multi-objective Optimization Benchmark Test Suite for Real-time Semantic Segmentation

    Authors: Yifan Zhao, Zhenyu Liang, Zhichao Lu, Ran Cheng

    Abstract: As one of the emerging challenges in Automated Machine Learning, the Hardware-aware Neural Architecture Search (HW-NAS) tasks can be treated as black-box multi-objective optimization problems (MOPs). An important application of HW-NAS is real-time semantic segmentation, which plays a pivotal role in autonomous driving scenarios. The HW-NAS for real-time semantic segmentation inherently needs to ba…

    Submitted 28 April, 2024; v1 submitted 24 April, 2024; originally announced April 2024.

    Comments: GECCO 2024

  12. arXiv:2404.15622  [pdf, other]

    cs.LG

    FR-NAS: Forward-and-Reverse Graph Predictor for Efficient Neural Architecture Search

    Authors: Haoming Zhang, Ran Cheng

    Abstract: Neural Architecture Search (NAS) has emerged as a key tool in identifying optimal configurations of deep neural networks tailored to specific tasks. However, training and assessing numerous architectures introduces considerable computational overhead. One method of mitigating this is through performance predictors, which offer a means to estimate the potential of an architecture without exhaustive…

    Submitted 23 April, 2024; originally announced April 2024.

    Comments: IJCNN'24

  13. arXiv:2404.10160  [pdf, other]

    cs.AI

    Reinforcement Learning from Multi-role Debates as Feedback for Bias Mitigation in LLMs

    Authors: Ruoxi Cheng, Haoxuan Ma, Shuirong Cao, Jiaqi Li, Aihua Pei, Zhiqiang Wang, Pengliang Ji, Haoyu Wang, Jiaqi Huo

    Abstract: Bias in LLMs can harm user experience and societal outcomes. However, current bias mitigation methods often require intensive human feedback, lack transferability to other topics or yield overconfident and random outputs. We find that involving LLMs in role-playing scenario boosts their ability to recognize and mitigate biases. Based on this, we propose Reinforcement Learning from Multi-role Debat…

    Submitted 18 June, 2024; v1 submitted 15 April, 2024; originally announced April 2024.

    Comments: The first three authors contributed equally to this work

  14. arXiv:2404.08233  [pdf, other]

    cs.LG cs.AI cs.NE

    Generalized Population-Based Training for Hyperparameter Optimization in Reinforcement Learning

    Authors: Hui Bai, Ran Cheng

    Abstract: Hyperparameter optimization plays a key role in the machine learning domain. Its significance is especially pronounced in reinforcement learning (RL), where agents continuously interact with and adapt to their environments, requiring dynamic adjustments in their learning trajectories. To cater to this dynamicity, the Population-Based Training (PBT) was introduced, leveraging the collective intelli…

    Submitted 22 April, 2024; v1 submitted 12 April, 2024; originally announced April 2024.

    Comments: IEEE Transactions on Emerging Topics in Computational Intelligence

  15. arXiv:2404.07387  [pdf, other]

    cs.HC cs.AI

    BISCUIT: Scaffolding LLM-Generated Code with Ephemeral UIs in Computational Notebooks

    Authors: Ruijia Cheng, Titus Barik, Alan Leung, Fred Hohman, Jeffrey Nichols

    Abstract: Novices frequently engage with machine learning tutorials in computational notebooks and have been adopting code generation technologies based on large language models (LLMs). However, they encounter difficulties in understanding and working with code produced by LLMs. To mitigate these challenges, we introduce a novel workflow into computational notebooks that augments LLM-based code generation w…

    Submitted 12 April, 2024; v1 submitted 10 April, 2024; originally announced April 2024.

  16. arXiv:2404.06290  [pdf, other]

    cs.NE

    Exploring the True Potential: Evaluating the Black-box Optimization Capability of Large Language Models

    Authors: Beichen Huang, Xingyu Wu, Yu Zhou, Jibin Wu, Liang Feng, Ran Cheng, Kay Chen Tan

    Abstract: Large language models (LLMs) have gained widespread popularity and demonstrated exceptional performance not only in natural language processing (NLP) tasks but also in non-linguistic domains. Their potential as artificial general intelligence extends beyond NLP, showcasing promising capabilities in diverse optimization scenarios. Despite this rising trend, whether the integration of LLMs into thes…

    Submitted 9 April, 2024; originally announced April 2024.

  17. arXiv:2404.04895  [pdf, other]

    cs.NE

    Tensorized Ant Colony Optimization for GPU Acceleration

    Authors: Luming Yang, Tao Jiang, Ran Cheng

    Abstract: Ant Colony Optimization (ACO) is renowned for its effectiveness in solving Traveling Salesman Problems, yet it faces computational challenges in CPU-based environments, particularly with large-scale instances. In response, we introduce a Tensorized Ant Colony Optimization (TensorACO) to utilize the advancements of GPU acceleration. As the core, TensorACO fully transforms ant system and ant path in…

    Submitted 12 April, 2024; v1 submitted 7 April, 2024; originally announced April 2024.

    Comments: Genetic and Evolutionary Computation Conference (GECCO '24)

  18. arXiv:2404.01817  [pdf, other]

    cs.NE

    Tensorized NeuroEvolution of Augmenting Topologies for GPU Acceleration

    Authors: Lishuang Wang, Mengfei Zhao, Enyu Liu, Kebin Sun, Ran Cheng

    Abstract: The NeuroEvolution of Augmenting Topologies (NEAT) algorithm has received considerable recognition in the field of neuroevolution. Its effectiveness is derived from initiating with simple networks and incrementally evolving both their topologies and weights. Although its capability across various challenges is evident, the algorithm's computational efficiency remains an impediment, limiting its sc…

    Submitted 11 April, 2024; v1 submitted 2 April, 2024; originally announced April 2024.

    Comments: Genetic and Evolutionary Computation Conference (GECCO '24)

  19. GPU-accelerated Evolutionary Multiobjective Optimization Using Tensorized RVEA

    Authors: Zhenyu Liang, Tao Jiang, Kebin Sun, Ran Cheng

    Abstract: Evolutionary multiobjective optimization has witnessed remarkable progress during the past decades. However, existing algorithms often encounter computational challenges in large-scale scenarios, primarily attributed to the absence of hardware acceleration. In response, we introduce a Tensorized Reference Vector Guided Evolutionary Algorithm (TensorRVEA) for harnessing the advancements of GPU acce…

    Submitted 11 April, 2024; v1 submitted 1 April, 2024; originally announced April 2024.

    Comments: Genetic and Evolutionary Computation Conference (GECCO '24)

  20. arXiv:2403.13286  [pdf, other]

    stat.ML cs.DB cs.LG

    A Sampling-based Framework for Hypothesis Testing on Large Attributed Graphs

    Authors: Yun Wang, Chrysanthi Kosyfaki, Sihem Amer-Yahia, Reynold Cheng

    Abstract: Hypothesis testing is a statistical method used to draw conclusions about populations from sample data, typically represented in tables. With the prevalence of graph representations in real-life applications, hypothesis testing in graphs is gaining importance. In this work, we formalize node, edge, and path hypotheses in attributed graphs. We develop a sampling-based hypothesis testing framework,…

    Submitted 19 March, 2024; originally announced March 2024.

  21. arXiv:2403.11073  [pdf]

    cs.CV cs.AI

    Tokensome: Towards a Genetic Vision-Language GPT for Explainable and Cognitive Karyotyping

    Authors: Haoxi Zhang, Xinxu Zhang, Yuanxin Lin, Maiqi Wang, Yi Lai, Yu Wang, Linfeng Yu, Yufeng Xu, Ran Cheng, Edward Szczerbicki

    Abstract: Automatic karyotype analysis is often defined as a visual perception task focused solely on chromosomal object-level modeling. This definition has led most existing methods to overlook componential and holistic information, significantly constraining model performance. Moreover, the lack of interpretability in current technologies hinders clinical adoption. In this paper, we introduce Tokensome, a…

    Submitted 16 March, 2024; originally announced March 2024.

    Comments: Preprint. Work in progress

  22. arXiv:2403.08600  [pdf]

    cs.NI

    Evaluation of Control/User-Plane Denial-of-Service (DoS) Attack on O-RAN Fronthaul Interface

    Authors: Ferlinda Feliana, Ting-Wei Hung, Binbin Chen, Ray-Guang Cheng

    Abstract: The open fronthaul interface defined by O-RAN ALLIANCE aims to support the interoperability between multi-vendor open radio access network (O-RAN) radio units (O-RU) and O-RAN distributed units (O-DU). This paper introduces a new tool that could be used to evaluate Denial-of-Service (DoS) attacks against the open fronthaul interface. We launched an array of control/user planes (C/U-Planes) attacks…

    Submitted 13 March, 2024; originally announced March 2024.

    Comments: Accepted by IEEE INFOCOM Workshop: Next-generation Open and Programmable Radio Access Networks (NG-OPERA)

  23. arXiv:2403.05680  [pdf, other]

    cs.AI cs.CL cs.CV

    How Well Do Multi-modal LLMs Interpret CT Scans? An Auto-Evaluation Framework for Analyses

    Authors: Qingqing Zhu, Benjamin Hou, Tejas S. Mathai, Pritam Mukherjee, Qiao Jin, Xiuying Chen, Zhizheng Wang, Ruida Cheng, Ronald M. Summers, Zhiyong Lu

    Abstract: Automatically interpreting CT scans can ease the workload of radiologists. However, this is challenging mainly due to the scarcity of adequate datasets and reference standards for evaluation. This study aims to bridge this gap by introducing a novel evaluation framework, named "GPTRadScore". This framework assesses the capabilities of multi-modal LLMs, such as GPT-4 with Vision (GPT-4V), Gemini…

    Submitted 18 June, 2024; v1 submitted 8 March, 2024; originally announced March 2024.

  24. arXiv:2403.05307  [pdf, other]

    cs.AI

    Tapilot-Crossing: Benchmarking and Evolving LLMs Towards Interactive Data Analysis Agents

    Authors: Jinyang Li, Nan Huo, Yan Gao, Jiayi Shi, Yingxiu Zhao, Ge Qu, Yurong Wu, Chenhao Ma, Jian-Guang Lou, Reynold Cheng

    Abstract: Interactive Data Analysis, the collaboration between humans and LLM agents, enables real-time data exploration for informed decision-making. The challenges and costs of collecting realistic interactive logs for data analysis hinder the quantitative evaluation of Large Language Model (LLM) agents in this task. To mitigate this issue, we introduce Tapilot-Crossing, a new benchmark to evaluate LLM ag…

    Submitted 8 March, 2024; originally announced March 2024.

    Comments: 30 pages, 7 figures

  25. arXiv:2403.04796  [pdf, other]

    cs.CR eess.SY

    Blockchain-Enhanced UAV Networks for Post-Disaster Communication: A Decentralized Flocking Approach

    Authors: Sana Hafeez, Runze Cheng, Lina Mohjazi, Yao Sun, Muhammad Ali Imran

    Abstract: Unmanned Aerial Vehicles (UAVs) have significant potential for agile communication and relief coordination in post-disaster scenarios, particularly when ground infrastructure is compromised. However, efficiently coordinating and securing flocks of heterogeneous UAVs from different service providers poses significant challenges related to privacy, scalability, lightweight consensus protocols, and c…

    Submitted 4 March, 2024; originally announced March 2024.

    Comments: 11 pages, 9 figures, Digital Communications and Networks Open access

  26. arXiv:2402.17237  [pdf, other]

    cs.CV cs.CL

    Image-Text Matching with Multi-View Attention

    Authors: Rui Cheng, Wanqing Cui

    Abstract: Existing two-stream models for image-text matching show good performance while ensuring retrieval speed and have received extensive attention from industry and academia. These methods use a single representation to encode image and text separately and get a matching score with cosine similarity or the inner product of vectors. However, the performance of the two-stream model is often sub-optimal.…

    Submitted 27 February, 2024; originally announced February 2024.

  27. arXiv:2402.15331  [pdf, other]

    cs.CR eess.SY

    A Blockchain-Enabled Framework of UAV Coordination for Post-Disaster Networks

    Authors: Sana Hafeez, Runze Cheng, Lina Mohjazi, Muhammad Ali Imran, Yao Sun

    Abstract: Emergency communication is critical but challenging after natural disasters when ground infrastructure is devastated. Unmanned aerial vehicles (UAVs) offer enormous potential for agile relief coordination in these scenarios. However, effectively leveraging UAV fleets poses additional challenges around security, privacy, and efficient collaboration across response agencies. This paper presents a ro…

    Submitted 23 February, 2024; originally announced February 2024.

    Comments: 6 pages, 4 figures, IEEE 99th Vehicular Technology Conference: VTC2024-Spring, Singapore

  28. arXiv:2402.13116  [pdf, other]

    cs.CL

    A Survey on Knowledge Distillation of Large Language Models

    Authors: Xiaohan Xu, Ming Li, Chongyang Tao, Tao Shen, Reynold Cheng, Jinyang Li, Can Xu, Dacheng Tao, Tianyi Zhou

    Abstract: In the era of Large Language Models (LLMs), Knowledge Distillation (KD) emerges as a pivotal methodology for transferring advanced capabilities from leading proprietary LLMs, such as GPT-4, to their open-source counterparts like LLaMA and Mistral. Additionally, as open-source LLMs flourish, KD plays a crucial role in both compressing these models, and facilitating their self-improvement by employi…

    Submitted 8 March, 2024; v1 submitted 20 February, 2024; originally announced February 2024.

    Comments: 44 pages

  29. Debiasing Recommendation with Personal Popularity

    Authors: Wentao Ning, Reynold Cheng, Xiao Yan, Ben Kao, Nan Huo, Nur Al Hasan Haldar, Bo Tang

    Abstract: Global popularity (GP) bias is the phenomenon that popular items are recommended much more frequently than they should be, which goes against the goal of providing personalized recommendations and harms user experience and recommendation accuracy. Many methods have been proposed to reduce GP bias but they fail to notice the fundamental problem of GP, i.e., it considers popularity from a \textit{gl…

    Submitted 21 February, 2024; v1 submitted 12 February, 2024; originally announced February 2024.

    Comments: Accepted by WWW'24 as a research full paper

  30. arXiv:2402.06071  [pdf, other]

    cs.HC

    Keyframer: Empowering Animation Design using Large Language Models

    Authors: Tiffany Tseng, Ruijia Cheng, Jeffrey Nichols

    Abstract: Large language models (LLMs) have the potential to impact a wide range of creative domains, but the application of LLMs to animation is underexplored and presents novel challenges such as how users might effectively describe motion in natural language. In this paper, we present Keyframer, a design tool for animating static images (SVGs) with natural language. Informed by interviews with profession…

    Submitted 8 February, 2024; originally announced February 2024.

  31. arXiv:2402.03362  [pdf, other]

    cs.IR cs.AI cs.CL

    NanoNER: Named Entity Recognition for nanobiology using experts' knowledge and distant supervision

    Authors: Martin Lentschat, Cyril Labbé, Ran Cheng

    Abstract: Here we present the training and evaluation of NanoNER, a Named Entity Recognition (NER) model for Nanobiology. NER consists in the identification of specific entities in spans of unstructured texts and is often a primary task in Natural Language Processing (NLP) and Information Extraction. The aim of our model is to recognise entities previously identified by domain experts as constituting the es…

    Submitted 30 January, 2024; originally announced February 2024.

  32. Demonstrating Mobile Manipulation in the Wild: A Metrics-Driven Approach

    Authors: Max Bajracharya, James Borders, Richard Cheng, Dan Helmick, Lukas Kaul, Dan Kruse, John Leichty, Jeremy Ma, Carolyn Matl, Frank Michel, Chavdar Papazov, Josh Petersen, Krishna Shankar, Mark Tjersland

    Abstract: We present our general-purpose mobile manipulation system consisting of a custom robot platform and key algorithms spanning perception and planning. To extensively test the system in the wild and benchmark its performance, we choose a grocery shopping scenario in an actual, unmodified grocery store. We derive key performance metrics from detailed robot log data collected during six week-long field…

    Submitted 2 January, 2024; originally announced January 2024.

    Comments: Presented at RSS 2023 [Best Demo Paper Award]

  33. arXiv:2312.10890  [pdf, other]

    cs.CV cs.GR

    Low-latency Space-time Supersampling for Real-time Rendering

    Authors: Ruian He, Shili Zhou, Yuqi Sun, Ri Cheng, Weimin Tan, Bo Yan

    Abstract: With the rise of real-time rendering and the evolution of display devices, there is a growing demand for post-processing methods that offer high-resolution content in a high frame rate. Existing techniques often suffer from quality and latency issues due to the disjointed treatment of frame supersampling and extrapolation. In this paper, we recognize the shared context and mechanisms between frame…

    Submitted 17 December, 2023; originally announced December 2023.

    Comments: Accepted to AAAI 2024

  34. arXiv:2312.07180  [pdf, other]

    cs.CV

    Context-Aware Iteration Policy Network for Efficient Optical Flow Estimation

    Authors: Ri Cheng, Ruian He, Xuhao Jiang, Shili Zhou, Weimin Tan, Bo Yan

    Abstract: Existing recurrent optical flow estimation networks are computationally expensive since they use a fixed large number of iterations to update the flow field for each sample. An efficient network should skip iterations when the flow improvement is limited. In this paper, we develop a Context-Aware Iteration Policy Network for efficient optical flow estimation, which determines the optimal number of…

    Submitted 5 January, 2024; v1 submitted 12 December, 2023; originally announced December 2023.

    Comments: 2024, Association for the Advancement of Artificial Intelligence

  35. arXiv:2310.17705  [pdf, other]

    cs.NI cs.AI cs.LG eess.IV

    A Wireless AI-Generated Content (AIGC) Provisioning Framework Empowered by Semantic Communication

    Authors: Runze Cheng, Yao Sun, Dusit Niyato, Lan Zhang, Lei Zhang, Muhammad Ali Imran

    Abstract: Generative AI applications have been recently catering to a vast user base by creating diverse and high-quality AI-generated content (AIGC). With the proliferation of mobile devices and rapid growth of mobile traffic, providing ubiquitous access to high-quality AIGC services via wireless communication networks is becoming the future direction. However, it is challenging to provide qualified AIGC s…

    Submitted 29 May, 2024; v1 submitted 26 October, 2023; originally announced October 2023.

  36. arXiv:2310.16951  [pdf, other]

    cs.RO

    The Teenager's Problem: Efficient Garment Decluttering With Grasp Optimization

    Authors: Aviv Adler, Ayah Ahmad, Shengyin Wang, Wisdom C. Agboh, Edith Llontop, Tianshuang Qiu, Jeffrey Ichnowski, Mehmet Dogar, Thomas Kollar, Richard Cheng, Ken Goldberg

    Abstract: This paper addresses the "Teenager's Problem": efficiently removing scattered garments from a planar surface. As grasping and transporting individual garments is highly inefficient, we propose analytical policies to select grasp locations for multiple garments using an overhead camera. Two classes of methods are considered: depth-based, which use overhead depth data to find efficient grasps, and…

    Submitted 25 October, 2023; originally announced October 2023.

  37. arXiv:2310.09690  [pdf, other]

    cs.SE cs.AI cs.OS

    Configuration Validation with Large Language Models

    Authors: Xinyu Lian, Yinfang Chen, Runxiang Cheng, Jie Huang, Parth Thakkar, Minjia Zhang, Tianyin Xu

    Abstract: Misconfigurations are major causes of software failures. Existing practices rely on developer-written rules or test cases to validate configurations, which are expensive. Machine learning (ML) for configuration validation is considered a promising direction, but has been facing challenges such as the need of large-scale field data and system-specific models. Recent advances in Large Language Model…

    Submitted 2 April, 2024; v1 submitted 14 October, 2023; originally announced October 2023.

  38. arXiv:2310.04069  [pdf, other]

    cs.DB

    Spatio-temporal flow patterns

    Authors: Chrysanthi Kosyfaki, Nikos Mamoulis, Reynold Cheng, Ben Kao

    Abstract: Transportation companies and organizations routinely collect huge volumes of passenger transportation data. By aggregating these data (e.g., counting the number of passengers going from a place to another in every 30 minute interval), it becomes possible to analyze the movement behavior of passengers in a metropolitan area. In this paper, we study the problem of finding important trends in passeng…

    Submitted 12 October, 2023; v1 submitted 6 October, 2023; originally announced October 2023.

  39. AXNav: Replaying Accessibility Tests from Natural Language

    Authors: Maryam Taeb, Amanda Swearngin, Eldon Schoop, Ruijia Cheng, Yue Jiang, Jeffrey Nichols

    Abstract: Developers and quality assurance testers often rely on manual testing to test accessibility features throughout the product lifecycle. Unfortunately, manual testing can be tedious, often has an overwhelming scope, and can be difficult to schedule amongst other development milestones. Recently, Large Language Models (LLMs) have been used for a variety of tasks including automation of UIs, however t…

    Submitted 4 March, 2024; v1 submitted 3 October, 2023; originally announced October 2023.

    Comments: Accepted into Conference on Human Factors in Computing Systems (CHI) 2024, 22 pages, 7 figures

    ACM Class: I.2

  40. arXiv:2308.08856  [pdf, other]

    cs.CV

    MV-ROPE: Multi-view Constraints for Robust Category-level Object Pose and Size Estimation

    Authors: Jiaqi Yang, Yucong Chen, Xiangting Meng, Chenxin Yan, Min Li, Ran Cheng, Lige Liu, Tao Sun, Laurent Kneip

    Abstract: Recently there has been a growing interest in category-level object pose and size estimation, and prevailing methods commonly rely on single view RGB-D images. However, one disadvantage of such methods is that they require accurate depth maps which cannot be produced by consumer-grade sensors. Furthermore, many practical real-world situations involve a moving camera that continuously observes its…

    Submitted 22 March, 2024; v1 submitted 17 August, 2023; originally announced August 2023.

  41. arXiv:2308.06515  [pdf, other]

    cs.CV cs.DC

    Seed Feature Maps-based CNN Models for LEO Satellite Remote Sensing Services

    Authors: Zhichao Lu, Chuntao Ding, Shangguang Wang, Ran Cheng, Felix Juefei-Xu, Vishnu Naresh Boddeti

    Abstract: Deploying high-performance convolutional neural network (CNN) models on low-earth orbit (LEO) satellites for rapid remote sensing image processing has attracted significant interest from industry and academia. However, the limited resources available on LEO satellites contrast with the demands of resource-intensive CNN models, necessitating the adoption of ground-station server assistance for trai…

    Submitted 12 August, 2023; originally announced August 2023.

    Comments: 11 pages

  42. arXiv:2308.05640  [pdf, other]

    cs.NE

    A Comparative Visual Analytics Framework for Evaluating Evolutionary Processes in Multi-objective Optimization

    Authors: Yansong Huang, Zherui Zhang, Ao Jiao, Yuxin Ma, Ran Cheng

    Abstract: Evolutionary multi-objective optimization (EMO) algorithms have been demonstrated to be effective in solving multi-criteria decision-making problems. In real-world applications, analysts often employ several algorithms concurrently and compare their solution sets to gain insight into the characteristics of different algorithms and explore a broader range of feasible solutions. However, EMO algorit…

    Submitted 10 August, 2023; originally announced August 2023.

    Comments: Accepted by IEEE VIS 2023 (will appear in IEEE TVCG)

  43. Multi-domain Recommendation with Embedding Disentangling and Domain Alignment

    Authors: Wentao Ning, Xiao Yan, Weiwen Liu, Reynold Cheng, Rui Zhang, Bo Tang

    Abstract: Multi-domain recommendation (MDR) aims to provide recommendations for different domains (e.g., types of products) with overlapping users/items and is common for platforms such as Amazon, Facebook, and LinkedIn that host multiple services. Existing MDR models face two challenges: First, it is difficult to disentangle knowledge that generalizes across domains (e.g., a user likes cheap items) and kno…

    Submitted 13 August, 2023; v1 submitted 10 August, 2023; originally announced August 2023.

    Comments: Accepted by CIKM'23 as a Long paper

  44. arXiv:2308.02066  [pdf, other]

    cs.CV cs.AI cs.LG

    Mitigating Task Interference in Multi-Task Learning via Explicit Task Routing with Non-Learnable Primitives

    Authors: Chuntao Ding, Zhichao Lu, Shangguang Wang, Ran Cheng, Vishnu Naresh Boddeti

    Abstract: Multi-task learning (MTL) seeks to learn a single model to accomplish multiple tasks by leveraging shared information among the tasks. Existing MTL models, however, have been known to suffer from negative interference among tasks. Efforts to mitigate task interference have focused on either loss/gradient balancing or implicit parameter partitioning with partial overlaps among the tasks. In this pa…

    Submitted 3 August, 2023; originally announced August 2023.

    Comments: CVPR 2023

  45. Uncertainty-Guided Spatial Pruning Architecture for Efficient Frame Interpolation

    Authors: Ri Cheng, Xuhao Jiang, Ruian He, Shili Zhou, Weimin Tan, Bo Yan

    Abstract: The video frame interpolation (VFI) model applies the convolution operation to all locations, leading to redundant computations in regions with easy motion. We can use dynamic spatial pruning method to skip redundant computation, but this method cannot properly identify easy regions in VFI tasks without supervision. In this paper, we develop an Uncertainty-Guided Spatial Pruning (UGSP) architectur…

    Submitted 27 October, 2023; v1 submitted 31 July, 2023; originally announced July 2023.

    Comments: ACM Multimedia 2023

  46. Automotive Object Detection via Learning Sparse Events by Spiking Neurons

    Authors: Hu Zhang, Yanchen Li, Luziwei Leng, Kaiwei Che, Qian Liu, Qinghai Guo, Jianxing Liao, Ran Cheng

    Abstract: Event-based sensors, distinguished by their high temporal resolution of 1 $\mu\text{s}$ and a dynamic range of 120 $\text{dB}$, stand out as ideal tools for deployment in fast-paced settings like vehicles and drones. Traditional object detection techniques that utilize Artificial Neural Networks (ANNs) face challenges due to the sparse and asynchronous nature of the events these sensors captu…

    Submitted 10 June, 2024; v1 submitted 24 July, 2023; originally announced July 2023.

    Comments: IEEE Transactions on Cognitive and Developmental Systems

  47. arXiv:2307.12326  [pdf, other]

    cs.RO

    Scale jump-aware pose graph relaxation for monocular SLAM with re-initializations

    Authors: Runze Yuan, Ran Cheng, Lige Liu, Tao Sun, Laurent Kneip

    Abstract: Pose graph relaxation has become an indispensable addition to SLAM enabling efficient global registration of sensor reference frames under the objective of satisfying pair-wise relative transformation constraints. The latter may be given by incremental motion estimation or global place recognition. While the latter case enables loop closures and drift compensation, care has to be taken in the mono…

    Submitted 23 July, 2023; originally announced July 2023.

    Comments: 8 pages, 23 figures, International Conference on Intelligent Robots and Systems 2023

  48. arXiv:2307.05717  [pdf, other]

    cs.OH

    Towards Mobility Data Science (Vision Paper)

    Authors: Mohamed Mokbel, Mahmoud Sakr, Li Xiong, Andreas Züfle, Jussara Almeida, Taylor Anderson, Walid Aref, Gennady Andrienko, Natalia Andrienko, Yang Cao, Sanjay Chawla, Reynold Cheng, Panos Chrysanthis, Xiqi Fei, Gabriel Ghinita, Anita Graser, Dimitrios Gunopulos, Christian Jensen, Joon-Seok Kim, Kyoung-Sook Kim, Peer Kröger, John Krumm, Johannes Lauer, Amr Magdy, Mario Nascimento, et al. (23 additional authors not shown)

    Abstract: Mobility data captures the locations of moving objects such as humans, animals, and cars. With the availability of GPS-equipped mobile devices and other inexpensive location-tracking technologies, mobility data is collected ubiquitously. In recent years, the use of mobility data has demonstrated significant impact in various domains including traffic management, urban planning, and health sciences…

    Submitted 7 March, 2024; v1 submitted 21 June, 2023; originally announced July 2023.

    Comments: Updated to reflect the major revision for ACM Transactions on Spatial Algorithms and Systems (TSAS). This version reflects the final version accepted by ACM TSAS

  49. Efficient Deep Spiking Multi-Layer Perceptrons with Multiplication-Free Inference

    Authors: Boyan Li, Luziwei Leng, Shuaijie Shen, Kaixuan Zhang, Jianguo Zhang, Jianxing Liao, Ran Cheng

    Abstract: Advancements in adapting deep convolution architectures for Spiking Neural Networks (SNNs) have significantly enhanced image classification performance and reduced computational burdens. However, the inability of Multiplication-Free Inference (MFI) to align with attention and transformer mechanisms, which are critical to superior performance on high-resolution vision tasks, imposing limitations on…

    Submitted 26 April, 2024; v1 submitted 21 June, 2023; originally announced June 2023.

    Comments: IEEE TNNLS

  50. arXiv:2306.12098  [pdf, other]

    eess.SP cs.LG

    MSW-Transformer: Multi-Scale Shifted Windows Transformer Networks for 12-Lead ECG Classification

    Authors: Renjie Cheng, Zhemin Zhuang, Shuxin Zhuang, Lei Xie, Jingfeng Guo

    Abstract: Automatic classification of electrocardiogram (ECG) signals plays a crucial role in the early prevention and diagnosis of cardiovascular diseases. While ECG signals can be used for the diagnosis of various diseases, their pathological characteristics exhibit minimal variations, posing a challenge to automatic classification models. Existing methods primarily utilize convolutional neural networks t…

    Submitted 21 June, 2023; originally announced June 2023.