Showing 1–50 of 403 results for author: Yue, Y

  1. arXiv:2407.11840  [pdf, other]

    cs.CV

    MVG-Splatting: Multi-View Guided Gaussian Splatting with Adaptive Quantile-Based Geometric Consistency Densification

    Authors: Zhuoxiao Li, Shanliang Yao, Yijie Chu, Angel F. Garcia-Fernandez, Yong Yue, Eng Gee Lim, Xiaohui Zhu

    Abstract: In the rapidly evolving field of 3D reconstruction, 3D Gaussian Splatting (3DGS) and 2D Gaussian Splatting (2DGS) represent significant advancements. Although 2DGS compresses 3D Gaussian primitives into 2D Gaussian surfels to effectively enhance mesh extraction quality, this compression can potentially lead to a decrease in rendering quality. Additionally, unreliable densification processes and th…

    Submitted 16 July, 2024; originally announced July 2024.

    Comments: https://mvgsplatting.github.io

  2. arXiv:2407.10540  [pdf, other]

    astro-ph.HE

    Sudden polarization angle jumps of the repeating fast radio burst FRB 20201124A

    Authors: J. R. Niu, W. Y. Wang, J. C. Jiang, Y. Qu, D. J. Zhou, W. W. Zhu, K. J. Lee, J. L. Han, B. Zhang, D. Li, S. Cao, Z. Y. Fang, Y. Feng, Q. Y. Fu, P. Jiang, W. C. Jing, J. Li, Y. Li, R. Luo, L. Q. Meng, C. C. Miao, X. L. Miao, C. H. Niu, Y. C. Pan, B. J. Wang, et al. (19 additional authors not shown)

    Abstract: We report the first detection of polarization angle (PA) orthogonal jumps, a phenomenon previously only observed from radio pulsars, from a fast radio burst (FRB) source FRB 20201124A. We find three cases of orthogonal jumps in over two thousand bursts, all resembling those observed in pulsar single pulses. We propose that the jumps are due to the superposition of two orthogonal emission modes tha…

    Submitted 15 July, 2024; originally announced July 2024.

    Comments: 10 pages, 5 figures, submitted to APJL

  3. arXiv:2407.08770  [pdf, other]

    cs.AI

    Model Surgery: Modulating LLM's Behavior Via Simple Parameter Editing

    Authors: Huanqian Wang, Yang Yue, Rui Lu, Jingxin Shi, Andrew Zhao, Shenzhi Wang, Shiji Song, Gao Huang

    Abstract: Large Language Models (LLMs) have demonstrated great potential as generalist assistants, showcasing powerful task understanding and problem-solving capabilities. To deploy LLMs as AI assistants, it is crucial that these models exhibit desirable behavioral traits, such as non-toxicity and resilience against jailbreak attempts. Current methods for detoxification or preventing jailbreaking usually in…

    Submitted 11 July, 2024; originally announced July 2024.

    Comments: 23 pages, 14 figures

    MSC Class: 68T50 (Primary) 68T07; 62M45 (Secondary) ACM Class: I.2.7

  4. arXiv:2407.03993  [pdf, other]

    cs.CL

    A Survey on Natural Language Counterfactual Generation

    Authors: Yongjie Wang, Xiaoqi Qiu, Yu Yue, Xu Guo, Zhiwei Zeng, Yuhong Feng, Zhiqi Shen

    Abstract: Natural Language Counterfactual generation aims to minimally modify a given text such that the modified text will be classified into a different class. The generated counterfactuals provide insight into the reasoning behind a model's predictions by highlighting which words significantly influence the outcomes. Additionally, they can be used to detect model fairness issues or augment the training d…

    Submitted 4 July, 2024; originally announced July 2024.

    Comments: A survey paper

    MSC Class: 68T50 ACM Class: I.2.7

  5. arXiv:2407.02052  [pdf, other]

    eess.AS cs.SD

    The USTC-NERCSLIP Systems for The ICMC-ASR Challenge

    Authors: Minghui Wu, Luzhen Xu, Jie Zhang, Haitao Tang, Yanyan Yue, Ruizhi Liao, Jintao Zhao, Zhengzhe Zhang, Yichi Wang, Haoyin Yan, Hongliang Yu, Tongle Ma, Jiachen Liu, Chongliang Wu, Yongchao Li, Yanyong Zhang, Xin Fang, Yue Zhang

    Abstract: This report describes the submitted system to the In-Car Multi-Channel Automatic Speech Recognition (ICMC-ASR) challenge, which considers the ASR task with multi-speaker overlapping and Mandarin accent dynamics in the ICMC case. We implement the front-end speaker diarization using the self-supervised learning representation based multi-speaker embedding and beamforming using the speaker position,…

    Submitted 2 July, 2024; originally announced July 2024.

    Comments: Accepted at ICASSP 2024

  6. arXiv:2407.01887  [pdf, other]

    cs.LG cs.AI cs.CL

    Beyond Numeric Awards: In-Context Dueling Bandits with LLM Agents

    Authors: Fanzeng Xia, Hao Liu, Yisong Yue, Tongxin Li

    Abstract: In-context decision-making is an important capability of artificial general intelligence, which Large Language Models (LLMs) have effectively demonstrated in various scenarios. However, LLMs often face challenges when dealing with numerical contexts, and limited attention has been paid to evaluating their performance through preference feedback generated by the environment. This paper investigates…

    Submitted 1 July, 2024; originally announced July 2024.

  7. arXiv:2406.17530  [pdf, other]

    cs.CV cs.RO

    Point Tree Transformer for Point Cloud Registration

    Authors: Meiling Wang, Guangyan Chen, Yi Yang, Li Yuan, Yufeng Yue

    Abstract: Point cloud registration is a fundamental task in the fields of computer vision and robotics. Recent developments in transformer-based methods have demonstrated enhanced performance in this domain. However, the standard attention mechanism utilized in these methods often integrates many low-relevance points, thereby struggling to prioritize its attention weights on sparse yet meaningful points. Th…

    Submitted 25 June, 2024; originally announced June 2024.

  8. arXiv:2406.15788  [pdf, other]

    cs.LG

    Distributionally Robust Constrained Reinforcement Learning under Strong Duality

    Authors: Zhengfei Zhang, Kishan Panaganti, Laixi Shi, Yanan Sui, Adam Wierman, Yisong Yue

    Abstract: We study the problem of Distributionally Robust Constrained RL (DRC-RL), where the goal is to maximize the expected reward subject to environmental distribution shifts and constraints. This setting captures situations where training and testing environments differ, and policies must satisfy constraints motivated by safety or limited budgets. Despite significant progress toward algorithm design for…

    Submitted 22 June, 2024; originally announced June 2024.

    Comments: Accepted at the Reinforcement Learning Conference (RLC) 2024; 28 pages, 4 figures

  9. arXiv:2406.15669  [pdf, other]

    q-bio.BM

    CARE: a Benchmark Suite for the Classification and Retrieval of Enzymes

    Authors: Jason Yang, Ariane Mora, Shengchao Liu, Bruce J. Wittmann, Anima Anandkumar, Frances H. Arnold, Yisong Yue

    Abstract: Enzymes are important proteins that catalyze chemical reactions. In recent years, machine learning methods have emerged to predict enzyme function from sequence; however, there are no standardized benchmarks to evaluate these methods. We introduce CARE, a benchmark and dataset suite for the Classification And Retrieval of Enzymes (CARE). CARE centers on two tasks: (1) classification of a protein s…

    Submitted 21 June, 2024; originally announced June 2024.

  10. arXiv:2406.14699  [pdf, other]

    cs.LG math.OC stat.ML

    Preferential Multi-Objective Bayesian Optimization

    Authors: Raul Astudillo, Kejun Li, Maegan Tucker, Chu Xin Cheng, Aaron D. Ames, Yisong Yue

    Abstract: Preferential Bayesian optimization (PBO) is a framework for optimizing a decision-maker's latent preferences over available design choices. While preferences often involve multiple conflicting objectives, existing work in PBO assumes that preferences can be encoded by a single objective function. For example, in robotic assistive devices, technicians often attempt to maximize user comfort while si…

    Submitted 20 June, 2024; originally announced June 2024.

  11. arXiv:2406.13103  [pdf, other]

    cs.AI cs.LG

    A Generic Method for Fine-grained Category Discovery in Natural Language Texts

    Authors: Chang Tian, Matthew B. Blaschko, Wenpeng Yin, Mingzhe Xing, Yinliang Yue, Marie-Francine Moens

    Abstract: Fine-grained category discovery using only coarse-grained supervision is a cost-effective yet challenging task. Previous training methods focus on aligning query samples with positive samples and distancing them from negatives. They often neglect intra-category and inter-category semantic similarities of fine-grained categories when navigating sample distributions in the embedding space. Furthermo…

    Submitted 18 June, 2024; originally announced June 2024.

    Comments: preprint

  12. arXiv:2406.11193  [pdf, other]

    cs.CL

    MMNeuron: Discovering Neuron-Level Domain-Specific Interpretation in Multimodal Large Language Model

    Authors: Jiahao Huo, Yibo Yan, Boren Hu, Yutao Yue, Xuming Hu

    Abstract: Projecting visual features into word embedding space has become a significant fusion strategy adopted by Multimodal Large Language Models (MLLMs). However, its internal mechanisms have yet to be explored. Inspired by multilingual research, we identify domain-specific neurons in multimodal large language models. Specifically, we investigate the distribution of domain-specific neurons and the mechan…

    Submitted 16 June, 2024; originally announced June 2024.

  13. arXiv:2406.08009  [pdf, other]

    cs.CV cs.AI cs.RO

    OpenObj: Open-Vocabulary Object-Level Neural Radiance Fields with Fine-Grained Understanding

    Authors: Yinan Deng, Jiahui Wang, Jingyu Zhao, Jianyu Dou, Yi Yang, Yufeng Yue

    Abstract: In recent years, there has been a surge of interest in open-vocabulary 3D scene reconstruction facilitated by visual language models (VLMs), which showcase remarkable capabilities in open-set retrieval. However, existing methods face some limitations: they either focus on learning point-wise features, resulting in blurry semantic understanding, or solely tackle object-level reconstruction, thereby…

    Submitted 12 June, 2024; originally announced June 2024.

    Comments: 8 pages, 7 figures. Project URL: https://openobj.github.io/

  14. arXiv:2406.03816  [pdf, other]

    cs.CL

    ReST-MCTS*: LLM Self-Training via Process Reward Guided Tree Search

    Authors: Dan Zhang, Sining Zhoubian, Yisong Yue, Yuxiao Dong, Jie Tang

    Abstract: Recent methodologies in LLM self-training mostly rely on LLM generating responses and filtering those with correct output answers as training data. This approach often yields a low-quality fine-tuning training set (e.g., incorrect plans or intermediate reasoning). In this paper, we develop a reinforced self-training approach, called ReST-MCTS*, based on integrating process reward guidance with tre…

    Submitted 6 June, 2024; originally announced June 2024.

    Comments: 29 pages

  15. arXiv:2406.03044  [pdf, other]

    cs.LG q-bio.NC

    Population Transformer: Learning Population-level Representations of Intracranial Activity

    Authors: Geeling Chau, Christopher Wang, Sabera Talukder, Vighnesh Subramaniam, Saraswati Soedarmadji, Yisong Yue, Boris Katz, Andrei Barbu

    Abstract: We present a self-supervised framework that learns population-level codes for intracranial neural recordings at scale, unlocking the benefits of representation learning for a key neuroscience recording modality. The Population Transformer (PopT) lowers the amount of data required for decoding experiments, while increasing accuracy, even on never-before-seen subjects and tasks. We address two key c…

    Submitted 5 June, 2024; originally announced June 2024.

    Comments: 17 pages, 10 figures, submitted to NeurIPS 2024

  16. arXiv:2406.02721  [pdf, other]

    cs.CL cs.AI

    Self-Control of LLM Behaviors by Compressing Suffix Gradient into Prefix Controller

    Authors: Min Cai, Yuchen Zhang, Shichang Zhang, Fan Yin, Difan Zou, Yisong Yue, Ziniu Hu

    Abstract: We propose Self-Control, a novel method utilizing suffix gradients to control the behavior of large language models (LLMs) without explicit human annotations. Given a guideline expressed in suffix string and the model's self-assessment of adherence, Self-Control computes the gradient of this self-judgment concerning the model's hidden states, directly influencing the auto-regressive generation pro…

    Submitted 18 June, 2024; v1 submitted 4 June, 2024; originally announced June 2024.

    Comments: 41 pages, 12 figures, 41 tables; Website: https://llm-self-control.github.io/

  17. arXiv:2405.20561  [pdf, other]

    cs.CR cs.SE

    All Your Tokens are Belong to Us: Demystifying Address Verification Vulnerabilities in Solidity Smart Contracts

    Authors: Tianle Sun, Ningyu He, Jiang Xiao, Yinliang Yue, Xiapu Luo, Haoyu Wang

    Abstract: In Ethereum, the practice of verifying the validity of the passed addresses is a common practice, which is a crucial step to ensure the secure execution of smart contracts. Vulnerabilities in the process of address verification can lead to great security issues, and anecdotal evidence has been reported by our community. However, this type of vulnerability has not been well studied. To fill the voi…

    Submitted 30 May, 2024; originally announced May 2024.

    Comments: Accepted by USENIX Security 2024

  18. arXiv:2405.19730  [pdf]

    cs.AI cs.CV cs.LG

    Research on Foundation Model for Spatial Data Intelligence: China's 2024 White Paper on Strategic Development of Spatial Data Intelligence

    Authors: Shaohua Wang, Xing Xie, Yong Li, Danhuai Guo, Zhi Cai, Yu Liu, Yang Yue, Xiao Pan, Feng Lu, Huayi Wu, Zhipeng Gui, Zhiming Ding, Bolong Zheng, Fuzheng Zhang, Tao Qin, Jingyuan Wang, Chuang Tao, Zhengchao Chen, Hao Lu, Jiayi Li, Hongyang Chen, Peng Yue, Wenhao Yu, Yao Yao, Leilei Sun, et al. (9 additional authors not shown)

    Abstract: This report focuses on spatial data intelligent large models, delving into the principles, methods, and cutting-edge applications of these models. It provides an in-depth discussion on the definition, development history, current status, and trends of spatial data intelligent large models, as well as the challenges they face. The report systematically elucidates the key technologies of spatial dat…

    Submitted 29 June, 2024; v1 submitted 30 May, 2024; originally announced May 2024.

    Comments: in Chinese language

  19. arXiv:2405.19647  [pdf, other]

    cs.LG

    FTS: A Framework to Find a Faithful TimeSieve

    Authors: Songning Lai, Ninghui Feng, Haochen Sui, Ze Ma, Hao Wang, Zichen Song, Hang Zhao, Yutao Yue

    Abstract: The field of time series forecasting has garnered significant attention in recent years, prompting the development of advanced models like TimeSieve, which demonstrates impressive performance. However, an analysis reveals certain unfaithfulness issues, including high sensitivity to random seeds and minute input noise perturbations. Recognizing these challenges, we embark on a quest to define the c…

    Submitted 29 May, 2024; originally announced May 2024.

    Journal ref: IJCAI2024 workshop

  20. arXiv:2405.18782  [pdf, other]

    eess.IV cs.CV stat.ML

    Principled Probabilistic Imaging using Diffusion Models as Plug-and-Play Priors

    Authors: Zihui Wu, Yu Sun, Yifan Chen, Bingliang Zhang, Yisong Yue, Katherine L. Bouman

    Abstract: Diffusion models (DMs) have recently shown outstanding capability in modeling complex image distributions, making them expressive image priors for solving Bayesian inverse problems. However, most existing DM-based methods rely on approximations in the generative process to be generic to different inverse problems, leading to inaccurate sample distributions that deviate from the target posterior de…

    Submitted 29 May, 2024; originally announced May 2024.

  21. arXiv:2405.16464  [pdf, other]

    cs.RO cs.CV

    Multi-Modal UAV Detection, Classification and Tracking Algorithm -- Technical Report for CVPR 2024 UG2 Challenge

    Authors: Tianchen Deng, Yi Zhou, Wenhua Wu, Mingrui Li, Jingwei Huang, Shuhong Liu, Yanzeng Song, Hao Zuo, Yanbo Wang, Yutao Yue, Hesheng Wang, Weidong Chen

    Abstract: This technical report presents the 1st winning model for UG2+, a task in CVPR 2024 UAV Tracking and Pose-Estimation Challenge. This challenge faces difficulties in drone detection, UAV-type classification and 2D/3D trajectory estimation in extreme weather conditions with multi-modal sensor information, including stereo vision, various Lidars, Radars, and audio arrays. Leveraging this information…

    Submitted 26 May, 2024; originally announced May 2024.

    Comments: Accepted by CVPR 2024 workshop. The 1st winning model in CVPR 2024 UG2+ challenge. The code and configuration of our method are available at https://github.com/dtc111111/Multi-Modal-UAV

  22. arXiv:2405.14260  [pdf, other]

    cs.LG cs.AI

    Graph Sparsification via Mixture of Graphs

    Authors: Guibin Zhang, Xiangguo Sun, Yanwei Yue, Kun Wang, Tianlong Chen, Shirui Pan

    Abstract: Graph Neural Networks (GNNs) have demonstrated superior performance across various graph learning tasks but face significant computational challenges when applied to large-scale graphs. One effective approach to mitigate these challenges is graph sparsification, which involves removing non-essential edges to reduce computational overhead. However, previous graph sparsification methods often rely o…

    Submitted 23 May, 2024; originally announced May 2024.

  23. arXiv:2405.13448  [pdf, other]

    cs.CL

    Distilling Instruction-following Abilities of Large Language Models with Task-aware Curriculum Planning

    Authors: Yuanhao Yue, Chengyu Wang, Jun Huang, Peng Wang

    Abstract: The process of instruction tuning aligns pre-trained large language models (LLMs) with open-domain instructions and human-preferred responses. While several studies have explored autonomous approaches to distilling and annotating instructions from more powerful proprietary LLMs, such as ChatGPT, they often neglect the impact of task distributions and the varying difficulty of instructions of the t…

    Submitted 22 May, 2024; originally announced May 2024.

  24. arXiv:2405.12821  [pdf, other]

    cs.RO cs.CV

    Talk2Radar: Bridging Natural Language with 4D mmWave Radar for 3D Referring Expression Comprehension

    Authors: Runwei Guan, Ruixiao Zhang, Ningwei Ouyang, Jianan Liu, Ka Lok Man, Xiaohao Cai, Ming Xu, Jeremy Smith, Eng Gee Lim, Yutao Yue, Hui Xiong

    Abstract: Embodied perception is essential for intelligent vehicles and robots, enabling more natural interaction and task execution. However, these advancements currently embrace vision level, rarely focusing on using 3D modeling sensors, which limits the full understanding of surrounding objects with multi-granular characteristics. Recently, as a promising automotive sensor with affordable cost, 4D Millim…

    Submitted 21 May, 2024; originally announced May 2024.

    Comments: 8 pages, 5 figures

  25. arXiv:2405.11446  [pdf, other]

    cs.CL cs.LG

    MAML-en-LLM: Model Agnostic Meta-Training of LLMs for Improved In-Context Learning

    Authors: Sanchit Sinha, Yuguang Yue, Victor Soto, Mayank Kulkarni, Jianhua Lu, Aidong Zhang

    Abstract: Adapting large language models (LLMs) to unseen tasks with in-context training samples without fine-tuning remains an important research problem. To learn a robust LLM that adapts well to unseen tasks, multiple meta-training approaches have been proposed such as MetaICL and MetaICT, which involve meta-training pre-trained LLMs on a wide variety of diverse tasks. These meta-training approaches esse…

    Submitted 19 May, 2024; originally announced May 2024.

    Comments: KDD 2024, 11 pages(9 main, 2 ref, 1 App) Openreview https://openreview.net/forum?id=JwecLNhWDy&referrer=%5BAuthor%20Console%5D(%2Fgroup%3Fid%3DKDD.org%2F2024%2FResearch_Track%2FAuthors%23your-submissions)

  26. arXiv:2405.08768  [pdf, other]

    cs.CV cs.AI cs.LG

    EfficientTrain++: Generalized Curriculum Learning for Efficient Visual Backbone Training

    Authors: Yulin Wang, Yang Yue, Rui Lu, Yizeng Han, Shiji Song, Gao Huang

    Abstract: The superior performance of modern visual backbones usually comes with a costly training procedure. We contribute to this issue by generalizing the idea of curriculum learning beyond its original formulation, i.e., training models using easier-to-harder data. Specifically, we reformulate the training curriculum as a soft-selection function, which uncovers progressively more difficult patterns with…

    Submitted 14 May, 2024; originally announced May 2024.

    Comments: Accepted by IEEE Transactions on Pattern Analysis and Machine Intelligence (TPAMI). Journal version of arXiv:2211.09703 (ICCV 2023). Code is available at: https://github.com/LeapLabTHU/EfficientTrain

  27. arXiv:2405.05554  [pdf, other]

    hep-ex

    RELICS: a REactor neutrino LIquid xenon Coherent elastic Scattering experiment

    Authors: Chang Cai, Guocai Chen, Jiangyu Chen, Rundong Fang, Fei Gao, Xiaoran Guo, Jiheng Guo, Tingyi He, Chengjie Jia, Gaojun Jin, Yipin Jing, Gaojun Ju, Yang Lei, Jiayi Li, Kaihang Li, Meng Li, Minhua Li, Shengchao Li, Siyin Li, Tao Li, Qing Lin, Jiajun Liu, Minghao Liu, Sheng Lv, Guang Luo, et al. (24 additional authors not shown)

    Abstract: Coherent elastic neutrino-nucleus scattering (CEvNS) provides a unique probe for neutrino properties Beyond the Standard Model (BSM) physics. REactor neutrino LIquid xenon Coherent Scattering experiment (RELICS), a proposed reactor neutrino program using liquid xenon time projection chamber (LXeTPC) technology, aims to investigate the CEvNS process of antineutrinos off xenon atomic nuclei. In this…

    Submitted 12 June, 2024; v1 submitted 9 May, 2024; originally announced May 2024.

  28. arXiv:2405.04332  [pdf, other]

    cs.CR

    WALLETRADAR: Towards Automating the Detection of Vulnerabilities in Browser-based Cryptocurrency Wallets

    Authors: Pengcheng Xia, Yanhui Guo, Zhaowen Lin, Jun Wu, Pengbo Duan, Ningyu He, Kailong Wang, Tianming Liu, Yinliang Yue, Guoai Xu, Haoyu Wang

    Abstract: Cryptocurrency wallets, acting as fundamental infrastructure to the blockchain ecosystem, have seen significant user growth, particularly among browser-based wallets (i.e., browser extensions). However, this expansion accompanies security challenges, making these wallets prime targets for malicious activities. Despite a substantial user base, there is not only a significant gap in comprehensive se…

    Submitted 7 May, 2024; originally announced May 2024.

    Comments: Just accepted by the Automated Software Engineering Journal

  29. arXiv:2404.18279  [pdf, other]

    cs.CV

    Out-of-distribution Detection in Medical Image Analysis: A survey

    Authors: Zesheng Hong, Yubiao Yue, Yubin Chen, Lele Cong, Huanjie Lin, Yuanmei Luo, Mini Han Wang, Weidong Wang, Jialong Xu, Xiaoqi Yang, Hechang Chen, Zhenzhang Li, Sihong Xie

    Abstract: Computer-aided diagnostics has benefited from the development of deep learning-based computer vision techniques in these years. Traditional supervised deep learning methods assume that the test sample is drawn from the identical distribution as the training data. However, it is possible to encounter out-of-distribution samples in real-world clinical scenarios, which may cause silent failure in dee…

    Submitted 3 July, 2024; v1 submitted 28 April, 2024; originally announced April 2024.

    Comments: 23 pages, 3 figures

  30. Understanding Hyperbolic Metric Learning through Hard Negative Sampling

    Authors: Yun Yue, Fangzhou Lin, Guanyi Mou, Ziming Zhang

    Abstract: In recent years, there has been a growing trend of incorporating hyperbolic geometry methods into computer vision. While these methods have achieved state-of-the-art performance on various metric learning tasks using hyperbolic distance measurements, the underlying theoretical analysis supporting this superior performance remains under-exploited. In this study, we investigate the effects of integr…

    Submitted 2 May, 2024; v1 submitted 23 April, 2024; originally announced April 2024.

    Comments: Published in Proceedings of the IEEE/CVF Winter Conference on Applications of Computer Vision, 2024

  31. arXiv:2404.13009  [pdf, other]

    math.OC

    Online Policy Optimization in Unknown Nonlinear Systems

    Authors: Yiheng Lin, James A. Preiss, Fengze Xie, Emile Anand, Soon-Jo Chung, Yisong Yue, Adam Wierman

    Abstract: We study online policy optimization in nonlinear time-varying dynamical systems where the true dynamical models are unknown to the controller. This problem is challenging because, unlike in linear systems, the controller cannot obtain globally accurate estimations of the ground-truth dynamics using local exploration. We propose a meta-framework that combines a general online policy optimization al…

    Submitted 19 April, 2024; originally announced April 2024.

  32. arXiv:2404.10342  [pdf, other]

    cs.CV cs.MM

    Referring Flexible Image Restoration

    Authors: Runwei Guan, Rongsheng Hu, Zhuhao Zhou, Tianlang Xue, Ka Lok Man, Jeremy Smith, Eng Gee Lim, Weiping Ding, Yutao Yue

    Abstract: In reality, images often exhibit multiple degradations, such as rain and fog at night (triple degradations). However, in many cases, individuals may not want to remove all degradations, for instance, a blurry lens revealing a beautiful snowy landscape (double degradations). In such scenarios, people may only desire to deblur. These situations and requirements shed light on a new challenge in image…

    Submitted 16 April, 2024; originally announced April 2024.

    Comments: 15 pages, 19 figures

  33. arXiv:2404.07790  [pdf, other]

    cs.CV

    VIFNet: An End-to-end Visible-Infrared Fusion Network for Image Dehazing

    Authors: Meng Yu, Te Cui, Haoyang Lu, Yufeng Yue

    Abstract: Image dehazing poses significant challenges in environmental perception. Recent research mainly focus on deep learning-based methods with single modality, while they may result in severe information loss especially in dense-haze scenarios. The infrared image exhibits robustness to the haze, however, existing methods have primarily treated the infrared modality as auxiliary information, failing to…

    Submitted 11 April, 2024; originally announced April 2024.

  34. arXiv:2404.07770  [pdf, other]

    cs.CV

    Joint Conditional Diffusion Model for Image Restoration with Mixed Degradations

    Authors: Yufeng Yue, Meng Yu, Luojie Yang, Yi Yang

    Abstract: Image restoration is rather challenging in adverse weather conditions, especially when multiple degradations occur simultaneously. Blind image decomposition was proposed to tackle this issue, however, its effectiveness heavily relies on the accurate estimation of each component. Although diffusion-based models exhibit strong generative abilities in image restoration tasks, they may generate irrele…

    Submitted 11 April, 2024; originally announced April 2024.

  35. arXiv:2404.05187  [pdf, other]

    cs.CV cs.GR cs.RO

    LGSDF: Continual Global Learning of Signed Distance Fields Aided by Local Updating

    Authors: Yufeng Yue, Yinan Deng, Jiahui Wang, Yi Yang

    Abstract: Implicit reconstruction of ESDF (Euclidean Signed Distance Field) involves training a neural network to regress the signed distance from any point to the nearest obstacle, which has the advantages of lightweight storage and continuous querying. However, existing algorithms usually rely on conflicting raw observations as training data, resulting in poor map performance. In this paper, we propose LG…

    Submitted 8 April, 2024; originally announced April 2024.

  36. arXiv:2403.19004  [pdf, other]

    math.NA

    Discrete Poincaré inequality and Discrete Trace inequality in Piece-wise Polynomial Hybridizable Spaces

    Authors: Yukun Yue

    Abstract: In this paper, we establish discrete versions of the Poincaré and trace inequalities for hybridizable finite element spaces. These spaces are made of piecewise polynomial functions defined both within the interiors of elements and across all faces in a mesh's skeleton, serving as the basis for both the hybridizable discontinuous Galerkin (HDG) and hybrid high-order (HHO) methods. Additionally, we…

    Submitted 27 March, 2024; originally announced March 2024.

  37. arXiv:2403.15658  [pdf, other]

    cs.RO

    Data-Driven Predictive Control for Robust Exoskeleton Locomotion

    Authors: Kejun Li, Jeeseop Kim, Xiaobin Xiong, Kaveh Akbari Hamed, Yisong Yue, Aaron D. Ames

    Abstract: Exoskeleton locomotion must be robust while being adaptive to different users with and without payloads. To address these challenges, this work introduces a data-driven predictive control (DDPC) framework to synthesize walking gaits for lower-body exoskeletons, employing Hankel matrices and a state transition matrix for its data-driven model. The proposed approach leverages DDPC through a multi-la…

    Submitted 22 March, 2024; originally announced March 2024.

  38. arXiv:2403.12686  [pdf, other]

    cs.CV cs.MM cs.RO

    WaterVG: Waterway Visual Grounding based on Text-Guided Vision and mmWave Radar

    Authors: Runwei Guan, Liye Jia, Fengyufan Yang, Shanliang Yao, Erick Purwanto, Xiaohui Zhu, Eng Gee Lim, Jeremy Smith, Ka Lok Man, Xuming Hu, Yutao Yue

    Abstract: The perception of waterways based on human intent is significant for autonomous navigation and operations of Unmanned Surface Vehicles (USVs) in water environments. Inspired by visual grounding, we introduce WaterVG, the first visual grounding dataset designed for USV-based waterway perception based on human prompts. WaterVG encompasses prompts describing multiple targets, with annotations at the…

    Submitted 4 April, 2024; v1 submitted 19 March, 2024; originally announced March 2024.

    Comments: 10 pages, 10 figures

  39. arXiv:2403.09412  [pdf, other]

    cs.CV cs.AI cs.RO

    OpenGraph: Open-Vocabulary Hierarchical 3D Graph Representation in Large-Scale Outdoor Environments

    Authors: Yinan Deng, Jiahui Wang, Jingyu Zhao, Xinyu Tian, Guangyan Chen, Yi Yang, Yufeng Yue

    Abstract: Environment representations endowed with sophisticated semantics are pivotal for facilitating seamless interaction between robots and humans, enabling them to effectively carry out various tasks. Open-vocabulary maps, powered by Visual-Language models (VLMs), possess inherent advantages, including zero-shot learning and support for open-set classes. However, existing open-vocabulary maps are prima…

    Submitted 28 March, 2024; v1 submitted 14 March, 2024; originally announced March 2024.

  40. arXiv:2403.03849  [pdf, other]

    eess.IV cs.CV cs.LG

    MedMamba: Vision Mamba for Medical Image Classification

    Authors: Yubiao Yue, Zhenzhang Li

    Abstract: Since the era of deep learning, convolutional neural networks (CNNs) and vision transformers (ViTs) have been extensively studied and widely used in medical image classification tasks. Unfortunately, CNN's limitations in modeling long-range dependencies result in poor classification performances. In contrast, ViTs are hampered by the quadratic computational complexity of their self-attention mecha…

    Submitted 10 June, 2024; v1 submitted 6 March, 2024; originally announced March 2024.

  41. arXiv:2403.01248  [pdf, other]

    cs.CV cs.AI cs.CL cs.LG

    SceneCraft: An LLM Agent for Synthesizing 3D Scene as Blender Code

    Authors: Ziniu Hu, Ahmet Iscen, Aashi Jain, Thomas Kipf, Yisong Yue, David A. Ross, Cordelia Schmid, Alireza Fathi

    Abstract: This paper introduces SceneCraft, a Large Language Model (LLM) Agent converting text descriptions into Blender-executable Python scripts which render complex scenes with up to a hundred 3D assets. This process requires complex spatial planning and arrangement. We tackle these challenges through a combination of advanced abstraction, strategic planning, and library learning. SceneCraft first models…

    Submitted 2 March, 2024; originally announced March 2024.

  42. arXiv:2402.18546  [pdf, other]

    cs.LG

    Generalizability Under Sensor Failure: Tokenization + Transformers Enable More Robust Latent Spaces

    Authors: Geeling Chau, Yujin An, Ahamed Raffey Iqbal, Soon-Jo Chung, Yisong Yue, Sabera Talukder

    Abstract: A major goal in neuroscience is to discover neural data representations that generalize. This goal is challenged by variability along recording sessions (e.g. environment), subjects (e.g. varying neural structures), and sensors (e.g. sensor noise), among others. Recent work has begun to address generalization across sessions and subjects, but few study robustness to sensor failure which is highly…

    Submitted 19 March, 2024; v1 submitted 28 February, 2024; originally announced February 2024.

    Comments: 9 pages, 8 figures

  43. arXiv:2402.16412  [pdf, other]

    cs.LG

    TOTEM: TOkenized Time Series EMbeddings for General Time Series Analysis

    Authors: Sabera Talukder, Yisong Yue, Georgia Gkioxari

    Abstract: The field of general time series analysis has recently begun to explore unified modeling, where a common architectural backbone can be retrained on a specific task for a specific dataset. In this work, we approach unification from a complementary vantage point: unification across tasks and domains. To this end, we explore the impact of discrete, learnt, time series data representations that enable…

    Submitted 26 February, 2024; originally announced February 2024.

  44. arXiv:2402.14285  [pdf, other]

    cs.SD cs.LG eess.AS

    Symbolic Music Generation with Non-Differentiable Rule Guided Diffusion

    Authors: Yujia Huang, Adishree Ghatare, Yuanzhe Liu, Ziniu Hu, Qinsheng Zhang, Chandramouli S Sastry, Siddharth Gururani, Sageev Oore, Yisong Yue

    Abstract: We study the problem of symbolic music generation (e.g., generating piano rolls), with a technical focus on non-differentiable rule guidance. Musical rules are often expressed in symbolic form on note characteristics, such as note density or chord progression, many of which are non-differentiable, which poses a challenge when using them for guided diffusion. We propose \oursfull (\ours), a novel gui…

    Submitted 2 June, 2024; v1 submitted 21 February, 2024; originally announced February 2024.

    Comments: ICML 2024 (Oral)

  45. arXiv:2402.12065  [pdf, other]

    cs.LG cs.AI cs.CL

    WKVQuant: Quantizing Weight and Key/Value Cache for Large Language Models Gains More

    Authors: Yuxuan Yue, Zhihang Yuan, Haojie Duanmu, Sifan Zhou, Jianlong Wu, Liqiang Nie

    Abstract: Large Language Models (LLMs) face significant deployment challenges due to their substantial memory requirements and the computational demands of the auto-regressive text generation process. This paper addresses these challenges by focusing on the quantization of LLMs, a technique that reduces memory consumption by converting model parameters and activations into low-bit integers. We critically analyz…

    Submitted 20 February, 2024; v1 submitted 19 February, 2024; originally announced February 2024.

    Comments: First work to exclusively quantize weight and Key/Value cache for large language models

  46. arXiv:2402.10130  [pdf, other]

    cs.LG cs.AI cs.CV

    Is Continual Learning Ready for Real-world Challenges?

    Authors: Theodora Kontogianni, Yuanwen Yue, Siyu Tang, Konrad Schindler

    Abstract: Despite continual learning's long and well-established academic history, its application in real-world scenarios remains rather limited. This paper contends that this gap is attributable to a misalignment between the actual challenges of continual learning and the evaluation protocols in use, rendering proposed solutions ineffective for addressing the complexities of real-world setups. We validate…

    Submitted 15 February, 2024; originally announced February 2024.

  47. arXiv:2402.01242  [pdf, other]

    cs.LG

    Two Heads Are Better Than One: Boosting Graph Sparse Training via Semantic and Topological Awareness

    Authors: Guibin Zhang, Yanwei Yue, Kun Wang, Junfeng Fang, Yongduo Sui, Kai Wang, Yuxuan Liang, Dawei Cheng, Shirui Pan, Tianlong Chen

    Abstract: Graph Neural Networks (GNNs) excel in various graph learning tasks but face computational challenges when applied to large-scale graphs. A promising solution is to remove non-essential edges to reduce the computational overheads in GNN. Previous literature generally falls into two categories: topology-guided and semantic-guided. The former maintains certain graph topological properties yet often u…

    Submitted 2 February, 2024; originally announced February 2024.

  48. arXiv:2402.00578  [pdf, other]

    astro-ph.HE

    Discovery and timing of pulsar J2016$+$3711 in supernova remnant CTB 87 with FAST

    Authors: Qian-Cheng Liu, Wen-Juan Zhong, Yang Chen, Pei Wang, Ping Zhou, You-Ling Yue, Di Li

    Abstract: We report on our discovery of the radio pulsar, PSR J2016$+$3711, in supernova remnant (SNR) CTB 87, with a $\sim10.8σ$ significance of pulses, which confirms the compact nature of the X-ray point source in CTB 87. It is the first pulsar discovered in SNRs using the Five-hundred-meter Aperture Spherical radio Telescope (FAST). Its integrated radio pulse profile can be well described by a single compon…

    Submitted 1 February, 2024; originally announced February 2024.

    Comments: 7 pages, 5 figures, accepted for publication in MNRAS

  49. arXiv:2401.11353  [pdf, other]

    cs.LG

    Distributionally Robust Policy Evaluation under General Covariate Shift in Contextual Bandits

    Authors: Yihong Guo, Hao Liu, Yisong Yue, Anqi Liu

    Abstract: We introduce a distributionally robust approach that enhances the reliability of offline policy evaluation in contextual bandits under general covariate shifts. Our method aims to deliver robust policy evaluation results in the presence of discrepancies in both context and policy distribution between logging and target data. Central to our methodology is the application of robust regression, a dis…

    Submitted 20 January, 2024; originally announced January 2024.

  50. arXiv:2401.07950  [pdf, other]

    cs.CL

    SciGLM: Training Scientific Language Models with Self-Reflective Instruction Annotation and Tuning

    Authors: Dan Zhang, Ziniu Hu, Sining Zhoubian, Zhengxiao Du, Kaiyu Yang, Zihan Wang, Yisong Yue, Yuxiao Dong, Jie Tang

    Abstract: Large Language Models (LLMs) have shown promise in assisting scientific discovery. However, such applications are currently limited by LLMs' deficiencies in understanding intricate scientific concepts, deriving symbolic equations, and solving advanced numerical calculations. To bridge these gaps, we introduce SciGLM, a suite of scientific language models able to conduct college-level scientific re…

    Submitted 12 March, 2024; v1 submitted 15 January, 2024; originally announced January 2024.

    Comments: 21 pages