Showing 1–50 of 135 results for author: Zhan, W

  1. arXiv:2407.01531  [pdf, other]

    cs.RO cs.LG

    Sparse Diffusion Policy: A Sparse, Reusable, and Flexible Policy for Robot Learning

    Authors: Yixiao Wang, Yifei Zhang, Mingxiao Huo, Ran Tian, Xiang Zhang, Yichen Xie, Chenfeng Xu, Pengliang Ji, Wei Zhan, Mingyu Ding, Masayoshi Tomizuka

    Abstract: The increasing complexity of tasks in robotics demands efficient strategies for multitask and continual learning. Traditional models typically rely on a universal policy for all tasks, facing challenges such as high computational costs and catastrophic forgetting when learning new tasks. To address these issues, we introduce a sparse, reusable, and flexible policy, Sparse Diffusion Policy (SDP). B…

    Submitted 1 July, 2024; originally announced July 2024.

  2. arXiv:2407.00898  [pdf, other]

    cs.RO

    Residual-MPPI: Online Policy Customization for Continuous Control

    Authors: Pengcheng Wang, Chenran Li, Catherine Weaver, Kenta Kawamoto, Masayoshi Tomizuka, Chen Tang, Wei Zhan

    Abstract: Policies learned through Reinforcement Learning (RL) and Imitation Learning (IL) have demonstrated significant potential in achieving advanced performance in continuous control tasks. However, in real-world environments, it is often necessary to further customize a trained policy when there are additional requirements that were unforeseen during the original training phase. It is possible to fine-…

    Submitted 3 July, 2024; v1 submitted 30 June, 2024; originally announced July 2024.

  3. arXiv:2406.16258  [pdf, other]

    cs.RO cs.AI cs.LG

    MEReQ: Max-Ent Residual-Q Inverse RL for Sample-Efficient Alignment from Intervention

    Authors: Yuxin Chen, Chen Tang, Chenran Li, Ran Tian, Peter Stone, Masayoshi Tomizuka, Wei Zhan

    Abstract: Aligning robot behavior with human preferences is crucial for deploying embodied AI agents in human-centered environments. A promising solution is interactive imitation learning from human intervention, where a human expert observes the policy's execution and provides interventions as feedback. However, existing methods often fail to utilize the prior policy efficiently to facilitate learning, thu…

    Submitted 23 June, 2024; originally announced June 2024.

    ACM Class: I.2.6; I.2.9

  4. arXiv:2405.20323  [pdf, other]

    cs.CV cs.AI

    $\textit{S}^3$Gaussian: Self-Supervised Street Gaussians for Autonomous Driving

    Authors: Nan Huang, Xiaobao Wei, Wenzhao Zheng, Pengju An, Ming Lu, Wei Zhan, Masayoshi Tomizuka, Kurt Keutzer, Shanghang Zhang

    Abstract: Photorealistic 3D reconstruction of street scenes is a critical technique for developing real-world simulators for autonomous driving. Despite the efficacy of Neural Radiance Fields (NeRF) for driving scenes, 3D Gaussian Splatting (3DGS) emerges as a promising direction due to its faster speed and more explicit representation. However, most existing street 3DGS methods require tracked 3D vehicle b…

    Submitted 30 May, 2024; originally announced May 2024.

    Comments: Code is available at: https://github.com/nnanhuang/S3Gaussian/

  5. arXiv:2405.01333  [pdf, other]

    cs.RO cs.CV

    NeRF in Robotics: A Survey

    Authors: Guangming Wang, Lei Pan, Songyou Peng, Shaohui Liu, Chenfeng Xu, Yanzi Miao, Wei Zhan, Masayoshi Tomizuka, Marc Pollefeys, Hesheng Wang

    Abstract: Meticulous 3D environment representations have been a longstanding goal in computer vision and robotics fields. The recent emergence of neural implicit representations has introduced radical innovation to this field as implicit representations enable numerous capabilities. Among these, the Neural Radiance Field (NeRF) has sparked a trend because of the huge representational advantages, such as sim…

    Submitted 2 May, 2024; originally announced May 2024.

    Comments: 21 pages, 19 figures

  6. arXiv:2404.17454  [pdf, other]

    cs.LG cs.AI q-bio.QM

    Domain Adaptive and Fine-grained Anomaly Detection for Single-cell Sequencing Data and Beyond

    Authors: Kaichen Xu, Yueyang Ding, Suyang Hou, Weiqiang Zhan, Nisang Chen, Jun Wang, Xiaobo Sun

    Abstract: Fine-grained anomalous cell detection from affected tissues is critical for clinical diagnosis and pathological research. Single-cell sequencing data provide unprecedented opportunities for this task. However, current anomaly detection methods struggle to handle domain shifts prevalent in multi-sample and multi-domain single-cell sequencing data, leading to suboptimal performance. Moreover, these…

    Submitted 29 April, 2024; v1 submitted 26 April, 2024; originally announced April 2024.

    Comments: 17 pages, 2 figures. Accepted by IJCAI 2024

  7. arXiv:2404.16767  [pdf, other]

    cs.LG cs.CL cs.CV

    REBEL: Reinforcement Learning via Regressing Relative Rewards

    Authors: Zhaolin Gao, Jonathan D. Chang, Wenhao Zhan, Owen Oertell, Gokul Swamy, Kianté Brantley, Thorsten Joachims, J. Andrew Bagnell, Jason D. Lee, Wen Sun

    Abstract: While originally developed for continuous control problems, Proximal Policy Optimization (PPO) has emerged as the work-horse of a variety of reinforcement learning (RL) applications, including the fine-tuning of generative models. Unfortunately, PPO requires multiple heuristics to enable stable convergence (e.g. value networks, clipping), and is notorious for its sensitivity to the precise impleme…

    Submitted 29 May, 2024; v1 submitted 25 April, 2024; originally announced April 2024.

    Comments: New experimental results on general chat

  8. arXiv:2404.15141  [pdf, other]

    cs.CV cs.AI

    CutDiffusion: A Simple, Fast, Cheap, and Strong Diffusion Extrapolation Method

    Authors: Mingbao Lin, Zhihang Lin, Wengyi Zhan, Liujuan Cao, Rongrong Ji

    Abstract: Transforming large pre-trained low-resolution diffusion models to cater to higher-resolution demands, i.e., diffusion extrapolation, significantly improves diffusion adaptability. We propose tuning-free CutDiffusion, aimed at simplifying and accelerating the diffusion extrapolation process, making it more affordable and improving performance. CutDiffusion abides by the existing patch-wise extrapol…

    Submitted 23 April, 2024; originally announced April 2024.

  9. arXiv:2404.08495  [pdf, other]

    cs.LG cs.AI cs.CL

    Dataset Reset Policy Optimization for RLHF

    Authors: Jonathan D. Chang, Wenhao Zhan, Owen Oertell, Kianté Brantley, Dipendra Misra, Jason D. Lee, Wen Sun

    Abstract: Reinforcement Learning (RL) from Human Preference-based feedback is a popular paradigm for fine-tuning generative models, which has produced impressive models such as GPT-4 and Claude3 Opus. This framework often consists of two steps: learning a reward model from an offline preference dataset followed by running online RL to optimize the learned reward model. In this work, leveraging the idea of r…

    Submitted 16 April, 2024; v1 submitted 12 April, 2024; originally announced April 2024.

    Comments: 28 pages, 6 tables, 3 Figures, 3 Algorithms

  10. arXiv:2404.04772  [pdf, other]

    cs.RO

    Efficient Reinforcement Learning of Task Planners for Robotic Palletization through Iterative Action Masking Learning

    Authors: Zheng Wu, Yichuan Li, Wei Zhan, Changliu Liu, Yun-Hui Liu, Masayoshi Tomizuka

    Abstract: The development of robotic systems for palletization in logistics scenarios is of paramount importance, addressing critical efficiency and precision demands in supply chain management. This paper investigates the application of Reinforcement Learning (RL) in enhancing task planning for such robotic systems. Confronted with the substantial challenge of a vast action space, which is a significant im…

    Submitted 6 April, 2024; originally announced April 2024.

    Comments: 8 pages, 8 figures

  11. arXiv:2403.08125  [pdf, other]

    cs.CV

    Q-SLAM: Quadric Representations for Monocular SLAM

    Authors: Chensheng Peng, Chenfeng Xu, Yue Wang, Mingyu Ding, Heng Yang, Masayoshi Tomizuka, Kurt Keutzer, Marco Pavone, Wei Zhan

    Abstract: Monocular SLAM has long grappled with the challenge of accurately modeling 3D geometries. Recent advances in Neural Radiance Fields (NeRF)-based monocular SLAM have shown promise, yet these methods typically focus on novel view synthesis rather than precise 3D geometry modeling. This focus results in a significant disconnect between NeRF applications, i.e., novel-view synthesis and the requirement…

    Submitted 12 March, 2024; originally announced March 2024.

  12. arXiv:2403.07470  [pdf, other]

    cs.RO cs.PL

    DrPlanner: Diagnosis and Repair of Motion Planners Using Large Language Models

    Authors: Yuanfei Lin, Chenran Li, Mingyu Ding, Masayoshi Tomizuka, Wei Zhan, Matthias Althoff

    Abstract: Motion planners are essential for the safe operation of automated vehicles across various scenarios. However, no motion planning algorithm has achieved perfection in the literature, and improving its performance is often time-consuming and labor-intensive. To tackle the aforementioned issues, we present DrPlanner, the first framework designed to automatically diagnose and repair motion planners us…

    Submitted 12 March, 2024; originally announced March 2024.

    Comments: ©2024 IEEE. Personal use of this material is permitted. Permission from IEEE must be obtained for all other uses, in any current or future media, including reprinting/republishing this material for advertising or promotional purposes, creating new collective works, for resale or redistribution to servers or lists, or reuse of any copyrighted component of this work in other works

  13. arXiv:2403.06086  [pdf, other]

    cs.AI cs.RO

    Towards Generalizable and Interpretable Motion Prediction: A Deep Variational Bayes Approach

    Authors: Juanwu Lu, Wei Zhan, Masayoshi Tomizuka, Yeping Hu

    Abstract: Estimating the potential behavior of the surrounding human-driven vehicles is crucial for the safety of autonomous vehicles in a mixed traffic flow. Recent state-of-the-art achieved accurate prediction using deep neural networks. However, these end-to-end models are usually black boxes with weak interpretability and generalizability. This paper proposes the Goal-based Neural Variational Agent (GNe…

    Submitted 9 March, 2024; originally announced March 2024.

    Comments: Accepted at AISTATS 2024

  14. arXiv:2402.16836  [pdf, other]

    cs.RO cs.AI cs.CL cs.CV

    PhyGrasp: Generalizing Robotic Grasping with Physics-informed Large Multimodal Models

    Authors: Dingkun Guo, Yuqi Xiang, Shuqi Zhao, Xinghao Zhu, Masayoshi Tomizuka, Mingyu Ding, Wei Zhan

    Abstract: Robotic grasping is a fundamental aspect of robot functionality, defining how robots interact with objects. Despite substantial progress, its generalizability to counter-intuitive or long-tailed scenarios, such as objects with uncommon materials or shapes, remains a challenge. In contrast, humans can easily apply their intuitive physics to grasp skillfully and change grasps efficiently, even for o…

    Submitted 26 February, 2024; originally announced February 2024.

  15. arXiv:2402.15583  [pdf, other]

    cs.CV cs.LG

    Cohere3D: Exploiting Temporal Coherence for Unsupervised Representation Learning of Vision-based Autonomous Driving

    Authors: Yichen Xie, Hongge Chen, Gregory P. Meyer, Yong Jae Lee, Eric M. Wolff, Masayoshi Tomizuka, Wei Zhan, Yuning Chai, Xin Huang

    Abstract: Due to the lack of depth cues in images, multi-frame inputs are important for the success of vision-based perception, prediction, and planning in autonomous driving. Observations from different angles enable the recovery of 3D object states from 2D image inputs if we can identify the same instance in different input frames. However, the dynamic nature of autonomous driving scenes leads to signific…

    Submitted 23 February, 2024; originally announced February 2024.

  16. arXiv:2402.14194  [pdf, other]

    cs.LG cs.RO

    BeTAIL: Behavior Transformer Adversarial Imitation Learning from Human Racing Gameplay

    Authors: Catherine Weaver, Chen Tang, Ce Hao, Kenta Kawamoto, Masayoshi Tomizuka, Wei Zhan

    Abstract: Imitation learning learns a policy from demonstrations without requiring hand-designed reward functions. In many robotic tasks, such as autonomous racing, imitated policies must model complex environment dynamics and human decision-making. Sequence modeling is highly effective in capturing intricate patterns of motion sequences but struggles to adapt to new environments or distribution shifts that…

    Submitted 21 February, 2024; originally announced February 2024.

    Comments: Preprint

  17. arXiv:2402.08931  [pdf, other]

    cs.CV

    Depth-aware Volume Attention for Texture-less Stereo Matching

    Authors: Tong Zhao, Mingyu Ding, Wei Zhan, Masayoshi Tomizuka, Yintao Wei

    Abstract: Stereo matching plays a crucial role in 3D perception and scenario understanding. Despite the proliferation of promising methods, addressing texture-less and texture-repetitive conditions remains challenging due to the insufficient availability of rich geometric and semantic information. In this paper, we propose a lightweight volume refinement scheme to tackle the texture deterioration in practic…

    Submitted 26 February, 2024; v1 submitted 13 February, 2024; originally announced February 2024.

    Comments: 10 pages, 6 figures

  18. arXiv:2401.15315  [pdf, other]

    cs.RO

    Learning Online Belief Prediction for Efficient POMDP Planning in Autonomous Driving

    Authors: Zhiyu Huang, Chen Tang, Chen Lv, Masayoshi Tomizuka, Wei Zhan

    Abstract: Effective decision-making in autonomous driving relies on accurate inference of other traffic agents' future behaviors. To achieve this, we propose an online belief-update-based behavior prediction model and an efficient planner for Partially Observable Markov Decision Processes (POMDPs). We develop a Transformer-based prediction model, enhanced with a recurrent neural memory model, to dynamically…

    Submitted 17 June, 2024; v1 submitted 27 January, 2024; originally announced January 2024.

    Comments: IEEE Robotics and Automation Letters

  19. arXiv:2401.00391  [pdf, other]

    cs.RO cs.AI cs.CV cs.LG

    SAFE-SIM: Safety-Critical Closed-Loop Traffic Simulation with Controllable Adversaries

    Authors: Wei-Jer Chang, Francesco Pittaluga, Masayoshi Tomizuka, Wei Zhan, Manmohan Chandraker

    Abstract: Evaluating the performance of autonomous vehicle planning algorithms necessitates simulating long-tail safety-critical traffic scenarios. However, traditional methods for generating such scenarios often fall short in terms of controllability and realism and neglect the dynamics of agent interactions. To mitigate these limitations, we introduce SAFE-SIM, a novel diffusion-based controllable closed-…

    Submitted 15 June, 2024; v1 submitted 30 December, 2023; originally announced January 2024.

    Comments: Under Review

    ACM Class: I.2.9; I.2.6

  20. arXiv:2312.15380  [pdf, other]

    cs.NI eess.SP

    Battery-Care Resource Allocation and Task Offloading in Multi-Agent Post-Disaster MEC Environment

    Authors: Yiwei Tang, Hualong Huang, Wenhan Zhan, Geyong Min, Zhekai Duan, Yuchuan Lei

    Abstract: Being an up-and-coming application scenario of mobile edge computing (MEC), the post-disaster rescue suffers multitudinous computing-intensive tasks but unstably guaranteed network connectivity. In rescue environments, quality of service (QoS), such as task execution delay, energy consumption and battery state of health (SoH), is of significant meaning. This paper studies a multi-user post-disaste…

    Submitted 23 December, 2023; originally announced December 2023.

    Comments: Accepted by WCNC 2024

  21. arXiv:2312.05134  [pdf, other]

    cs.LG stat.ML

    Optimal Multi-Distribution Learning

    Authors: Zihan Zhang, Wenhao Zhan, Yuxin Chen, Simon S. Du, Jason D. Lee

    Abstract: Multi-distribution learning (MDL), which seeks to learn a shared model that minimizes the worst-case risk across $k$ distinct data distributions, has emerged as a unified framework in response to the evolving demand for robustness, fairness, multi-group collaboration, etc. Achieving data-efficient MDL necessitates adaptive sampling, also called on-demand sampling, throughout the learning process.…

    Submitted 23 May, 2024; v1 submitted 8 December, 2023; originally announced December 2023.

  22. arXiv:2312.01662  [pdf]

    cond-mat.mes-hall cs.LG eess.IV eess.SY

    Universal Deoxidation of Semiconductor Substrates Assisted by Machine-Learning and Real-Time-Feedback-Control

    Authors: Chao Shen, Wenkang Zhan, Jian Tang, Zhaofeng Wu, Bo Xu, Chao Zhao, Zhanguo Wang

    Abstract: Thin film deposition is an essential step in the semiconductor process. During preparation or loading, the substrate is exposed to the air unavoidably, which has motivated studies of the process control to remove the surface oxide before thin film deposition. Optimizing the deoxidation process in molecular beam epitaxy (MBE) for a random substrate is a multidimensional challenge and sometimes cont…

    Submitted 4 December, 2023; originally announced December 2023.

    Comments: 5 figures

  23. arXiv:2311.11965  [pdf, other]

    cs.LG stat.ML

    Provably Efficient CVaR RL in Low-rank MDPs

    Authors: Yulai Zhao, Wenhao Zhan, Xiaoyan Hu, Ho-fung Leung, Farzan Farnia, Wen Sun, Jason D. Lee

    Abstract: We study risk-sensitive Reinforcement Learning (RL), where we aim to maximize the Conditional Value at Risk (CVaR) with a fixed risk tolerance $τ$. Prior theoretical work studying risk-sensitive RL focuses on the tabular Markov Decision Processes (MDPs) setting. To extend CVaR RL to settings where state space is large, function approximation must be deployed. We study CVaR RL in low-rank MDPs with…

    Submitted 20 November, 2023; originally announced November 2023.

    Comments: The first three authors contribute equally and are ordered randomly

  24. arXiv:2310.08864  [pdf, other]

    cs.RO

    Open X-Embodiment: Robotic Learning Datasets and RT-X Models

    Authors: Open X-Embodiment Collaboration, Abby O'Neill, Abdul Rehman, Abhinav Gupta, Abhiram Maddukuri, Abhishek Gupta, Abhishek Padalkar, Abraham Lee, Acorn Pooley, Agrim Gupta, Ajay Mandlekar, Ajinkya Jain, Albert Tung, Alex Bewley, Alex Herzog, Alex Irpan, Alexander Khazatsky, Anant Rai, Anchit Gupta, Andrew Wang, Andrey Kolobov, Anikait Singh, Animesh Garg, Aniruddha Kembhavi, Annie Xie , et al. (267 additional authors not shown)

    Abstract: Large, high-capacity models trained on diverse datasets have shown remarkable successes on efficiently tackling downstream applications. In domains from NLP to Computer Vision, this has led to a consolidation of pretrained models, with general pretrained backbones serving as a starting point for many applications. Can such a consolidation happen in robotics? Conventionally, robotic learning method…

    Submitted 1 June, 2024; v1 submitted 13 October, 2023; originally announced October 2023.

    Comments: Project website: https://robotics-transformer-x.github.io

  25. arXiv:2310.07218  [pdf, other]

    cs.MA cs.AI

    Quantifying Agent Interaction in Multi-agent Reinforcement Learning for Cost-efficient Generalization

    Authors: Yuxin Chen, Chen Tang, Ran Tian, Chenran Li, Jinning Li, Masayoshi Tomizuka, Wei Zhan

    Abstract: Generalization poses a significant challenge in Multi-agent Reinforcement Learning (MARL). The extent to which an agent is influenced by unseen co-players depends on the agent's policy and the specific scenario. A quantitative examination of this relationship sheds light on effectively training agents for diverse scenarios. In this study, we present the Level of Influence (LoI), a metric quantifyi…

    Submitted 11 October, 2023; originally announced October 2023.

    Comments: 12 pages, 6 figures

    ACM Class: I.2.6

  26. arXiv:2310.05199  [pdf, other]

    cs.CL

    Loose lips sink ships: Mitigating Length Bias in Reinforcement Learning from Human Feedback

    Authors: Wei Shen, Rui Zheng, Wenyu Zhan, Jun Zhao, Shihan Dou, Tao Gui, Qi Zhang, Xuanjing Huang

    Abstract: Reinforcement learning from human feedback serves as a crucial bridge, aligning large language models with human and societal values. This alignment requires a vast corpus of human feedback to learn a reward model, which is subsequently used to finetune language models. However, we have identified that the reward model often finds shortcuts to bypass its intended objectives, misleadingly assuming…

    Submitted 29 November, 2023; v1 submitted 8 October, 2023; originally announced October 2023.

    Comments: EMNLP 2023 findings, Length Bias in RLHF, Mitigate bias in reward modeling

  27. arXiv:2310.03026  [pdf, other]

    cs.RO cs.AI cs.CL cs.CV cs.LG

    LanguageMPC: Large Language Models as Decision Makers for Autonomous Driving

    Authors: Hao Sha, Yao Mu, Yuxuan Jiang, Li Chen, Chenfeng Xu, Ping Luo, Shengbo Eben Li, Masayoshi Tomizuka, Wei Zhan, Mingyu Ding

    Abstract: Existing learning-based autonomous driving (AD) systems face challenges in comprehending high-level information, generalizing to rare events, and providing interpretability. To address these problems, this work employs Large Language Models (LLMs) as a decision-making component for complex AD scenarios that require human commonsense understanding. We devise cognitive pathways to enable comprehensi…

    Submitted 13 October, 2023; v1 submitted 4 October, 2023; originally announced October 2023.

  28. arXiv:2310.03023  [pdf, other]

    cs.RO cs.AI cs.CV cs.LG

    Human-oriented Representation Learning for Robotic Manipulation

    Authors: Mingxiao Huo, Mingyu Ding, Chenfeng Xu, Thomas Tian, Xinghao Zhu, Yao Mu, Lingfeng Sun, Masayoshi Tomizuka, Wei Zhan

    Abstract: Humans inherently possess generalizable visual representations that empower them to efficiently explore and interact with the environments in manipulation tasks. We advocate that such a representation automatically arises from simultaneously learning about multiple simple perceptual skills that are critical for everyday scenarios (e.g., hand detection, state estimate, etc.) and is better suited fo…

    Submitted 4 October, 2023; originally announced October 2023.

  29. arXiv:2310.02648  [pdf, other]

    cs.RO

    Long-Term Dynamic Window Approach for Kinodynamic Local Planning in Static and Crowd Environments

    Authors: Zhiqiang Jian, Songyi Zhang, Lingfeng Sun, Wei Zhan, Nanning Zheng, Masayoshi Tomizuka

    Abstract: Local planning for a differential wheeled robot is designed to generate kinodynamic feasible actions that guide the robot to a goal position along the navigation path while avoiding obstacles. Reactive, predictive, and learning-based methods are widely used in local planning. However, few of them can fit static and crowd environments while satisfying kinodynamic constraints simultaneously. To solv…

    Submitted 4 October, 2023; originally announced October 2023.

    Comments: 9 pages, 7 figures

    Journal ref: 2023 IEEE RA-L

  30. arXiv:2310.02625  [pdf, other]

    cs.RO

    Adaptive Spatio-Temporal Voxels Based Trajectory Planning for Autonomous Driving in Highway Traffic Flow

    Authors: Zhiqiang Jian, Songyi Zhang, Lingfeng Sun, Wei Zhan, Masayoshi Tomizuka, Nanning Zheng

    Abstract: Trajectory planning is crucial for the safe driving of autonomous vehicles in highway traffic flow. Currently, some advanced trajectory planning methods utilize spatio-temporal voxels to construct feasible regions and then convert trajectory planning into optimization problem solving based on the feasible regions. However, these feasible region construction methods cannot adapt to the changes in d…

    Submitted 4 October, 2023; originally announced October 2023.

    Comments: 8 pages, 5 figures

    Journal ref: IEEE ITSC 2023

  31. arXiv:2310.02262  [pdf, other]

    cs.CV cs.GR cs.RO

    RSRD: A Road Surface Reconstruction Dataset and Benchmark for Safe and Comfortable Autonomous Driving

    Authors: Tong Zhao, Chenfeng Xu, Mingyu Ding, Masayoshi Tomizuka, Wei Zhan, Yintao Wei

    Abstract: This paper addresses the growing demands for safety and comfort in intelligent robot systems, particularly autonomous vehicles, where road conditions play a pivotal role in overall driving performance. For example, reconstructing road surfaces helps to enhance the analysis and prediction of vehicle responses for motion planning and control systems. We introduce the Road Surface Reconstruction Data…

    Submitted 3 October, 2023; originally announced October 2023.

  32. arXiv:2309.17342  [pdf, other]

    cs.CV cs.LG

    Towards Free Data Selection with General-Purpose Models

    Authors: Yichen Xie, Mingyu Ding, Masayoshi Tomizuka, Wei Zhan

    Abstract: A desirable data selection algorithm can efficiently choose the most informative samples to maximize the utility of limited annotation budgets. However, current approaches, represented by active learning methods, typically follow a cumbersome pipeline that iterates the time-consuming model training and batch data selection repeatedly. In this paper, we challenge this status quo by designing a dist…

    Submitted 14 October, 2023; v1 submitted 29 September, 2023; originally announced September 2023.

    Comments: Accepted by NeurIPS 2023

  33. arXiv:2309.10121  [pdf, other]

    cs.CV

    Pre-training on Synthetic Driving Data for Trajectory Prediction

    Authors: Yiheng Li, Seth Z. Zhao, Chenfeng Xu, Chen Tang, Chenran Li, Mingyu Ding, Masayoshi Tomizuka, Wei Zhan

    Abstract: Accumulating substantial volumes of real-world driving data proves pivotal in the realm of trajectory forecasting for autonomous driving. Given the heavy reliance of current trajectory forecasting models on data-driven methodologies, we aim to tackle the challenge of learning general trajectory forecasting representations under limited data availability. We propose to augment both HD maps and traj…

    Submitted 19 September, 2023; v1 submitted 18 September, 2023; originally announced September 2023.

  34. arXiv:2309.09408  [pdf, other]

    cs.RO cs.LG

    Guided Online Distillation: Promoting Safe Reinforcement Learning by Offline Demonstration

    Authors: Jinning Li, Xinyi Liu, Banghua Zhu, Jiantao Jiao, Masayoshi Tomizuka, Chen Tang, Wei Zhan

    Abstract: Safe Reinforcement Learning (RL) aims to find a policy that achieves high rewards while satisfying cost constraints. When learning from scratch, safe RL agents tend to be overly conservative, which impedes exploration and restrains the overall performance. In many realistic tasks, e.g. autonomous driving, large-scale expert demonstration data are available. We argue that extracting expert policy f…

    Submitted 12 October, 2023; v1 submitted 17 September, 2023; originally announced September 2023.

  35. arXiv:2308.04383  [pdf, other]

    cs.CV

    DELFlow: Dense Efficient Learning of Scene Flow for Large-Scale Point Clouds

    Authors: Chensheng Peng, Guangming Wang, Xian Wan Lo, Xinrui Wu, Chenfeng Xu, Masayoshi Tomizuka, Wei Zhan, Hesheng Wang

    Abstract: Point clouds are naturally sparse, while image pixels are dense. The inconsistency limits feature fusion from both modalities for point-wise scene flow estimation. Previous methods rarely predict scene flow from the entire point clouds of the scene with one-time inference due to the memory inefficiency and heavy overhead from distance calculation and sorting involved in commonly used farthest poin…

    Submitted 9 August, 2023; v1 submitted 8 August, 2023; originally announced August 2023.

    Comments: Accepted by ICCV2023. Codes will be released at https://github.com/IRMVLab/DELFlow

  36. arXiv:2307.14620  [pdf, other]

    cs.CV

    NeRF-Det: Learning Geometry-Aware Volumetric Representation for Multi-View 3D Object Detection

    Authors: Chenfeng Xu, Bichen Wu, Ji Hou, Sam Tsai, Ruilong Li, Jialiang Wang, Wei Zhan, Zijian He, Peter Vajda, Kurt Keutzer, Masayoshi Tomizuka

    Abstract: We present NeRF-Det, a novel method for indoor 3D detection with posed RGB images as input. Unlike existing indoor 3D detection methods that struggle to model scene geometry, our method makes novel use of NeRF in an end-to-end manner to explicitly estimate 3D geometry, thereby improving 3D detection performance. Specifically, to avoid the significant extra latency associated with per-scene optimiz…

    Submitted 27 July, 2023; originally announced July 2023.

    Comments: Accepted by ICCV 2023

  37. arXiv:2307.11083  [pdf, ps, other]

    quant-ph cs.CC

    Quantum Logspace Computations are Verifiable

    Authors: Uma Girish, Ran Raz, Wei Zhan

    Abstract: In this note, we observe that quantum logspace computations are verifiable by classical logspace algorithms, with unconditional security. More precisely, every language in BQL has an (information-theoretically secure) streaming proof with a quantum logspace prover and a classical logspace verifier. The prover provides a polynomial-length proof that is streamed to the verifier. The verifier has a r…

    Submitted 20 July, 2023; originally announced July 2023.

  38. arXiv:2306.17165  [pdf, other]

    cs.CV cs.AI cs.LG

    An Efficient General-Purpose Modular Vision Model via Multi-Task Heterogeneous Training

    Authors: Zitian Chen, Mingyu Ding, Yikang Shen, Wei Zhan, Masayoshi Tomizuka, Erik Learned-Miller, Chuang Gan

    Abstract: We present a model that can perform multiple vision tasks and can be adapted to other downstream tasks efficiently. Despite considerable progress in multi-task learning, most efforts focus on learning from multi-label data: a single image set with multiple task labels. Such multi-label data sets are rare, small, and expensive. We say heterogeneous to refer to image sets with different task labels,…

    Submitted 29 June, 2023; originally announced June 2023.

  39. arXiv:2306.15817  [pdf, ps, other]

    cs.CC

    Randomized vs. Deterministic Separation in Time-Space Tradeoffs of Multi-Output Functions

    Authors: Huacheng Yu, Wei Zhan

    Abstract: We prove the first polynomial separation between randomized and deterministic time-space tradeoffs of multi-output functions. In particular, we present a total function that on the input of $n$ elements in $[n]$, outputs $O(n)$ elements, such that: (1) There exists a randomized oblivious algorithm with space $O(\log n)$, time $O(n\log n)$ and one-way access to randomness, that computes the functio…

    Submitted 27 June, 2023; originally announced June 2023.

    Comments: 15 pages

  40. arXiv:2306.12898  [pdf]

    cond-mat.mes-hall cs.LG eess.IV

    Machine-Learning-Assisted and Real-Time-Feedback-Controlled Growth of InAs/GaAs Quantum Dots

    Authors: Chao Shen, Wenkang Zhan, Kaiyao Xin, Manyang Li, Zhenyu Sun, Hui Cong, Chi Xu, Jian Tang, Zhaofeng Wu, Bo Xu, Zhongming Wei, Chunlai Xue, Chao Zhao, Zhanguo Wang

    Abstract: Self-assembled InAs/GaAs quantum dots (QDs) have properties highly valuable for developing various optoelectronic devices such as QD lasers and single photon sources. The applications strongly rely on the density and quality of these dots, which has motivated studies of the growth process control to realize high-quality epi-wafers and devices. Establishing the process parameters in molecular beam…

    Submitted 11 October, 2023; v1 submitted 22 June, 2023; originally announced June 2023.

    Comments: 5 figures

  41. arXiv:2306.09526  [pdf, other]

    cs.LG cs.AI

    Residual Q-Learning: Offline and Online Policy Customization without Value

    Authors: Chenran Li, Chen Tang, Haruki Nishimura, Jean Mercat, Masayoshi Tomizuka, Wei Zhan

    Abstract: Imitation Learning (IL) is a widely used framework for learning imitative behavior from demonstrations. It is especially appealing for solving complex real-world tasks where handcrafting reward function is difficult, or when the goal is to mimic human expert behavior. However, the learned imitative policy can only follow the behavior in the demonstration. When applying the imitative policy, we may…

    Submitted 14 January, 2024; v1 submitted 15 June, 2023; originally announced June 2023.

    Comments: Accepted by 37th Conference on Neural Information Processing Systems (NeurIPS 2023). The first two authors contributed equally

  42. arXiv:2306.08388  [pdf, other]

    cs.LG cs.AI

    Skill-Critic: Refining Learned Skills for Reinforcement Learning

    Authors: Ce Hao, Catherine Weaver, Chen Tang, Kenta Kawamoto, Masayoshi Tomizuka, Wei Zhan

    Abstract: Hierarchical reinforcement learning (RL) can accelerate long-horizon decision-making by temporally abstracting a policy into multiple levels. Promising results in sparse reward environments have been seen with skills, i.e. sequences of primitive actions. Typically, a skill latent space and policy are discovered from offline data, but the resulting low-level policy can be unreliable due to low-cove…

    Submitted 15 June, 2023; v1 submitted 14 June, 2023; originally announced June 2023.

    Comments: Preprint

  43. arXiv:2306.04954  [pdf, other]

    cs.CL

    RE-Matching: A Fine-Grained Semantic Matching Method for Zero-Shot Relation Extraction

    Authors: Jun Zhao, Wenyu Zhan, Xin Zhao, Qi Zhang, Tao Gui, Zhongyu Wei, Junzhe Wang, Minlong Peng, Mingming Sun

    Abstract: Semantic matching is a mainstream paradigm of zero-shot relation extraction, which matches a given input with a corresponding label description. The entities in the input should exactly match their hypernyms in the description, while the irrelevant contexts should be ignored when matching. However, general matching methods lack explicit modeling of the above matching pattern. In this work, we prop…

    Submitted 8 June, 2023; originally announced June 2023.

    Comments: Accepted by ACL2023

  44. arXiv:2306.04950  [pdf, other]

    cs.CL

    Open Set Relation Extraction via Unknown-Aware Training

    Authors: Jun Zhao, Xin Zhao, Wenyu Zhan, Qi Zhang, Tao Gui, Zhongyu Wei, Yunwen Chen, Xiang Gao, Xuanjing Huang

    Abstract: The existing supervised relation extraction methods have achieved impressive performance in a closed-set setting, where the relations during both training and testing remain the same. In a more realistic open-set setting, unknown relations may appear in the test set. Due to the lack of supervision signals from unknown relations, a well-performing closed-set relation extractor can still confidently…

    Submitted 8 June, 2023; originally announced June 2023.

    Comments: Accepted by ACL2023

  45. arXiv:2306.00265  [pdf, other]

    cs.LG cs.AI cs.CV eess.IV stat.ML

    Doubly Robust Self-Training

    Authors: Banghua Zhu, Mingyu Ding, Philip Jacobson, Ming Wu, Wei Zhan, Michael Jordan, Jiantao Jiao

    Abstract: Self-training is an important technique for solving semi-supervised learning problems. It leverages unlabeled data by generating pseudo-labels and combining them with a limited labeled dataset for training. The effectiveness of self-training heavily relies on the accuracy of these pseudo-labels. In this paper, we introduce doubly robust self-training, a novel semi-supervised algorithm that provabl…

    Submitted 2 November, 2023; v1 submitted 31 May, 2023; originally announced June 2023.

  46. arXiv:2305.18505  [pdf, ps, other]

    cs.LG cs.AI math.ST stat.ML

    Provable Reward-Agnostic Preference-Based Reinforcement Learning

    Authors: Wenhao Zhan, Masatoshi Uehara, Wen Sun, Jason D. Lee

    Abstract: Preference-based Reinforcement Learning (PbRL) is a paradigm in which an RL agent learns to optimize a task using pair-wise preference-based feedback over trajectories, rather than explicit reward signals. While PbRL has demonstrated practical success in fine-tuning language models, existing theoretical work focuses on regret minimization and fails to capture most of the practical frameworks. In t…

    Submitted 17 April, 2024; v1 submitted 29 May, 2023; originally announced May 2023.

    Comments: ICLR 2024 Spotlight

  47. arXiv:2305.14816  [pdf, ps, other]

    cs.LG math.ST stat.ML

    Provable Offline Preference-Based Reinforcement Learning

    Authors: Wenhao Zhan, Masatoshi Uehara, Nathan Kallus, Jason D. Lee, Wen Sun

    Abstract: In this paper, we investigate the problem of offline Preference-based Reinforcement Learning (PbRL) with human feedback where feedback is available in the form of preference between trajectory pairs rather than explicit rewards. Our proposed algorithm consists of two main steps: (1) estimate the implicit reward using Maximum Likelihood Estimation (MLE) with general function approximation from offl…

    Submitted 29 September, 2023; v1 submitted 24 May, 2023; originally announced May 2023.

    Comments: The first two authors contribute equally

  48. arXiv:2305.10282  [pdf, ps, other]

    cs.LG cs.IT math.ST stat.ML

    Reward-agnostic Fine-tuning: Provable Statistical Benefits of Hybrid Reinforcement Learning

    Authors: Gen Li, Wenhao Zhan, Jason D. Lee, Yuejie Chi, Yuxin Chen

    Abstract: This paper studies tabular reinforcement learning (RL) in the hybrid setting, which assumes access to both an offline dataset and online interactions with the unknown environment. A central question boils down to how to efficiently utilize online data collection to strengthen and complement the offline dataset and enable effective policy fine-tuning. Leveraging recent advances in reward-agnostic e…

    Submitted 17 May, 2023; originally announced May 2023.

  49. arXiv:2305.07740  [pdf, other]

    cs.RO eess.SY

    Double-Iterative Gaussian Process Regression for Modeling Error Compensation in Autonomous Racing

    Authors: Shaoshu Su, Ce Hao, Catherine Weaver, Chen Tang, Wei Zhan, Masayoshi Tomizuka

    Abstract: Autonomous racing control is a challenging research problem as vehicles are pushed to their limits of handling to achieve an optimal lap time; therefore, vehicles exhibit highly nonlinear and complex dynamics. Difficult-to-model effects, such as drifting, aerodynamics, chassis weight transfer, and suspension can lead to infeasible and suboptimal trajectories. While offline planning allows optimizi…

    Submitted 26 June, 2023; v1 submitted 12 May, 2023; originally announced May 2023.

    Comments: 8 pages, 6 figures. Accepted by IFAC 2023 (The 22nd World Congress of the International Federation of Automatic Control)

  50. arXiv:2304.14340  [pdf, other]

    cs.CV

    SparseFusion: Fusing Multi-Modal Sparse Representations for Multi-Sensor 3D Object Detection

    Authors: Yichen Xie, Chenfeng Xu, Marie-Julie Rakotosaona, Patrick Rim, Federico Tombari, Kurt Keutzer, Masayoshi Tomizuka, Wei Zhan

    Abstract: By identifying four important components of existing LiDAR-camera 3D object detection methods (LiDAR and camera candidates, transformation, and fusion outputs), we observe that all existing methods either find dense candidates or yield dense representations of scenes. However, given that objects occupy only a small part of a scene, finding dense candidates and generating dense representations is n…

    Submitted 27 April, 2023; originally announced April 2023.