UA-Track: Uncertainty-Aware End-to-End 3D Multi-Object Tracking

Authors: Lijun Zhou, Tao Tang, Pengkun Hao, Zihang He, Kalok Ho, Shuo Gu, Wenbo Hou, Zhihui Hao, Haiyang Sun, Kun Zhan, Peng Jia, Xianpeng Lang, Xiaodan Liang

Abstract: 3D multiple object tracking (MOT) plays a crucial role in autonomous driving perception. Recent end-to-end query-based trackers simultaneously detect and track objects, which have shown promising potential for the 3D MOT task. However, existing methods overlook the uncertainty issue, which refers to the lack of precise confidence about the state and location of tracked objects. Uncertainty arises owing to various factors during motion observation by cameras, especially occlusions and the small size of target objects, resulting in an inaccurate estimation of the object's position, label, and identity. To this end, we propose an Uncertainty-Aware 3D MOT framework, UA-Track, which tackles the uncertainty problem from multiple aspects. Specifically, we first introduce an Uncertainty-aware Probabilistic Decoder to capture the uncertainty in object prediction with probabilistic attention. Secondly, we propose an Uncertainty-guided Query Denoising strategy to further enhance the training process. We also utilize Uncertainty-reduced Query Initialization, which leverages predicted 2D object location and depth information to reduce query uncertainty. As a result, our UA-Track achieves state-of-the-art performance on the nuScenes benchmark, i.e., 66.3% AMOTA on the test split, surpassing the previous best end-to-end solution by a significant margin of 8.9% AMOTA.

Submitted 4 June, 2024; originally announced June 2024.

arXiv:2406.01587 [pdf, other]

PlanAgent: A Multi-modal Large Language Agent for Closed-loop Vehicle Motion Planning

Authors: Yupeng Zheng, Zebin Xing, Qichao Zhang, Bu Jin, Pengfei Li, Yuhang Zheng, Zhongpu Xia, Kun Zhan, Xianpeng Lang, Yaran Chen, Dongbin Zhao

Abstract: Vehicle motion planning is an essential component of autonomous driving technology. Current rule-based vehicle motion planning methods perform satisfactorily in common scenarios but struggle to generalize to long-tailed situations. Meanwhile, learning-based methods have yet to achieve superior performance over rule-based approaches in large-scale closed-loop scenarios. To address these issues, we propose PlanAgent, the first mid-to-mid planning system based on a Multi-modal Large Language Model (MLLM). The MLLM is used as a cognitive agent to introduce human-like knowledge, interpretability, and common-sense reasoning into closed-loop planning. Specifically, PlanAgent leverages the power of the MLLM through three core modules. First, an Environment Transformation module constructs a Bird's Eye View (BEV) map and a lane-graph-based textual description from the environment as inputs. Second, a Reasoning Engine module introduces a hierarchical chain-of-thought from scene understanding to lateral and longitudinal motion instructions, culminating in planner code generation. Last, a Reflection module is integrated to simulate and evaluate the generated planner in order to reduce the MLLM's uncertainty. PlanAgent is endowed with the common-sense reasoning and generalization capability of the MLLM, which empowers it to effectively tackle both common and complex long-tailed scenarios. Our proposed PlanAgent is evaluated on the large-scale and challenging nuPlan benchmarks. A comprehensive set of experiments convincingly demonstrates that PlanAgent outperforms the existing state-of-the-art in the closed-loop motion planning task. Code will be released soon.

Submitted 4 June, 2024; v1 submitted 3 June, 2024; originally announced June 2024.

Comments: This work has been submitted to the IEEE for possible publication

arXiv:2406.01349 [pdf, other]

Unleashing Generalization of End-to-End Autonomous Driving with Controllable Long Video Generation

Authors: Enhui Ma, Lijun Zhou, Tao Tang, Zhan Zhang, Dong Han, Junpeng Jiang, Kun Zhan, Peng Jia, Xianpeng Lang, Haiyang Sun, Di Lin, Kaicheng Yu

Abstract: Using generative models to synthesize new data has become a de-facto standard in autonomous driving to address the data scarcity issue. Though existing approaches are able to boost perception models, we discover that these approaches fail to improve the planning performance of end-to-end autonomous driving models, as the generated videos are usually fewer than 8 frames long and the spatial and temporal inconsistencies are not negligible. To this end, we propose Delphi, a novel diffusion-based long video generation method with a shared noise modeling mechanism across the multi-views to increase spatial consistency, and a feature-aligned module to achieve both precise controllability and temporal consistency. Our method can generate up to 40 frames of video without loss of consistency, which is about 5 times longer than state-of-the-art methods. Instead of randomly generating new data, we further design a sampling policy to let Delphi generate new data that are similar to failure cases, improving sample efficiency. This is achieved by building a failure-case driven framework with the help of pre-trained visual language models. Our extensive experiments demonstrate that Delphi generates higher-quality long videos, surpassing previous state-of-the-art methods. Consequently, by generating only 4% of the training dataset size, our framework goes beyond perception and prediction tasks and, for the first time to the best of our knowledge, boosts the planning performance of the end-to-end autonomous driving model by a margin of 25%.

Submitted 6 June, 2024; v1 submitted 3 June, 2024; originally announced June 2024.

Comments: Project Page: https://westlake-autolab.github.io/delphi.github.io/, 8 figures

arXiv:2405.13651 [pdf]

ConcertoRL: An Innovative Time-Interleaved Reinforcement Learning Approach for Enhanced Control in Direct-Drive Tandem-Wing Vehicles

Authors: Minghao Zhang, Bifeng Song, Changhao Chen, Xinyu Lang

Abstract: In control problems for insect-scale direct-drive experimental platforms under tandem wing influence, the primary challenge facing existing reinforcement learning models is their limited safety in the exploration process and the stability of the continuous training process. We introduce the ConcertoRL algorithm to enhance control precision and stabilize the online training process. It consists of two main innovations: a time-interleaved mechanism that interweaves classical controllers with reinforcement learning-based controllers to improve control precision in the initial stages, and a policy composer that organizes the experience gained from previous learning to ensure the stability of the online training process. This paper conducts a series of experiments. First, experiments incorporating the time-interleaved mechanism demonstrate a substantial performance boost of approximately 70% over scenarios without reinforcement learning enhancements and a 50% increase in efficiency compared to reference controllers with doubled control frequencies. These results highlight the algorithm's ability to create a synergistic effect that exceeds the sum of its parts.

Submitted 22 May, 2024; originally announced May 2024.

Comments: 48 pages, 35 figures

MSC Class: 68T40 ACM Class: I.2.9

arXiv:2405.10874 [pdf, other]

Square-Root Inverse Filter-based GNSS-Visual-Inertial Navigation

Authors: Jun Hu, Xiaoming Lang, Feng Zhang, Yinian Mao, Guoquan Huang

Abstract: While the Global Navigation Satellite System (GNSS) is often used to provide global positioning if available, its intermittency and/or inaccuracy call for fusion with other sensors. In this paper, we develop a novel GNSS-Visual-Inertial Navigation System (GVINS) that fuses visual, inertial, and raw GNSS measurements within the square-root inverse sliding window filtering (SRI-SWF) framework in a tightly coupled fashion, which thus is termed SRI-GVINS. In particular, for the first time, we deeply fuse the GNSS pseudorange, Doppler shift, single-differenced pseudorange, and double-differenced carrier phase measurements, along with the visual-inertial measurements. Inherited from the SRI-SWF, the proposed SRI-GVINS gains significant numerical stability and computational efficiency over the state-of-the-art methods. Additionally, we propose to use a filter to sequentially initialize the reference frame transformation until it converges, rather than collecting measurements for batch optimization. We also perform online calibration of GNSS-IMU extrinsic parameters to mitigate the possible extrinsic parameter degradation. The proposed SRI-GVINS is extensively evaluated on our own collected UAV datasets, and the results demonstrate that the proposed method is able to suppress VIO drift in real time and also show the effectiveness of online GNSS-IMU extrinsic calibration. The experimental validation on the public datasets further reveals that the proposed SRI-GVINS outperforms the state-of-the-art methods in terms of both accuracy and efficiency.

Submitted 17 May, 2024; originally announced May 2024.

arXiv:2405.02207 [pdf]

Water Structure and Electric Fields at the Interface of Oil Droplets

Authors: Lixue Shi, R. Allen LaCour, Xiaoqi Lang, Joseph P. Heindel, Teresa Head-Gordon, Wei Min

Abstract: Mesoscale water-hydrophobic interfaces are of fundamental importance in multiple disciplines, but their molecular properties have remained elusive for decades due to experimental complications and alternate theoretical explanations. Surface-specific spectroscopies, such as vibrational sum-frequency techniques, suffer from either sample preparation issues or the need for complex spectral corrections. Here, we report on a robust "in solution" interface-selective Raman spectroscopy approach using multivariate curve resolution to probe hexadecane in water emulsions. Computationally, we use the recently developed monomer field model for Raman spectroscopy to help interpret the interfacial spectra. Unlike with vibrational sum frequency techniques, our interfacial spectra are readily comparable to the spectra of bulk water, yielding new insights. The combination of experiment and theory shows that the interface leads to reduced tetrahedral order and weaker hydrogen bonding, giving rise to a substantial water population with dangling OH at the interface. Additionally, the stretching mode of these free OH experiences a ~80 cm$^{-1}$ red-shift due to a strong electric field, which we attribute to the negative zeta potential that is general to oil droplets. These findings are either opposite to, or absent in, the molecular hydrophobic interface formed by small solutes. Together, water structural disorder and enhanced electrostatics are an emergent feature at the mesoscale interface of oil-water emulsions, with an estimated interfacial electric field of ~35-70 MV/cm that is important for chemical reactivity.

Submitted 3 May, 2024; originally announced May 2024.

arXiv:2404.06926 [pdf, other]

Gaussian-LIC: Photo-realistic LiDAR-Inertial-Camera SLAM with 3D Gaussian Splatting

Authors: Xiaolei Lang, Laijian Li, Hang Zhang, Feng Xiong, Mu Xu, Yong Liu, Xingxing Zuo, Jiajun Lv

Abstract: We present a real-time LiDAR-Inertial-Camera SLAM system with 3D Gaussian Splatting as the mapping backend. Leveraging robust pose estimates from our LiDAR-Inertial-Camera odometry, Coco-LIC, an incremental photo-realistic mapping system is proposed in this paper. We initialize 3D Gaussians from colorized LiDAR points and optimize them using differentiable rendering powered by 3D Gaussian Splatting. Meticulously designed strategies are employed to incrementally expand the Gaussian map and adaptively control its density, ensuring high-quality mapping with real-time capability. Experiments conducted in diverse scenarios demonstrate the superior performance of our method compared to existing radiance-field-based SLAM systems.

Submitted 10 April, 2024; originally announced April 2024.

Comments: Submitted to IROS 2024

arXiv:2402.12289 [pdf, other]

DriveVLM: The Convergence of Autonomous Driving and Large Vision-Language Models

Authors: Xiaoyu Tian, Junru Gu, Bailin Li, Yicheng Liu, Yang Wang, Zhiyong Zhao, Kun Zhan, Peng Jia, Xianpeng Lang, Hang Zhao

Abstract: A primary hurdle of autonomous driving in urban environments is understanding complex and long-tail scenarios, such as challenging road conditions and delicate human behaviors. We introduce DriveVLM, an autonomous driving system leveraging Vision-Language Models (VLMs) for enhanced scene understanding and planning capabilities. DriveVLM integrates a unique combination of reasoning modules for scene description, scene analysis, and hierarchical planning. Furthermore, recognizing the limitations of VLMs in spatial reasoning and heavy computational requirements, we propose DriveVLM-Dual, a hybrid system that synergizes the strengths of DriveVLM with the traditional autonomous driving pipeline. Experiments on both the nuScenes dataset and our SUP-AD dataset demonstrate the efficacy of DriveVLM and DriveVLM-Dual in handling complex and unpredictable driving conditions. Finally, we deploy the DriveVLM-Dual on a production vehicle, verifying it is effective in real-world autonomous driving environments.

Submitted 25 June, 2024; v1 submitted 19 February, 2024; originally announced February 2024.

Comments: Project Page: https://tsinghua-mars-lab.github.io/DriveVLM/

arXiv:2401.01339 [pdf, other]

Street Gaussians: Modeling Dynamic Urban Scenes with Gaussian Splatting

Authors: Yunzhi Yan, Haotong Lin, Chenxu Zhou, Weijie Wang, Haiyang Sun, Kun Zhan, Xianpeng Lang, Xiaowei Zhou, Sida Peng

Abstract: This paper aims to tackle the problem of modeling dynamic urban streets for autonomous driving scenes. Recent methods extend NeRF by incorporating tracked vehicle poses to animate vehicles, enabling photo-realistic view synthesis of dynamic urban street scenes. However, significant limitations are their slow training and rendering speed. We introduce Street Gaussians, a new explicit scene representation that tackles these limitations. Specifically, the dynamic urban scene is represented as a set of point clouds equipped with semantic logits and 3D Gaussians, each associated with either a foreground vehicle or the background. To model the dynamics of foreground object vehicles, each object point cloud is optimized with optimizable tracked poses, along with a 4D spherical harmonics model for the dynamic appearance. The explicit representation allows easy composition of object vehicles and background, which in turn allows for scene editing operations and rendering at 135 FPS (1066 $\times$ 1600 resolution) within half an hour of training. The proposed method is evaluated on multiple challenging benchmarks, including KITTI and Waymo Open datasets. Experiments show that the proposed method consistently outperforms state-of-the-art methods across all datasets. The code will be released to ensure reproducibility.

Submitted 16 July, 2024; v1 submitted 2 January, 2024; originally announced January 2024.

Comments: Project page: https://zju3dv.github.io/street_gaussians/

arXiv:2401.01065 [pdf, other]

BEV-TSR: Text-Scene Retrieval in BEV Space for Autonomous Driving

Authors: Tao Tang, Dafeng Wei, Zhengyu Jia, Tian Gao, Changwei Cai, Chengkai Hou, Peng Jia, Kun Zhan, Haiyang Sun, Jingchen Fan, Yixing Zhao, Fu Liu, Xiaodan Liang, Xianpeng Lang, Yang Wang

Abstract: The rapid development of the autonomous driving industry has led to a significant accumulation of autonomous driving data. Consequently, there is a growing demand for retrieving data to provide specialized optimization. However, directly applying previous image retrieval methods faces several challenges, such as the lack of global feature representation and inadequate text retrieval ability for complex driving scenes. To address these issues, we first propose the BEV-TSR framework, which leverages descriptive text as an input to retrieve corresponding scenes in the Bird's Eye View (BEV) space. Then, to facilitate complex scene retrieval with extensive text descriptions, we employ a large language model (LLM) to extract the semantic features of the text inputs and incorporate knowledge graph embeddings to enhance the semantic richness of the language embedding. To achieve feature alignment between the BEV feature and language embedding, we propose Shared Cross-modal Embedding with a set of shared learnable embeddings to bridge the gap between these two modalities, and employ a caption generation task to further enhance the alignment. Furthermore, there is a lack of well-formed retrieval datasets for effective evaluation. To this end, we establish a multi-level retrieval dataset, nuScenes-Retrieval, based on the widely adopted nuScenes dataset. Experimental results on the multi-level nuScenes-Retrieval show that BEV-TSR achieves state-of-the-art performance, e.g., 85.78% and 87.66% top-1 accuracy on scene-to-text and text-to-scene retrieval respectively. Code and datasets will be made available.

Submitted 18 June, 2024; v1 submitted 2 January, 2024; originally announced January 2024.

arXiv:2309.09808 [pdf, other]

doi 10.1109/LRA.2023.3315542

Coco-LIC: Continuous-Time Tightly-Coupled LiDAR-Inertial-Camera Odometry using Non-Uniform B-spline

Authors: Xiaolei Lang, Chao Chen, Kai Tang, Yukai Ma, Jiajun Lv, Yong Liu, Xingxing Zuo

Abstract: In this paper, we propose an efficient continuous-time LiDAR-Inertial-Camera Odometry, utilizing non-uniform B-splines to tightly couple measurements from the LiDAR, IMU, and camera. In contrast to uniform B-spline-based continuous-time methods, our non-uniform B-spline approach offers significant advantages in terms of achieving real-time efficiency and high accuracy. This is accomplished by dynamically and adaptively placing control points, taking into account the varying dynamics of the motion. To enable efficient fusion of heterogeneous LiDAR-Inertial-Camera data within a short sliding-window optimization, we assign depth to visual pixels using corresponding map points from a global LiDAR map, and formulate frame-to-map reprojection factors for the associated pixels in the current image frame. This circumvents the need for depth optimization of visual pixels, which typically entails a lengthy sliding window with numerous control points for continuous-time trajectory estimation. We conduct dedicated experiments on real-world datasets to demonstrate the advantage and efficacy of adopting non-uniform continuous-time trajectory representation. Our LiDAR-Inertial-Camera odometry system is also extensively evaluated on both challenging scenarios with sensor degenerations and large-scale scenarios, and has shown comparable or higher accuracy than the state-of-the-art methods. The codebase of this paper will also be open-sourced at https://github.com/APRIL-ZJU/Coco-LIC.

Submitted 18 September, 2023; originally announced September 2023.

Comments: Accepted by RAL 2023

arXiv:2307.04070 [pdf, ps, other]

A Belief-Based Characterization of Reduced-Form Auctions

Authors: Xu Lang

Abstract: We study games of chance (e.g., poker, dice, horse races) in the form of agents' first-order posterior beliefs about game outcomes. We ask: for any profile of agents' posterior beliefs, is there a game that can generate these beliefs? We completely characterize all feasible joint posterior beliefs from these games. The characterization enables us to find a new variant of Border's inequalities (Border, 1991), which we call a belief-based characterization of Border's inequalities. It also leads to a generalization of Aumann's Agreement Theorem. We show that the characterization results are powerful in bounding the correlation of agents' joint posterior beliefs.

Submitted 8 July, 2023; originally announced July 2023.

Comments: Games of Chance, Posterior Beliefs, Reduced Form Auctions, Aumann's Agreement Theorem, Bayesian Persuasion

arXiv:2302.14350 [pdf, other]

Knowledge Augmented Relation Inference for Group Activity Recognition

Authors: Xianglong Lang, Zhuming Wang, Zun Li, Meng Tian, Ge Shi, Lifang Wu, Liang Wang

Abstract: Most existing group activity recognition methods construct spatial-temporal relations merely based on visual representation. Some methods introduce extra knowledge, such as action labels, to build semantic relations and use them to refine the visual representation. However, the knowledge they explore stays at the semantic level, which is insufficient for pursuing notable accuracy. In this paper, we propose to exploit knowledge concretization for group activity recognition, and develop a novel Knowledge Augmented Relation Inference framework that can effectively use the concretized knowledge to improve the individual representations. Specifically, the framework consists of a Visual Representation Module to extract individual appearance features, a Knowledge Augmented Semantic Relation Module to explore semantic representations of individual actions, and a Knowledge-Semantic-Visual Interaction Module to integrate visual and semantic information through the knowledge. Benefiting from these modules, the proposed framework can utilize knowledge to enhance the relation inference process and the individual representations, thus improving the performance of group activity recognition. Experimental results on two public datasets show that the proposed framework achieves competitive performance compared with state-of-the-art methods.

Submitted 1 March, 2023; v1 submitted 28 February, 2023; originally announced February 2023.

arXiv:2302.07456 [pdf, other]

Continuous-Time Fixed-Lag Smoothing for LiDAR-Inertial-Camera SLAM

Authors: Jiajun Lv, Xiaolei Lang, Jinhong Xu, Mengmeng Wang, Yong Liu, Xingxing Zuo

Abstract: Localization and mapping with heterogeneous multi-sensor fusion have been prevalent in recent years. To adequately fuse multi-modal sensor measurements received at different time instants and different frequencies, we estimate the continuous-time trajectory by fixed-lag smoothing within a factor-graph optimization framework. With the continuous-time formulation, we can query poses at any time instants corresponding to the sensor measurements. To bound the computational complexity of the continuous-time fixed-lag smoother, we maintain temporal and keyframe sliding windows with constant size, and probabilistically marginalize out control points of the trajectory and other states, which allows preserving prior information for future sliding-window optimization. Based on continuous-time fixed-lag smoothing, we design tightly-coupled multi-modal SLAM algorithms with a variety of sensor combinations, like the LiDAR-inertial and LiDAR-inertial-camera SLAM systems, in which online time-offset calibration is also naturally supported. More importantly, benefiting from the marginalization and our derived analytical Jacobians for optimization, the proposed continuous-time SLAM systems can achieve real-time performance despite the high complexity of the continuous-time formulation. The proposed multi-modal SLAM systems have been widely evaluated on three public datasets and self-collected datasets. The results demonstrate that the proposed continuous-time SLAM systems can achieve high-accuracy pose estimations and outperform existing state-of-the-art methods. To benefit the research community, we will open source our code at https://github.com/APRIL-ZJU/clic.

Submitted 14 February, 2023; originally announced February 2023.

arXiv:2211.06830 [pdf, ps, other]

Two-Person Bargaining when the Disagreement Point is Private Information

Authors: Eric van Damme, Xu Lang

Abstract: We consider two-person bargaining problems in which (only) the disagreement outcome is private (and possibly correlated) information and it is common knowledge that disagreement is inefficient. We show that if the Pareto frontier is linear, the outcome of an ex post efficient mechanism cannot depend on the disagreement payoffs. If the frontier is non-linear, the result continues to hold when the disagreement payoffs are independent or there is a player with at most two types. We discuss implications of these results for axiomatic bargaining theory and for full surplus extraction in mechanism design.

Submitted 9 January, 2024; v1 submitted 13 November, 2022; originally announced November 2022.

Comments: bargaining problem, incomplete information, axiomatic method, efficiency, disagreement, correlation

arXiv:2208.12008 [pdf, other]

Ctrl-VIO: Continuous-Time Visual-Inertial Odometry for Rolling Shutter Cameras

Authors: Xiaolei Lang, Jiajun Lv, Jianxin Huang, Yukai Ma, Yong Liu, Xingxing Zuo

Abstract: In this paper, we propose a probabilistic continuous-time visual-inertial odometry (VIO) for rolling shutter cameras. The continuous-time trajectory formulation naturally facilitates the fusion of asynchronous high-frequency IMU data and motion-distorted rolling shutter images. To prevent intractable computation load, the proposed VIO is sliding-window and keyframe-based. We propose to probabilistically marginalize the control points to keep a constant number of keyframes in the sliding window. Furthermore, the line exposure time difference (line delay) of the rolling shutter camera can be calibrated online in our continuous-time VIO. To extensively examine the performance of our continuous-time VIO, experiments are conducted on the publicly available WHU-RSVI, TUM-RSVI, and SenseTime-RSVI rolling shutter datasets. The results demonstrate that the proposed continuous-time VIO significantly outperforms the existing state-of-the-art VIO methods. The codebase of this paper will also be open-sourced at https://github.com/APRIL-ZJU/Ctrl-VIO.

Submitted 25 August, 2022; originally announced August 2022.

Journal ref: 2022 RAL

arXiv:2207.09253 [pdf, ps, other]

Symmetric reduced form voting

Authors: Xu Lang, Debasis Mishra

Abstract: We study a model of voting with two alternatives in a symmetric environment. We characterize the interim allocation probabilities that can be implemented by a symmetric voting rule. We show that every such interim allocation probabilities can be implemented as a convex combination of two families of deterministic voting rules: qualified majority and qualified anti-majority. We also provide analogous results by requiring implementation by a symmetric monotone (strategy-proof) voting rule and by a symmetric unanimous voting rule. We apply our results to show that an ex-ante Rawlsian rule is a convex combination of a pair of qualified majority rules.

Submitted 3 April, 2023; v1 submitted 19 July, 2022; originally announced July 2022.

arXiv:2205.13442 [pdf, ps, other]

Rational points on $x^{3} + x^{2} y^{2} + y^{3} = k$

Authors: Xiaoan Lang, Jeremy Rouse

Abstract: We study the problem of determining, given an integer $k$, the rational solutions to $C_{k} : x^{3}z + x^{2} y^{2} + y^{3}z = kz^{4}$. For $k \ne 0$, the curve $C_{k}$ has genus $3$ and there are maps from $C_{k}$ to three elliptic curves $E_{1,k}$, $E_{2,k}$, $E_{3,k}$. We explicitly determine the rational points on $C_{k}$ under the assumption that one of these elliptic curves has rank zero. We discuss the challenges involved in extending our result to handle all $k \in \mathbb{Q}$.

Submitted 23 March, 2023; v1 submitted 26 May, 2022; originally announced May 2022.

Comments: 18 pages

MSC Class: Primary 11G05; Secondary 11G30; 14H45

arXiv:2202.06245 [pdf, ps, other]

Reduced-Form Allocations with Complementarity: A 2-Person Case

Authors: Xu Lang

Abstract: We investigate the implementation of reduced-form allocation probabilities in a two-person bargaining problem without side payments, where the agents have to select one alternative from a finite set of social alternatives. We provide a necessary and sufficient condition for the implementability. We find that the implementability condition in bargaining has some new feature compared to Border's theorem. Our results have applications in compromise problems and package exchange problems where the agents barter indivisible objects and the agents value the objects as complements.

Submitted 22 February, 2022; v1 submitted 13 February, 2022; originally announced February 2022.

Comments: 23 pages

arXiv:1501.00570 [pdf, ps, other]

Qubit detection with a T-shaped double quantum dot detector

Authors: JunYan Luo, HuJun Jiao, Jing Hu, Xiao-Ling He, XiaoLi Lang, Shi-Kuan Wang

Abstract: We propose to continuously monitor a charge qubit by utilizing a T-shaped double quantum dot detector, in which the qubit and double dot are arranged in such a unique way that the detector turns out to be particularly susceptible to the charge states of the qubit. Special attention is paid to the regime where acquisition of qubit information and backaction upon the measured system exhibit nontrivial correlation. The intrinsic dynamics of the qubit gives rise to dynamical blockade of tunneling events through the detector, resulting in a super-Poissonian noise. However, such a pronounced enhancement of detector's shot noise does not necessarily produce a rising dephasing rate. In contrast, an inhibition of dephasing is entailed by the reduction of information acquisition in the dynamically blockaded regimes. We further reveal the important impact of the charge fluctuations on the measurement characteristics. Noticeably, under the condition of symmetric junction capacitances the noise pedestal of circuit current is completely suppressed, leading to a divergent signal-to-noise ratio, and eventually to a violation of the Korotkov-Averin bound in quantum measurement. Our study offers the possibility for a double dot detector to reach the quantum limited effectiveness in a transparent manner.

Submitted 15 January, 2015; v1 submitted 3 January, 2015; originally announced January 2015.

Comments: 6 figures, typoes corrected

arXiv:1406.2169 [pdf, other]

doi 10.1103/PhysRevB.89.035426

Inelastic electron tunneling spectroscopy of nanoporous gold films

Authors: H. W. Liu, R. Nishitani, T. Fujita, W. Li, L. Zhang, X. Y. Lang, P. Richard, K. S. Nakayama, X. Chen, M. W. Chen, Q. K. Xue

Abstract: We investigated the localized electronic properties of nanoporous gold films by using an ultra-high vacuum scanning tunneling microscope at low temperature (4.2 K). Second derivative scanning tunneling spectroscopy shows the plasmon peaks of the nanoporous gold films, which are excited by inelastic tunneling electrons. We propose that the nanorod model is appropriate for nanoporous gold studies at the nanometer-scale. These results are supported by a 3D electron tomography analysis and theoretical calculations of nanoporous gold with ellipsoid shape.

Submitted 9 June, 2014; originally announced June 2014.

Comments: 6 pages, 3 figures. This is the authors' version. The published, high resolution version of this paper, Copyright (2014) by the American Physical Society, can be found at http://journals.aps.org/prb/

Journal ref: Physical Review B 89, 035426 (2014)

arXiv:1405.1662 [pdf]

Directly grown monolayer MoS2 on Au foils as efficient hydrogen evolution catalysts

Authors: Jianping Shi, Donglin Ma, Gao-Feng Han, Yu Zhang, Qingqing Ji, Teng Gao, Jingyu Sun, Cong Li, Xing-You Lang, Yanfeng Zhang, Zhongfan Liu

Abstract: Synthesis of monolayer MoS2 is essential for fulfilling the potential of MoS2 in catalysis, optoelectronics and valleytronics, etc. Herein, we report for the first time the scalable growth of high quality, domain size tunable (edge length from ~ 200 nm to 50 μm), strictly monolayer MoS2 on commercially available Au foils, via a low pressure chemical vapor deposition method. The nanosized triangular MoS2 flakes on Au foils was proved to be an excellent electrocatalyst for hydrogen evolution reaction (HER), featured by a rather low Tafel slope (61 mV/decade) and a supreme exchange current density (38.1 μA/cm2). The abundant active edge sites and the excellent electron coupling between MoS2 and Au foils account for the extraordinary HER activity. Our work presents a sound proof that strictly monolayer MoS2 assembled on a well selected electrode can manifest comparable or even superior HER property than that of nanoparticles or few-layer MoS2 electrocatalyst.

Submitted 7 May, 2014; originally announced May 2014.

Comments: 28 pages, 5 figures

arXiv:1302.5638 [pdf, ps, other]

doi 10.1016/j.physleta.2014.01.031

Conditional spin counting statistics as a probe of Coulomb interaction and spin-resolved bunching

Authors: JunYan Luo, Jing Hu, XiaoLi Lang, Yu Shen, Xiao-Ling He, HuJun Jiao

Abstract: Full counting statistics is a powerful tool to characterize the noise and correlations in transport through mesoscopic systems. In this work, we propose the theory of conditional spin counting statistics, i.e., the statistical fluctuations of spin-up (down) current given the observation of the spin-down (up) current. In the context of transport through a single quantum dot, it is demonstrated that a strong Coulomb interaction leads to a conditional spin counting statistics that exhibits a substantial change in comparison to that without Coulomb repulsion. It thus can be served as an effective way to probe the Coulomb interactions in mesoscopic transport systems. In case of spin polarized transport, it is further shown that the conditional spin counting statistics offers a transparent tool to reveal the spin-resolved bunching behavior.

Submitted 16 February, 2014; v1 submitted 22 February, 2013; originally announced February 2013.

Comments: 9 pages, 7 figures

Journal ref: Phys. Lett. A 378, 892-898 (2014)

arXiv:1203.2233 [pdf, ps, other]

doi 10.1063/1.4828870

Non-Markovian dynamics and noise characteristics in continuous measurement of a solid-state charge qubit

Authors: JunYan Luo, HuJun Jiao, Xiao-Li Lang, BiTao Xiong, Xiao-Ling He

Abstract: We investigate the non-Markovian characteristics in continuous measurement of a charge qubit by a quantum point contact. The backflow of information from the reservoir to the system in the non-Markovian domain gives rise to strikingly different qubit relaxation and dephasing in comparison with the Markovian case. The intriguing non-Markovian dynamics is found to have a direct impact on the output noise feature of the detector. Unambiguously, we observe that the non-Markovian memory effect results in an enhancement of the signal-to-noise ratio, which can even exceed the upper limit of "4", leading thus to the violation of the Korotkov-Averin bound in quantum measurement. Our study thus may open new possibilities to improve detector's measurement efficiency in a direct and transparent way.

Submitted 17 August, 2013; v1 submitted 10 March, 2012; originally announced March 2012.

Comments: 10 pages, 7 figures

Showing 1–24 of 24 results for author: Lang, X