Fast and Accurate Multi-Agent Trajectory Prediction For Crowded Unknown Scenes

Authors: Xiuye Tao, Huiping Li, Bin Liang, Yang Shi, Demin Xu

Abstract: This paper studies the problem of multi-agent trajectory prediction in crowded unknown environments. A novel energy function optimization-based framework is proposed to generate prediction trajectories. Firstly, a new energy function is designed for easier optimization. Secondly, an online optimization pipeline for calculating parameters and agents' velocities is developed. In this pipeline, we first design an efficient group division method based on the Fréchet distance to classify agents online. Then a strategy for decoupling the optimization of velocities and critical parameters in the energy function is developed, where the salp swarm algorithm and gradient descent algorithms are integrated to solve the optimization problems more efficiently. Thirdly, we propose a similarity-based resample evaluation algorithm to predict agents' optimal goals, defined as the target-moving headings of agents, which effectively extracts hidden information in observed states and avoids learning agents' destinations via the training dataset in advance. Experiments and comparison studies verify the advantages of the proposed method in terms of prediction accuracy and speed.

Submitted 12 July, 2024; originally announced July 2024.
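
The abstract above mentions grouping agents by Fréchet distance but gives no details. As an illustration only, the following minimal sketch computes the standard discrete Fréchet distance between two observed trajectories and uses it in a simple greedy grouping rule. The function names, the greedy rule, and the `threshold` parameter are assumptions made for this example and are not taken from the paper.

```python
import math


def discrete_frechet(p, q):
    """Discrete Frechet distance between two polylines (lists of (x, y) points)."""
    n, m = len(p), len(q)
    # ca[i][j] = coupling distance for the prefixes p[:i+1] and q[:j+1]
    ca = [[0.0] * m for _ in range(n)]
    for i in range(n):
        for j in range(m):
            d = math.dist(p[i], q[j])
            if i == 0 and j == 0:
                ca[i][j] = d
            elif i == 0:
                ca[i][j] = max(ca[0][j - 1], d)
            elif j == 0:
                ca[i][j] = max(ca[i - 1][0], d)
            else:
                best_prev = min(ca[i - 1][j], ca[i - 1][j - 1], ca[i][j - 1])
                ca[i][j] = max(best_prev, d)
    return ca[n - 1][m - 1]


def group_agents(trajectories, threshold):
    """Hypothetical greedy grouping: an agent joins the first group whose
    representative (first member) is within `threshold`; otherwise it
    starts a new group."""
    groups = []
    for idx, traj in enumerate(trajectories):
        for group in groups:
            if discrete_frechet(trajectories[group[0]], traj) <= threshold:
                group.append(idx)
                break
        else:
            groups.append([idx])
    return groups


# Two agents walking side by side and one far away.
a = [(0, 0), (1, 0), (2, 0)]
b = [(0, 1), (1, 1), (2, 1)]
c = [(0, 10), (1, 10), (2, 10)]
groups = group_agents([a, b, c], threshold=1.5)
```

With this toy input the two parallel trajectories fall into one group and the distant one into another; a real pipeline would need a principled threshold and an online update rule.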

arXiv:2407.07125 [pdf, other]

Rapid Parameter Estimation for Merging Massive Black Hole Binaries Using ODE-Based Generative Models

Authors: Bo Liang, Minghui Du, He Wang, Yuxiang Xu, Chang Liu, Xiaotong Wei, Peng Xu, Li-e Qiang, Ziren Luo

Abstract: Detecting the coalescences of massive black hole binaries (MBHBs) is one of the primary targets for space-based gravitational wave observatories such as LISA, Taiji, and TianQin. The fast and accurate parameter estimation of merging MBHBs is of great significance for both astrophysics and the global fitting of all resolvable sources. However, such analyses entail significant computational costs. To address these challenges, inspired by the latest progress in generative models, we propose a novel artificial intelligence (AI) based parameter estimation method called Variance Preserving Flow Matching Posterior Estimation (VPFMPE). Specifically, we utilize triangular interpolation to maintain variance over time, thereby constructing a transport path for training continuous normalizing flows. Compared to the simple linear interpolation method used in flow matching to construct the optimal transport path, our approach better captures continuous temporal variations, making it more suitable for the parameter estimation of MBHBs. Additionally, we introduce a parameter transformation method based on the symmetry in the detector's response function. This transformation is integrated within VPFMPE, allowing us to train the model using a simplified dataset and then perform parameter estimation on more general data, which also makes it a crucial factor in improving the training speed.
In conclusion, for the first time, within a comprehensive and reasonable parameter range, we have achieved a complete and unbiased 11-dimensional rapid inference for MBHBs in the presence of astrophysical confusion noise using ODE-based generative models. In the experiments based on simulated data, our model produces posterior distributions comparable to those obtained by nested sampling.

Submitted 9 July, 2024; originally announced July 2024.
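
The abstract above contrasts linear interpolation with a variance-preserving path but does not state the schedule. The worked equations below are a sketch under two assumptions that are not confirmed by the abstract: that "triangular interpolation" means a trigonometric (sine/cosine) schedule, and that the endpoints $x_0$ and $x_1$ are independent with a common variance $\sigma^2$.

```latex
% Linear path used in standard flow matching, and its variance:
x_t = (1 - t)\,x_0 + t\,x_1,
\qquad
\operatorname{Var}(x_t) = \bigl[(1 - t)^2 + t^2\bigr]\sigma^2
\quad\text{(drops to } \sigma^2/2 \text{ at } t = \tfrac12\text{)}.

% Assumed trigonometric path, and its variance:
x_t = \cos\!\bigl(\tfrac{\pi t}{2}\bigr)\,x_0 + \sin\!\bigl(\tfrac{\pi t}{2}\bigr)\,x_1,
\qquad
\operatorname{Var}(x_t) = \bigl[\cos^2\!\bigl(\tfrac{\pi t}{2}\bigr) + \sin^2\!\bigl(\tfrac{\pi t}{2}\bigr)\bigr]\sigma^2 = \sigma^2 .

% Regression target for the velocity field along the assumed path:
\frac{\mathrm{d}x_t}{\mathrm{d}t}
= \frac{\pi}{2}\Bigl[-\sin\!\bigl(\tfrac{\pi t}{2}\bigr)\,x_0 + \cos\!\bigl(\tfrac{\pi t}{2}\bigr)\,x_1\Bigr].
```

Under these assumptions the variance stays constant for all $t \in [0, 1]$, which is one way to read "maintain variance over time".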

arXiv:2406.09779 [pdf, other]

doi 10.1145/3589335.3665995

OSPC: Detecting Harmful Memes with Large Language Model as a Catalyst

Authors: Jingtao Cao, Zheng Zhang, Hongru Wang, Bin Liang, Hao Wang, Kam-Fai Wong

Abstract: Memes, which rapidly disseminate personal opinions and positions across the internet, also pose significant challenges in propagating social bias and prejudice. This study presents a novel approach to detecting harmful memes, particularly within the multicultural and multilingual context of Singapore. Our methodology integrates image captioning, Optical Character Recognition (OCR), and Large Language Model (LLM) analysis to comprehensively understand and classify harmful memes. Utilizing the BLIP model for image captioning, PP-OCR and TrOCR for text recognition across multiple languages, and the Qwen LLM for nuanced language understanding, our system is capable of identifying harmful content in memes created in English, Chinese, Malay, and Tamil. To enhance the system's performance, we fine-tuned our approach by leveraging additional data labeled using GPT-4V, aiming to distill the understanding capability of GPT-4V for harmful memes into our system. Our framework ranks first on the public leaderboard of the Online Safety Prize Challenge hosted by AI Singapore, with an AUROC of 0.7749 and an accuracy of 0.7087, significantly ahead of the other teams. Notably, our approach outperforms previous benchmarks, with FLAVA achieving an AUROC of 0.5695 and VisualBERT an AUROC of 0.5561.

Submitted 14 June, 2024; originally announced June 2024.

arXiv:2406.08587 [pdf, other]

CS-Bench: A Comprehensive Benchmark for Large Language Models towards Computer Science Mastery

Authors: Xiaoshuai Song, Muxi Diao, Guanting Dong, Zhengyang Wang, Yujia Fu, Runqi Qiao, Zhexu Wang, Dayuan Fu, Huangxuan Wu, Bin Liang, Weihao Zeng, Yejie Wang, Zhuoma GongQue, Jianing Yu, Qiuna Tan, Weiran Xu

Abstract: Computer Science (CS) stands as a testament to the intricacies of human intelligence, profoundly advancing the development of artificial intelligence and modern society. However, the current community of large language models (LLMs) overly focuses on benchmarks for analyzing specific foundational skills (e.g. mathematics and code generation), neglecting an all-round evaluation of the computer science field. To bridge this gap, we introduce CS-Bench, the first bilingual (Chinese-English) benchmark dedicated to evaluating the performance of LLMs in computer science. CS-Bench comprises approximately 5K meticulously curated test samples, covering 26 subfields across 4 key areas of computer science, encompassing various task forms and divisions of knowledge and reasoning. Utilizing CS-Bench, we conduct a comprehensive evaluation of over 30 mainstream LLMs, revealing the relationship between CS performance and model scales. We also quantitatively analyze the reasons for failures in existing LLMs and highlight directions for improvements, including knowledge supplementation and CS-specific reasoning. Further cross-capability experiments show a high correlation between LLMs' capabilities in computer science and their abilities in mathematics and coding. Moreover, expert LLMs specialized in mathematics and coding also demonstrate strong performance in several CS subfields. Looking ahead, we envision CS-Bench serving as a cornerstone for LLM applications in the CS field and paving new avenues in assessing LLMs' diverse reasoning capabilities.
The CS-Bench data and evaluation code are available at https://github.com/csbench/csbench.

Submitted 12 June, 2024; originally announced June 2024.

Comments: Work in progress

arXiv:2406.03102 [pdf, other]

DEER: A Delay-Resilient Framework for Reinforcement Learning with Variable Delays

Authors: Bo Xia, Yilun Kong, Yongzhe Chang, Bo Yuan, Zhiheng Li, Xueqian Wang, Bin Liang

Abstract: Classic reinforcement learning (RL) frequently confronts challenges in tasks involving delays, which cause a mismatch between received observations and subsequent actions, thereby deviating from the Markov assumption. Existing methods usually tackle this issue with end-to-end solutions using state augmentation. However, these black-box approaches often involve incomprehensible processes and redundant information in the information states, causing instability and potentially undermining the overall performance. To alleviate the delay challenges in RL, we propose DEER (Delay-resilient Encoder-Enhanced RL), a framework designed to effectively enhance the interpretability and address the random delay issues. DEER employs a pretrained encoder, trained on delay-free environment datasets, to map delayed states, along with their variable-length past action sequences resulting from different delays, into hidden states. In a variety of delayed scenarios, the trained encoder can seamlessly integrate with standard RL algorithms without requiring additional modifications and enhance the delay-solving capability by simply adapting the input dimension of the original algorithms. We evaluate DEER through extensive experiments on Gym and MuJoCo environments. The results confirm that DEER is superior to state-of-the-art RL algorithms in both constant and random delay settings.

Submitted 5 June, 2024; originally announced June 2024.

arXiv:2405.14202 [pdf]

Giant Acoustic Geometric Spin and Orbital Hall Effect

Authors: Wei Wang, Yang Tan, Jingjing Liu, Bin Liang, Jianchun Cheng

Abstract: Acoustic waves in fluids, owing to their spin-0 nature, have long been believed not to support a spin Hall effect or an orbital Hall effect strong enough to be observed experimentally. Here we report the first theoretical explication and experimental demonstration of a giant acoustic geometric spin and orbital Hall effect characterized by a large transverse shift. We reveal that this effect occurs when a vortex beam is observed from a tilted reference frame, free of the wave-interface interactions or gradient-index media needed for observing conventional ones, and can be amplified by simply binding the beam tightly. Thanks to this mechanism, large transverse shifts proportional to angular momentum are observed in a compact system. Our work provides deeper insights into the physics of angular momentum of classical waves.

Submitted 23 May, 2024; originally announced May 2024.

arXiv:2405.11778 [pdf, other]

Efficient Multi-agent Reinforcement Learning by Planning

Authors: Qihan Liu, Jianing Ye, Xiaoteng Ma, Jun Yang, Bin Liang, Chongjie Zhang

Abstract: Multi-agent reinforcement learning (MARL) algorithms have accomplished remarkable breakthroughs in solving large-scale decision-making tasks. Nonetheless, most existing MARL algorithms are model-free, limiting sample efficiency and hindering their applicability in more challenging scenarios. In contrast, model-based reinforcement learning (MBRL), particularly algorithms integrating planning, such as MuZero, has demonstrated superhuman performance with limited data in many tasks. Hence, we aim to boost the sample efficiency of MARL by adopting model-based approaches. However, incorporating planning and search methods into multi-agent systems poses significant challenges. The expansive action space of multi-agent systems often necessitates leveraging the nearly-independent property of agents to accelerate learning. To tackle this issue, we propose the MAZero algorithm, which combines a centralized model with Monte Carlo Tree Search (MCTS) for policy search. We design a novel network structure to facilitate distributed execution and parameter sharing. To enhance search efficiency in deterministic environments with sizable action spaces, we introduce two novel techniques: Optimistic Search Lambda (OS($λ$)) and Advantage-Weighted Policy Optimization (AWPO). Extensive experiments on the SMAC benchmark demonstrate that MAZero outperforms model-free approaches in terms of sample efficiency and provides comparable or better performance than existing model-based methods in terms of both sample and computational efficiency.
Our code is available at https://github.com/liuqh16/MAZero.

Submitted 20 May, 2024; originally announced May 2024.

Comments: ICLR 2024

arXiv:2405.07687 [pdf, other]

Highly Efficient Observation Process based on FFT Filtering for Robot Swarm Collaborative Navigation in Unknown Environments

Authors: Chenxi Li, Weining Lu, Zhihao Ma, Litong Meng, Bin Liang

Abstract: Collaborative path planning for robot swarms in complex, unknown environments without external positioning is a challenging problem. This requires robots to find safe directions based on real-time environmental observations, and to efficiently transfer and fuse these observations within the swarm. This study presents a filtering method based on the Fast Fourier Transform (FFT) to address these two issues. We treat sensors' environmental observations as a digital sampling process. Then, we design two different types of filters for safe direction extraction, as well as for the compression and reconstruction of environmental data. The reconstructed data is mapped to a probabilistic domain, achieving efficient fusion of swarm observations and planning decisions. The computation time is only on the order of microseconds, and the data transmitted over the communication system is at the bit level. The performance of our algorithm in sensor data processing was validated in real-world experiments, and the effectiveness in swarm path optimization was demonstrated through extensive simulations.

Submitted 13 May, 2024; originally announced May 2024.

Comments: 8 pages, 8 figures, 1 table

arXiv:2405.07218 [pdf, other]

Chained Flexible Capsule Endoscope: Unraveling the Conundrum of Size Limitations and Functional Integration for Gastrointestinal Transitivity

Authors: Sishen Yuan, Guang Li, Baijia Liang, Lailu Li, Qingzhuo Zheng, Shuang Song, Zhen Li, Hongliang Ren

Abstract: Capsule endoscopes, predominantly serving diagnostic functions, provide lucid internal imagery but are devoid of surgical or therapeutic capabilities. Consequently, despite lesion detection, physicians frequently resort to traditional endoscopic or open surgical procedures for treatment, resulting in more complex, potentially risky interventions. To surmount these limitations, this study introduces a chained flexible capsule endoscope (FCE) design concept, specifically conceived to navigate the inherent volume constraints of capsule endoscopes whilst augmenting their therapeutic functionalities. The FCE's distinctive flexibility originates from a conventional rotating joint design and the incision pattern in the flexible material. In vitro experiments validated the passive navigation ability of the FCE in rugged intestinal tracts. Further, the FCE demonstrates consistent reptile-like peristalsis under the influence of an external magnetic field, and possesses the capability for film expansion and disintegration under high-frequency electromagnetic stimulation. These findings illuminate a promising path toward amplifying the therapeutic capacities of capsule endoscopes without necessitating a size compromise.

Submitted 12 May, 2024; originally announced May 2024.

arXiv:2405.07216 [pdf, other]

Magnetic-Guided Flexible Origami Robot toward Long-Term Phototherapy of H. pylori in the Stomach

Authors: Sishen Yuan, Baijia Liang, Po Wa Wong, Mingjing Xu, Chi Hsuan Li, Zhen Li, Hongliang Ren

Abstract: Helicobacter pylori, a pervasive bacterial infection associated with gastrointestinal disorders such as gastritis, peptic ulcer disease, and gastric cancer, impacts approximately 50% of the global population. The efficacy of standard clinical eradication therapies is diminishing due to the rise of antibiotic-resistant strains, necessitating alternative treatment strategies. Photodynamic therapy (PDT) emerges as a promising prospect in this context. This study presents the development and implementation of a magnetically guided origami robot, incorporating flexible printed circuit units for sustained and stable phototherapy of Helicobacter pylori. Each integrated unit is equipped with wireless charging capabilities, producing an optimal power output that can concurrently illuminate up to 15 LEDs at their maximum intensity. Crucially, these units can be remotely manipulated via a magnetic field, facilitating both translational and rotational movements. We propose an open-loop manual control sequence that allows the formation of a stable, compliant triangular structure through the interaction of internal magnets. This adaptable configuration is uniquely designed to withstand the dynamic squeezing environment prevalent in real-world gastric applications. The research herein represents a significant stride in leveraging technology for innovative medical solutions, particularly in the management of antibiotic-resistant Helicobacter pylori infections.

Submitted 12 May, 2024; originally announced May 2024.

Comments: IEEE ICRA 2024

arXiv:2404.18225 [pdf, other]

Quadruped robot traversing 3D complex environments with limited perception

Authors: Yi Cheng, Hang Liu, Guoping Pan, Linqi Ye, Houde Liu, Bin Liang

Abstract: Traversing complex 3D environments has always been a significant challenge for legged locomotion. Existing methods typically rely on external sensors such as vision and lidar to preemptively react to obstacles by acquiring environmental information. However, in scenarios like nighttime or dense forests, external sensors often fail to function properly, requiring robots to rely on proprioceptive sensors to perceive diverse obstacles in the environment and respond promptly. This task is undeniably challenging. Our research finds that methods based on collision detection can enhance a robot's perception of environmental obstacles. In this work, we propose an end-to-end learning-based quadruped robot motion controller that relies solely on proprioceptive sensing. This controller can accurately detect, localize, and agilely respond to collisions in unknown and complex 3D environments, thereby improving the robot's traversability in complex environments. We demonstrate in both simulation and real-world experiments that our method enables quadruped robots to successfully traverse challenging obstacles in various complex environments.

Submitted 14 July, 2024; v1 submitted 28 April, 2024; originally announced April 2024.

Comments: 10 pages, 8 figures, submitted to IROS 2024

arXiv:2404.10373 [pdf]

A newly developed multi-kilo-channel high-speed and precision waveform digitization system for neutrino experiments

Authors: H. Yang, T. Xue, L. Jiang, C. Xu, Q. Pan, B. Liang, G. Gong, B. Xu, Z. Wang, S. Chen, Y. Liu, J. Li

Abstract: The Jinping Neutrino Experiment (JNE), conducted within the China Jinping Underground Laboratory, aims to detect and analyze solar neutrinos, geo-neutrinos, and supernova neutrinos. A one-ton prototype will soon be in commission with an upgrade from 30 channels to 60 channels, which will increase the data bandwidth by one to two orders of magnitude and exceed the capacity of the current CAEN DAQ system. Additionally, enhancing the performance and flexibility of the JNE DAQ system is crucial. This paper presents the design of a new Tsinghua DAQ system for the JNE and its performance and stability. The new Tsinghua DAQ (THDAQ) system for JNE is based on the cPCI protocol and demonstrates substantial performance improvements: the ADC ENOB of the THDAQ system exceeds roughly 9.8 bits, marking a 14% improvement over the CAEN DAQ system; the maximum clock deviation within a single chassis is 85.6 ps, satisfying sub-nanosecond synchronization criteria; each DAQ board features two QSFP+ optical ports with 82.5 Gbps transmission capability, while the PCIe board supports a transmission rate of 100.2 Gbps. In addition, comparative experiments between the two systems were conducted in detail. The analysis results of waveform and charge spectrum prove the high stability of the THDAQ system. This provides a foundation for the 60-channel and 4000-channel DAQ systems.

Submitted 16 April, 2024; originally announced April 2024.

arXiv:2404.08246 [pdf]

Agile and versatile bipedal robot tracking control through reinforcement learning

Authors: Jiayi Li, Linqi Ye, Yi Cheng, Houde Liu, Bin Liang

Abstract: The remarkable athletic intelligence displayed by humans in complex dynamic movements such as dancing and gymnastics suggests that the balance mechanism in biological beings is decoupled from specific movement patterns. This decoupling allows for the execution of both learned and unlearned movements under certain constraints while maintaining balance through minor whole-body coordination. To replicate this balance ability and body agility, this paper proposes a versatile controller for bipedal robots. This controller achieves ankle and body trajectory tracking across a wide range of gaits using a single small-scale neural network, which is based on a model-based IK solver and reinforcement learning. We consider a single step as the smallest control unit and design a universally applicable control input form suitable for any single-step variation. Highly flexible gait control can be achieved by combining these minimal control units with high-level policy through our extensible control interface. To enhance the trajectory-tracking capability of our controller, we utilize a three-stage training curriculum. After training, the robot can move freely between target footholds at varying distances and heights. The robot can also maintain static balance without repeated stepping to adjust posture. Finally, we evaluate the tracking accuracy of our controller on various bipedal tasks, and the effectiveness of our control framework is verified in the simulation environment.

Submitted 12 April, 2024; originally announced April 2024.

arXiv:2404.00277 [pdf, other]

doi 10.1093/mnras/stae868

Radio Frequency Interference Detection Using Efficient Multi-Scale Convolutional Attention UNet

Authors: Fei Gu, Longfei Hao, Bo Liang, Song Feng, Shoulin Wei, Wei Dai, Yonghua Xu, Zhixuan Li, Yihang Dao

Abstract: Studying the universe through radio telescope observation is crucial. However, radio telescopes capture not only signals from the universe but also various interfering signals, known as Radio Frequency Interference (RFI). The presence of RFI can significantly impact data analysis. Ensuring the accuracy, reliability, and scientific integrity of research findings by detecting and mitigating or eliminating RFI in observational data presents a persistent challenge in radio astronomy. In this study, we propose a novel deep learning model called EMSCA-UNet for RFI detection. The model employs multi-scale convolutional operations to extract RFI features of various scale sizes. Additionally, an attention mechanism is utilized to assign different weights to the extracted RFI feature maps, enabling the model to focus on vital features for RFI detection. We evaluated the performance of the model using real data observed from the 40-meter radio telescope at Yunnan Observatory. Furthermore, we compared our results to other models, including U-Net, RFI-Net, and R-Net, using four commonly employed evaluation metrics: precision, recall, F1 score, and IoU. The results demonstrate that our model outperforms the other models on all evaluation metrics, achieving an average improvement of approximately 5% compared to U-Net. Our model not only enhances the accuracy and comprehensiveness of RFI detection but also provides more detailed edge detection while minimizing the loss of useful signals.

Submitted 30 March, 2024; originally announced April 2024.
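
The abstract above names four evaluation metrics for RFI masks: precision, recall, F1 score, and IoU. As a reminder of how these are conventionally defined for binary masks, here is a minimal sketch; the function name and the flattened 0/1 list representation are assumptions made for this example, not the paper's evaluation code.

```python
def mask_metrics(pred, truth):
    """Precision, recall, F1 and IoU for two flattened binary masks
    (1 = pixel flagged as RFI, 0 = clean). Returns 0.0 for a metric
    whose denominator is zero."""
    tp = sum(1 for p, t in zip(pred, truth) if p and t)          # true positives
    fp = sum(1 for p, t in zip(pred, truth) if p and not t)      # false positives
    fn = sum(1 for p, t in zip(pred, truth) if not p and t)      # false negatives

    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if (precision + recall) else 0.0)
    iou = tp / (tp + fp + fn) if (tp + fp + fn) else 0.0
    return {"precision": precision, "recall": recall, "f1": f1, "iou": iou}


# Toy example: one correct detection, one false alarm, one miss.
scores = mask_metrics(pred=[1, 1, 0, 0], truth=[1, 0, 1, 0])
```

IoU penalizes false alarms and misses together, so it is never larger than F1 on the same masks.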

arXiv:2403.20001 [pdf, other]

Adaptive Energy Regularization for Autonomous Gait Transition and Energy-Efficient Quadruped Locomotion

Authors: Boyuan Liang, Lingfeng Sun, Xinghao Zhu, Bike Zhang, Ziyin Xiong, Chenran Li, Koushil Sreenath, Masayoshi Tomizuka

Abstract: In reinforcement learning for legged robot locomotion, crafting effective reward strategies is crucial. Pre-defined gait patterns and complex reward systems are widely used to stabilize policy training. Drawing from the natural locomotion behaviors of humans and animals, which adapt their gaits to minimize energy consumption, we propose a simplified, energy-centric reward strategy to foster the development of energy-efficient locomotion across various speeds in quadruped robots. By implementing an adaptive energy reward function and adjusting the weights based on velocity, we demonstrate that our approach enables ANYmal-C and Unitree Go1 robots to autonomously select appropriate gaits, such as four-beat walking at lower speeds and trotting at higher speeds, resulting in improved energy efficiency and stable velocity tracking compared to previous methods using complex reward designs and prior gait knowledge. The effectiveness of our policy is validated through simulations in the IsaacGym simulation environment and on real robots, demonstrating its potential to facilitate stable and adaptive locomotion.

Submitted 29 March, 2024; originally announced March 2024.

Comments: 8 pages, 5 figures

arXiv:2403.18960 [pdf, other]

Robust In-Hand Manipulation with Extrinsic Contacts

Authors: Boyuan Liang, Kei Ota, Masayoshi Tomizuka, Devesh Jha

Abstract: We present in-hand manipulation tasks where a robot moves an object in grasp, maintains its external contact mode with the environment, and adjusts its in-hand pose simultaneously. The proposed manipulation task leads to complex contact interactions which can be very susceptible to uncertainties in kinematic and physical parameters. Therefore, we propose a robust in-hand manipulation method, which consists of two parts. First, an in-gripper mechanics model computes a naïve motion cone assuming all parameters are precise. Then, a robust planning method refines the motion cone to maintain the desired contact mode regardless of parametric errors. Real-world experiments were conducted to illustrate the accuracy of the mechanics model and the effectiveness of the robust planning framework in the presence of kinematics parameter errors.

Submitted 27 March, 2024; originally announced March 2024.

Comments: Accepted at ICRA 24

arXiv:2403.12676 [pdf, other]

In-Hand Following of Deformable Linear Objects Using Dexterous Fingers with Tactile Sensing

Authors: Mingrui Yu, Boyuan Liang, Xiang Zhang, Xinghao Zhu, Xiang Li, Masayoshi Tomizuka

Abstract: Most research on deformable linear object (DLO) manipulation assumes rigid grasping. However, beyond rigid grasping and re-grasping, in-hand following is also an essential skill that humans use to dexterously manipulate DLOs, which requires continuously changing the grasp point by in-hand sliding while holding the DLO to prevent it from falling. Achieving such a skill is very challenging for robots without using specially designed but not versatile end-effectors. Previous works have attempted using generic parallel grippers, but their robustness is unsatisfactory owing to the conflict between following and holding, which is hard to balance with a one-degree-of-freedom gripper. In this work, inspired by how humans use fingers to follow DLOs, we explore the usage of a generic dexterous hand with tactile sensing to imitate human skills and achieve robust in-hand DLO following. To enable the hardware system to function in the real world, we develop a framework that includes Cartesian-space arm-hand control, tactile-based in-hand 3-D DLO pose estimation, and task-specific motion design. Experimental results demonstrate the significant superiority of our method over using parallel grippers, as well as its great robustness, generalizability, and efficiency.

Submitted 19 March, 2024; originally announced March 2024.

arXiv:2403.12035 [pdf, other]

CoCoCo: Improving Text-Guided Video Inpainting for Better Consistency, Controllability and Compatibility

Authors: Bojia Zi, Shihao Zhao, Xianbiao Qi, Jianan Wang, Yukai Shi, Qianyu Chen, Bin Liang, Kam-Fai Wong, Lei Zhang

Abstract: Recent advancements in video generation have been remarkable, yet many existing methods struggle with issues of consistency and poor text-video alignment. Moreover, the field lacks effective techniques for text-guided video inpainting, a stark contrast to the well-explored domain of text-guided image inpainting. To this end, this paper proposes a novel text-guided video inpainting model that achieves better consistency, controllability and compatibility. Specifically, we introduce a simple but efficient motion capture module to preserve motion consistency, and design an instance-aware region selection instead of a random region selection to obtain better textual controllability, and utilize a novel strategy to inject some personalized models into our CoCoCo model and thus obtain better model compatibility. Extensive experiments show that our model can generate high-quality video clips. Meanwhile, our model shows better motion consistency, textual controllability and model compatibility. More details are shown in cococozibojia.github.io.

Submitted 18 March, 2024; originally announced March 2024.

arXiv:2403.10002 [pdf, ps, other]

Fast Group Scheduling for Downlink Large-Scale Multi-Group Multicast Beamforming

Authors: Chong Zhang, Min Dong, Ben Liang, Ali Afana, Yahia Ahmed

Abstract: Next-generation wireless networks need to handle massive user access effectively. This paper addresses the problem of joint group scheduling and multicast beamforming for downlink transmission with many active user groups. Aiming to maximize the minimum user throughput, we propose a three-phase approach to tackle this difficult joint optimization problem efficiently. In Phase 1, we utilize the optimal multicast beamforming structure obtained recently to find the group-channel directions for all groups. We propose two low-complexity group scheduling algorithms in Phase 2, which determine the subset of groups in each time slot sequentially and the total number of time slots required for all groups. The first algorithm measures the level of spatial separation among groups and selects the dissimilar groups that maximize the minimum user rate into the same time slot. In contrast, the second algorithm first identifies the spatially correlated groups via a learning-based clustering method based on the group-channel directions, and then separates spatially similar groups into different time slots. Finally, the multicast beamformers for the scheduled groups are obtained in each time slot by a computationally efficient method. Simulation results show that our proposed scheduling methods can effectively capture the level of spatial separation among groups to improve the minimum user throughput over the conventional approach that serves all groups in a single time slot or one group per time slot, and can be executed with low computational complexity.

Submitted 24 June, 2024; v1 submitted 14 March, 2024; originally announced March 2024.

Comments: 13 pages, 8 figures

arXiv:2403.06085 [pdf, other]

van Hove Singularity-Driven Emergence of Multiple Flat Bands in Kagome Superconductors

Authors: Hailan Luo, Lin Zhao, Zhen Zhao, Haitao Yang, Yun-Peng Huang, Hongxiong Liu, Yuhao Gu, Feng Jin, Hao Chen, Taimin Miao, Chaohui Yin, Chengmin Shen, Xiaolin Ren, Bo Liang, Yingjie Shu, Yiwen Chen, Fengfeng Zhang, Feng Yang, Shenjin Zhang, Qinjun Peng, Hanqing Mao, Guodong Liu, Jiangping Hu, Youguo Shi, Zuyan Xu, et al. (5 additional authors not shown)

Abstract: The newly discovered Kagome superconductors AV$_3$Sb$_5$ (A=K, Rb and Cs) continue to bring surprises in generating unusual phenomena and physical properties, including anomalous Hall effect, unconventional charge density wave, electronic nematicity and time-reversal symmetry breaking. Here we report an unexpected emergence of multiple flat bands in the AV$_3$Sb$_5$ superconductors. By performing high-resolution angle-resolved photoemission (ARPES) measurements, we observed four branches of flat bands that span the entire momentum space. The appearance of the flat bands is not anticipated from the band structure calculations and cannot be accounted for by the known mechanisms of flat band generation. It is intimately related to the evolution of van Hove singularities. This is the first observation of such an emergence of multiple flat bands in solid materials. Our findings provide new insights into the underlying mechanism that governs the unusual behaviors in the Kagome superconductors. They also provide a new pathway for producing flat bands and set a platform to study flat-band-related physics.

Submitted 9 March, 2024; originally announced March 2024.

Comments: 20 pages, 4 figures

arXiv:2403.05753 [pdf, other]

UDCR: Unsupervised Aortic DSA/CTA Rigid Registration Using Deep Reinforcement Learning and Overlap Degree Calculation

Authors: Wentao Liu, Bowen Liang, Weijin Xu, Tong Tian, Qingsheng Lu, Xipeng Pan, Haoyuan Li, Siyu Tian, Huihua Yang, Ruisheng Su

Abstract: The rigid registration of aortic Digital Subtraction Angiography (DSA) and Computed Tomography Angiography (CTA) can provide 3D anatomical details of the vasculature for the interventional surgical treatment of conditions such as aortic dissection and aortic aneurysms, holding significant value for clinical research. However, the current methods for 2D/3D image registration are dependent on manual annotations or synthetic data, as well as the extraction of landmarks, which is not suitable for cross-modal registration of aortic DSA/CTA. In this paper, we propose an unsupervised method, UDCR, for aortic DSA/CTA rigid registration based on deep reinforcement learning. Leveraging the imaging principles and characteristics of DSA and CTA, we have constructed a cross-dimensional registration environment based on spatial transformations. Specifically, we propose an overlap degree calculation reward function that measures the intensity difference between the foreground and background, aimed at assessing the accuracy of registration between segmentation maps and DSA images. This method is highly flexible, allowing for the loading of pre-trained models to perform registration directly or to seek the optimal spatial transformation parameters through online learning. We manually annotated 61 pairs of aortic DSA/CTA for algorithm evaluation. The results indicate that the proposed UDCR achieved a Mean Absolute Error (MAE) of 2.85 mm in translation and 4.35° in rotation, showing significant potential for clinical applications.

Submitted 8 March, 2024; originally announced March 2024.

arXiv:2403.05748 [pdf, other]

Image-Guided Autonomous Guidewire Navigation in Robot-Assisted Endovascular Interventions using Reinforcement Learning

Authors: Wentao Liu, Tong Tian, Weijin Xu, Bowen Liang, Qingsheng Lu, Xipeng Pan, Wenyi Zhao, Huihua Yang, Ruisheng Su

Abstract: Autonomous robots in endovascular interventions possess the potential to navigate guidewires with safety and reliability, while reducing human error and shortening surgical time. However, current methods of guidewire navigation based on Reinforcement Learning (RL) depend on manual demonstration data or magnetic guidance. In this work, we propose an Image-guided Autonomous Guidewire Navigation (IAGN) method. Specifically, we introduce BDA-star, a path planning algorithm with boundary distance constraints, for the trajectory planning of guidewire navigation. We established an IAGN-RL environment where the observations are real-time guidewire feeding images highlighting the position of the guidewire tip and the planned path. We proposed a reward function based on the distances from the guidewire tip to both the planned path and the target to evaluate the agent's actions. Furthermore, in the policy network, we employ a pre-trained convolutional neural network to extract features, mitigating the stability issues and slow convergence rates associated with direct learning from raw pixels. Experiments conducted on the aortic simulation IAGN platform demonstrated that the proposed method, targeting the left subclavian artery and the brachiocephalic artery, achieved a 100% guidewire navigation success rate, along with reduced movement and retraction distances and trajectories that tend toward the center of the vessels.

Submitted 8 March, 2024; originally announced March 2024.

arXiv:2403.05428 [pdf, other]

Towards Real-World Stickers Use: A New Dataset for Multi-Tag Sticker Recognition

Authors: Bingbing Wang, Bin Liang, Chun-Mei Feng, Wangmeng Zuo, Zhixin Bai, Shijue Huang, Kam-Fai Wong, Xi Zeng, Ruifeng Xu

Abstract: In real-world conversations, the diversity and ambiguity of stickers often lead to varied interpretations based on the context, necessitating a comprehensive understanding of stickers and support for multi-tagging. To address this challenge, we introduce StickerTAG, the first multi-tag sticker dataset comprising a collected tag set with 461 tags and 13,571 sticker-tag pairs, designed to provide a deeper understanding of stickers. Recognizing multiple tags for stickers becomes particularly challenging because sticker tags are usually fine-grained and attribute-aware. Hence, we propose an Attentive Attribute-oriented Prompt Learning method, i.e., Att$^2$PL, to capture informative features of stickers in a fine-grained manner to better differentiate tags. Specifically, we first apply an Attribute-oriented Description Generation (ADG) module to obtain the description for stickers from four attributes. Then, a Local Re-attention (LoR) module is designed to perceive the importance of local information. Finally, we use prompt learning to guide the recognition process and adopt confidence penalty optimization to penalize the confident output distribution. Extensive experiments show that our method achieves encouraging results for all commonly used metrics.

Submitted 16 June, 2024; v1 submitted 8 March, 2024; originally announced March 2024.

arXiv:2403.05427 [pdf, other]

Reply with Sticker: New Dataset and Model for Sticker Retrieval

Authors: Bin Liang, Bingbing Wang, Zhixin Bai, Qiwei Lang, Mingwei Sun, Kaiheng Hou, Kam-Fai Wong, Ruifeng Xu

Abstract: Using stickers in online chatting is very prevalent on social media platforms, where the stickers used in the conversation can express someone's intention/emotion/attitude in a vivid, tactful, and intuitive way. Existing sticker retrieval research typically retrieves stickers based on context and the current utterance delivered by the user. That is, the stickers serve as a supplement to the current utterance. However, in the real-world scenario, using stickers to express what we want to say rather than as a supplement to our words only is also important. Therefore, in this paper, we create a new dataset for sticker retrieval in conversation, called StickerInt, where stickers are used to reply to previous conversations or supplement our words. Based on the created dataset, we present a simple yet effective framework for sticker retrieval in conversation based on the learning of intention and the cross-modal relationships between conversation context and stickers, coined as Int-RA. Specifically, we first devise a knowledge-enhanced intention predictor to introduce the intention information into the conversation representations. Subsequently, a relation-aware sticker selector is devised to retrieve the response sticker via cross-modal relationships. Extensive experiments on the created dataset show that the proposed model achieves state-of-the-art performance in sticker retrieval.

Submitted 8 March, 2024; originally announced March 2024.

arXiv:2402.16288 [pdf, other]

PerLTQA: A Personal Long-Term Memory Dataset for Memory Classification, Retrieval, and Synthesis in Question Answering

Authors: Yiming Du, Hongru Wang, Zhengyi Zhao, Bin Liang, Baojun Wang, Wanjun Zhong, Zezhong Wang, Kam-Fai Wong

Abstract: Long-term memory plays a critical role in personal interaction, considering long-term memory can better leverage world knowledge, historical information, and preferences in dialogues. Our research introduces PerLTQA, an innovative QA dataset that combines semantic and episodic memories, including world knowledge, profiles, social relationships, events, and dialogues. This dataset is collected to investigate the use of personalized memories, focusing on social interactions and events in the QA task. PerLTQA features two types of memory and a comprehensive benchmark of 8,593 questions for 30 characters, facilitating the exploration and application of personalized memories in Large Language Models (LLMs). Based on PerLTQA, we propose a novel framework for memory integration and generation, consisting of three main components: Memory Classification, Memory Retrieval, and Memory Synthesis. We evaluate this framework using five LLMs and three retrievers. Experimental results demonstrate that BERT-based classification models significantly outperform LLMs such as ChatGLM3 and ChatGPT in the memory classification task. Furthermore, our study highlights the importance of effective memory integration in the QA task.

Submitted 25 February, 2024; originally announced February 2024.

arXiv:2402.14298 [pdf, other]

Multi-modal Stance Detection: New Datasets and Model

Authors: Bin Liang, Ang Li, Jingqian Zhao, Lin Gui, Min Yang, Yue Yu, Kam-Fai Wong, Ruifeng Xu

Abstract: Stance detection is a challenging task that aims to identify public opinion from social media platforms with respect to specific targets. Previous work on stance detection largely focused on pure texts. In this paper, we study multi-modal stance detection for tweets consisting of texts and images, which are prevalent in today's fast-growing social media platforms where people often post multi-modal messages. To this end, we create five new multi-modal stance detection datasets of different domains based on Twitter, in which each example consists of a text and an image. In addition, we propose a simple yet effective Targeted Multi-modal Prompt Tuning framework (TMPT), where target information is leveraged to learn multi-modal stance features from textual and visual modalities. Experimental results on our five benchmark datasets show that the proposed TMPT achieves state-of-the-art performance in multi-modal stance detection.

Submitted 6 June, 2024; v1 submitted 22 February, 2024; originally announced February 2024.

Comments: ACL'24 Findings

arXiv:2402.14296 [pdf, other]

Mitigating Biases of Large Language Models in Stance Detection with Calibration

Authors: Ang Li, Jingqian Zhao, Bin Liang, Lin Gui, Hui Wang, Xi Zeng, Xingwei Liang, Kam-Fai Wong, Ruifeng Xu

Abstract: Large language models (LLMs) have achieved remarkable progress in many natural language processing tasks. However, our experiment reveals that, in stance detection tasks, LLMs may generate biased stances due to sentiment-stance spurious correlations and preference towards certain individuals and topics, thus harming their performance. Therefore, in this paper, we propose to Mitigate Biases of LLMs in stance detection with Calibration (MB-Cal). To be specific, a novel calibration network is devised to calibrate potential bias in the stance prediction of LLMs. Further, to address the challenge of effectively learning bias representations and the difficulty in the generalizability of debiasing, we construct counterfactual augmented data. This approach enhances the calibration network, facilitating the debiasing and out-of-domain generalization. Experimental results on in-target and zero-shot stance detection tasks show that the proposed MB-Cal can effectively mitigate biases of LLMs, achieving state-of-the-art results.

Submitted 16 June, 2024; v1 submitted 22 February, 2024; originally announced February 2024.

arXiv:2402.14228 [pdf, other]

COPR: Continual Human Preference Learning via Optimal Policy Regularization

Authors: Han Zhang, Lin Gui, Yu Lei, Yuanzhao Zhai, Yehong Zhang, Yulan He, Hui Wang, Yue Yu, Kam-Fai Wong, Bin Liang, Ruifeng Xu

Abstract: Reinforcement Learning from Human Feedback (RLHF) is commonly utilized to improve the alignment of Large Language Models (LLMs) with human preferences. Given the evolving nature of human preferences, continual alignment becomes more crucial and practical in comparison to traditional static alignment. Nevertheless, making RLHF compatible with Continual Learning (CL) is challenging due to its complex process. Meanwhile, directly learning new human preferences may lead to Catastrophic Forgetting (CF) of historical preferences, resulting in helpless or harmful outputs. To overcome these challenges, we propose the Continual Optimal Policy Regularization (COPR) method, which draws inspiration from the optimal policy theory. COPR utilizes a sampling distribution as a demonstration and regularization constraints for CL. It adopts the Lagrangian Duality (LD) method to dynamically regularize the current policy based on the historically optimal policy, which prevents CF and avoids over-emphasizing unbalanced objectives. We also provide formal proof for the learnability of COPR. The experimental results show that COPR outperforms strong CL baselines on our proposed benchmark in terms of reward-based evaluation, GPT-4 evaluation, and human assessment. Furthermore, we validate the robustness of COPR under various CL settings, including different backbones, replay memory sizes, and learning orders.

Submitted 27 February, 2024; v1 submitted 21 February, 2024; originally announced February 2024.

arXiv:2402.13091 [pdf, other]

Gravitational Wave Signal Extraction Against Non-Stationary Instrumental Noises with Deep Neural Network

Authors: Yuxiang Xu, Minghui Du, Peng Xu, Bo Liang, He Wang

Abstract: Space-borne gravitational wave antennas, such as LISA and LISA-like missions (Taiji and Tianqin), will offer novel perspectives for exploring our Universe while introducing new challenges, especially in data analysis. Aside from the known challenges like high parameter space dimension, superposition of a large number of signals, etc., gravitational wave detections in space would be more seriously affected by anomalies or non-stationarities in the science measurements. Considering the three types of foreseeable non-stationarities, including data gaps, transients (glitches), and time-varying noise auto-correlations, which may come from routine maintenance or unexpected disturbances during science operations, we developed a deep learning model for accurate signal extraction when confronted with such anomalous scenarios. Our model exhibits the same performance as the current state-of-the-art models do for the ideal and anomaly-free scenario, while showing remarkable adaptability in extracting coalescing massive black hole binary signals against all three types of non-stationarities and even their mixtures. This also provides new explorations into the robustness studies of deep learning models for data processing in space-borne gravitational wave missions.

Submitted 29 May, 2024; v1 submitted 20 February, 2024; originally announced February 2024.

Comments: 12 pages, 11 figures, 5 tables

arXiv:2402.07412 [pdf, other]

Auxiliary Reward Generation with Transition Distance Representation Learning

Authors: Siyuan Li, Shijie Han, Yingnan Zhao, By Liang, Peng Liu

Abstract: Reinforcement learning (RL) has shown its strength in challenging sequential decision-making problems. The reward function in RL is crucial to the learning performance, as it serves as a measure of the task completion degree. In real-world problems, the rewards are predominantly human-designed, which requires laborious tuning, and is easily affected by human cognitive biases. To achieve automatic auxiliary reward generation, we propose a novel representation learning approach that can measure the "transition distance" between states. Building upon these representations, we introduce an auxiliary reward generation technique for both single-task and skill-chaining scenarios without the need for human knowledge. The proposed approach is evaluated in a wide range of manipulation tasks. The experiment results demonstrate the effectiveness of measuring the transition distance between states and the induced improvement by auxiliary rewards, which not only promotes better learning efficiency but also increases convergent stability.

Submitted 12 February, 2024; originally announced February 2024.

arXiv:2402.04933 [pdf, other]

A Bayesian Approach to Online Learning for Contextual Restless Bandits with Applications to Public Health

Authors: Biyonka Liang, Lily Xu, Aparna Taneja, Milind Tambe, Lucas Janson

Abstract: Public health programs often provide interventions to encourage beneficiary adherence, and effectively allocating interventions is vital for producing the greatest overall health outcomes. Such resource allocation problems are often modeled as restless multi-armed bandits (RMABs) with unknown underlying transition dynamics, hence requiring online reinforcement learning (RL). We present Bayesian Learning for Contextual RMABs (BCoR), an online RL approach for RMABs that novelly combines techniques in Bayesian modeling with Thompson sampling to flexibly model the complex RMAB settings present in public health program adherence problems, such as context and non-stationarity. BCoR's key strength is the ability to leverage shared information within and between arms to learn the unknown RMAB transition dynamics quickly in intervention-scarce settings with relatively short time horizons, which is common in public health applications. Empirically, BCoR achieves substantially higher finite-sample performance over a range of experimental settings, including an example based on real-world adherence data that was developed in collaboration with ARMMAN, an NGO in India which runs a large-scale maternal health program, showcasing BCoR's practical utility and potential for real-world deployment.

Submitted 27 May, 2024; v1 submitted 7 February, 2024; originally announced February 2024.

Comments: 26 pages, 18 figures

arXiv:2401.09819 [pdf, other]

PPNet: A Two-Stage Neural Network for End-to-end Path Planning

Authors: Qinglong Meng, Chongkun Xia, Xueqian Wang, Songping Mai, Bin Liang

Abstract: The classical path planners, such as sampling-based path planners, can provide probabilistic completeness guarantees in the sense that the probability that the planner fails to return a solution, if one exists, decays to zero as the number of samples approaches infinity. However, finding a near-optimal feasible solution in a given period is challenging in many applications such as the autonomous vehicle. To achieve an end-to-end near-optimal path planner, we first divide the path planning problem into two subproblems, which are path space segmentation and waypoints generation in the given path's space. We further propose a two-stage neural network named Path Planning Network (PPNet), in which each stage solves one of the abovementioned subproblems. Moreover, we propose a novel efficient data generation method for path planning named EDaGe-PP. EDaGe-PP can generate data with continuous-curvature paths with analytical expression while satisfying the clearance requirement. The results show that, compared to other methods, the total computation time of generating random 2D path planning data is less than 1/33 and the success rate of PPNet trained on the dataset generated by EDaGe-PP is about 2 times as high. We validate PPNet against state-of-the-art path planning methods. The results show that PPNet can find a near-optimal solution in 15.3ms, which is much shorter than the state-of-the-art path planners.

Submitted 23 April, 2024; v1 submitted 18 January, 2024; originally announced January 2024.

arXiv:2401.09214 [pdf, ps, other]

Conditional sofic mean dimension

Authors: Bingbing Liang

Abstract: We undertake a study of the conditional mean dimensions for a factor map between continuous actions of a sofic group on two compact metrizable spaces. When the group is infinitely amenable, all these concepts recover as the conditional mean dimensions introduced in \cite{L22}. A range of results established for actions of amenable groups are extended to the sofic framework. Additionally, our exploration encompasses the study of the relative mean dimension introduced by Tsukamoto, shedding light on its inherent correlation with the conditional metric mean dimension within the sofic context. A lower bound on the conditional metric mean dimension, originally proposed by Shi-Tsukamoto, is extended to the sofic case.

Submitted 17 January, 2024; originally announced January 2024.

Comments: 37 pages. Comments are welcome!

MSC Class: 37B02(Primary); 54E45(Secondary)

arXiv:2401.00747 [pdf, other]

Polynomial-time Approximation Scheme for Equilibriums of Games

Authors: Hongbo Sun, Chongkun Xia, Junbo Tan, Bo Yuan, Xueqian Wang, Bin Liang

Abstract: Whether a PTAS (polynomial-time approximation scheme) exists for equilibriums of games has been an open question, which relates to questions in three fields: the practicality of methods in algorithmic game theory, the equation PPAD=FP about the two complexity classes in computational complexity theory, and the non-stationarity and curse of multiagency in MARL (multi-agent reinforcement learning). This paper introduces our discovery of the sufficient and necessary conditions for iterations based on dynamic programming and line search to approximate perfect equilibriums of dynamic games, out of which we construct a method proved to be an FPTAS (fully PTAS) for non-singular perfect equilibriums of dynamic games, where for almost any given dynamic game, all its perfect equilibriums are non-singular, indicating that FP$\subseteq$PPAD$\subseteq$Almost-FP. Our discovery consists of cone interior dynamic programming and primal-dual unbiased regret minimization, which fit into existing theories by degeneration in a structure-preserving manner. The former enables a dynamic programming operator to iteratively converge to a perfect equilibrium based on a concept called policy cone. The latter enables an interior-point line search to approximate a Nash equilibrium based on two concepts called primal-dual bias and unbiased central variety, solving a subproblem of the former. Validity of our discovery is cross-corroborated by a combination of theorem proofs, graphs of the three main concepts, and experimental results.

Submitted 3 June, 2024; v1 submitted 1 January, 2024; originally announced January 2024.

Comments: 23 pages, 7 figures, code and animation are available at https://github.com/shb20tsinghua/PTAS_Game/tree/main

MSC Class: 90C39; 90C51; 91A15

arXiv:2312.13424 [pdf, ps, other]

Multi-Model Wireless Federated Learning with Downlink Beamforming

Authors: Chong Zhang, Min Dong, Ben Liang, Ali Afana, Yahia Ahmed

Abstract: This paper studies the design of wireless federated learning (FL) for simultaneously training multiple machine learning models. We consider round-robin device-model assignment and downlink beamforming for concurrent multiple model updates. After formulating the joint downlink-uplink transmission process, we derive the per-model global update expression over communication rounds, capturing the effect of beamforming and noisy reception. To maximize the multi-model training convergence rate, we derive an upper bound on the optimality gap of the global model update and use it to formulate a multi-group multicast beamforming problem. We show that this problem can be converted to minimizing the sum of inverse received signal-to-interference-plus-noise ratios, which can be solved efficiently by projected gradient descent. Simulation shows that our proposed multi-model FL solution outperforms other alternatives, including conventional single-model sequential training and multi-model zero-forcing beamforming.

Submitted 14 January, 2024; v1 submitted 20 December, 2023; originally announced December 2023.

Comments: 6 pages, 4 figures. Accepted by IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), 2024

arXiv:2312.06302 [pdf]

Non-iterative Methods in Inhomogeneous Background Inverse Scattering Imaging Problem Assisted by Swin Transformer Network

Authors: Naike Du, Tiantian Yin, Jing Wang, Rencheng Song, Kuiwen Xu, Bingyuan Liang, Sheng Sun, Xiuzhu Ye

Abstract: A deep learning-assisted inversion method is proposed to solve the inhomogeneous background imaging problem. Three non-iterative methods, namely the distorted-Born (DB) major current coefficients method, the DB modified Born approximation method, and the DB connection method, are introduced to address the inhomogeneous background inverse scattering problem. These methods retain the multiple scattering information by utilizing the major current obtained through singular value decomposition of the Green's function and the scattered field, without resorting to optimization techniques. As a result, the proposed methods offer improved reconstruction resolution and accuracy for unknown objects embedded in inhomogeneous backgrounds, surpassing the backpropagation scheme (BPS) and Born approximation (BA) method that disregard the multiple scattering effect. To further enhance the resolution and accuracy of the reconstruction, a Shifted-Window (Swin) transformer network is employed for capturing super-resolution information in the images. The attention mechanism incorporated in the shifted window facilitates global interactions between objects, thereby enhancing the performance of the inhomogeneous background imaging algorithm while reducing computational complexity. Moreover, an adaptive training method is proposed to enhance the generalization ability of the network. The effectiveness of the proposed methods is demonstrated through both synthetic data and experimental data. Notably, super-resolution imaging is achieved with quasi real-time speed, indicating promising application potential for the proposed algorithms.

Submitted 11 December, 2023; originally announced December 2023.

Comments: We have submitted this paper to TGRS (IEEE Transactions on Geoscience and Remote Sensing) on 29-Jan-2023; and resubmitted on 12-Jul-2023

arXiv:2311.17694 [pdf, other]

Intrinsic Electronic Structure and Nodeless Superconducting Gap of $\mathrm{YBa_{2} Cu_{3} O_{7-δ} }$ Observed by Spatially-Resolved Laser-Based Angle Resolved Photoemission Spectroscopy

Authors: Shuaishuai Li, Taimin Miao, Chaohui Yin, Yinghao Li, Hongtao Yan, Yiwen Chen, Bo Liang, Hao Chen, Wenpei Zhu, Shenjin Zhang, Zhimin Wang, Fengfeng Zhang, Feng Yang, Qinjun Peng, Chengtian Lin, Hanqing Mao, Guodong Liu, Zuyan Xu, Lin Zhao, X. J. Zhou

Abstract: Spatially-resolved laser-based high-resolution ARPES measurements have been performed on the optimally-doped $\mathrm{YBa_{2} Cu_{3} O_{7-δ} }$ (Y123) superconductor. For the first time, we found a region of the cleaved surface that reveals clear bulk electronic properties. The intrinsic Fermi surface and band structures of Y123 are observed. The Fermi surface-dependent and momentum-dependent superconducting gap is determined, which is nodeless and consistent with the d+is gap form.

Submitted 29 November, 2023; originally announced November 2023.

Journal ref: Chinese Physics B 32, 117401 (2023)

arXiv:2311.06122 [pdf, other]

Fight Fire with Fire: Combating Adversarial Patch Attacks using Pattern-randomized Defensive Patches

Authors: Jianan Feng, Jiachun Li, Changqing Miao, Jianjun Huang, Wei You, Wenchang Shi, Bin Liang

Abstract: Object detection has found extensive applications in various tasks, but it is also susceptible to adversarial patch attacks. Existing defense methods often necessitate modifications to the target model or result in unacceptable time overhead. In this paper, we adopt a counterattack approach, following the principle of "fight fire with fire," and propose a novel and general methodology for defending against adversarial attacks. We utilize an active defense strategy by injecting two types of defensive patches, canary and woodpecker, into the input to proactively probe or weaken potential adversarial patches without altering the target model. Moreover, inspired by randomization techniques employed in software security, we employ randomized canary and woodpecker injection patterns to defend against defense-aware attacks. The effectiveness and practicality of the proposed method are demonstrated through comprehensive experiments. The results illustrate that canary and woodpecker achieve high performance, even when confronted with unknown attack methods, while incurring limited time overhead. Furthermore, our method also exhibits sufficient robustness against defense-aware attacks, as evidenced by adaptive attack experiments.

Submitted 10 November, 2023; originally announced November 2023.

arXiv:2311.05794 [pdf, other]

An Experimental Design for Anytime-Valid Causal Inference on Multi-Armed Bandits

Authors: Biyonka Liang, Iavor Bojinov

Abstract: Experimentation is crucial for managers to rigorously quantify the value of a change and determine if it leads to a statistically significant improvement over the status quo, thus augmenting their decision-making. Many companies now mandate that all changes undergo experimentation, presenting two challenges: (1) reducing the risk/cost of experimentation by minimizing the proportion of customers assigned to the inferior treatment and (2) increasing the experimentation velocity by enabling managers to stop experiments as soon as results are statistically significant. This paper simultaneously addresses both challenges by proposing the Mixture Adaptive Design (MAD), a new experimental design for multi-armed bandit (MAB) algorithms that enables anytime-valid inference on the Average Treatment Effect (ATE) for any MAB algorithm. Intuitively, the MAD "mixes" any bandit algorithm with a Bernoulli design such that at each time step, the probability that a customer is assigned via the Bernoulli design is controlled by a user-specified deterministic sequence that can converge to zero. The sequence enables managers to directly and interpretably control the trade-off between regret minimization and inferential precision. Under mild conditions on the rate at which the sequence converges to zero, we provide a confidence sequence that is asymptotically anytime valid and demonstrate that the MAD is guaranteed to have a finite stopping time in the presence of a true non-zero ATE. Hence, the MAD allows managers to stop experiments early when a significant ATE is detected while ensuring valid inference, enhancing both the efficiency and reliability of adaptive experiments. Empirically, we demonstrate that the MAD achieves finite-sample anytime-validity while accurately and precisely estimating the ATE, all without incurring significant losses in reward compared to standard bandit designs.

Submitted 14 June, 2024; v1 submitted 9 November, 2023; originally announced November 2023.
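
The mixing step described in the abstract above can be illustrated with a minimal Python sketch. It is not the authors' implementation: the function names, the uniform form of the Bernoulli design, and the schedule delta_t = t^(-0.24) are all assumptions made here for illustration. The abstract only states that a user-specified deterministic sequence controls how often the Bernoulli design is used.

```python
import random


def delta_schedule(t, exponent=0.24):
    """Hypothetical deterministic sequence delta_t = t**(-exponent), t >= 1.

    The exponent is an illustrative assumption; the abstract only requires a
    user-specified sequence that may converge to zero.
    """
    return t ** (-exponent)


def mad_assignment_probs(bandit_probs, delta_t):
    """Mix the bandit's arm probabilities with a uniform Bernoulli design.

    With weight delta_t the arm is drawn uniformly at random (assumed form of
    the Bernoulli design); with weight 1 - delta_t the bandit's own
    probabilities are used.
    """
    k = len(bandit_probs)
    return [delta_t / k + (1.0 - delta_t) * p for p in bandit_probs]


def assign(bandit_probs, t, rng):
    """Draw one arm at time step t from the mixed distribution."""
    probs = mad_assignment_probs(bandit_probs, delta_schedule(t))
    arm = rng.choices(range(len(probs)), weights=probs, k=1)[0]
    return arm, probs


if __name__ == "__main__":
    rng = random.Random(0)
    # A bandit that strongly prefers arm 0 is still forced to explore early on.
    for t in (1, 10, 1000):
        arm, probs = assign([0.95, 0.05], t, rng)
        print(t, arm, [round(p, 3) for p in probs])
```

In this sketch a slowly decaying sequence keeps every arm's assignment probability bounded away from zero for longer, which is one way to read the trade-off between regret minimization and inferential precision mentioned in the abstract.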

arXiv:2310.19510 [pdf, other]

Photophysics of O-band and transition metal color centers in monolithic silicon for quantum communications

Authors: Murat Can Sarihan, Jiahui Huang, Jin Ho Kang, Cody Fan, Wei Liu, Khalifa M. Azizur-Rahman, Baolai Liang, Chee Wei Wong

Abstract: Color centers at the low-dispersion O-band wavelengths are an essential resource for long-lifetime quantum network nodes toward memory-assisted quantum communications using energy-time entanglement. In this work, we explore the process of developing T centers and other color center defects to improve qubit storage and radiative efficiency while examining the photoluminescence dynamics. We have extended the $TX_{0}$ lifetime of T centers by 65% to 1.56 $μ$s. Furthermore, we discover the presence of a $^*Cu_n^m$ related doublet emission around 1312 nm close to the zero-dispersion wavelength, with a spin degeneracy resulting in a magnetic-field induced broadening by 25% under 0.5 T, which can be an alternative to T centers as a high-fidelity spin-photon interface.

Submitted 1 December, 2023; v1 submitted 30 October, 2023; originally announced October 2023.

Comments: 11 pages, 5 figures

arXiv:2310.13800 [pdf, other]

Evaluation Metrics in the Era of GPT-4: Reliably Evaluating Large Language Models on Sequence to Sequence Tasks

Authors: Andrea Sottana, Bin Liang, Kai Zou, Zheng Yuan

Abstract: Large Language Models (LLMs) evaluation is a patchy and inconsistent landscape, and it is becoming clear that the quality of automatic evaluation metrics is not keeping up with the pace of development of generative models. We aim to improve the understanding of current models' performance by providing a preliminary and hybrid evaluation on a range of open and closed-source generative LLMs on three NLP benchmarks: text summarisation, text simplification and grammatical error correction (GEC), using both automatic and human evaluation. We also explore the potential of the recently released GPT-4 to act as an evaluator. We find that ChatGPT consistently outperforms many other popular models according to human reviewers on the majority of metrics, while scoring much more poorly when using classic automatic evaluation metrics. We also find that human reviewers rate the gold reference as much worse than the best models' outputs, indicating the poor quality of many popular benchmarks. Finally, we find that GPT-4 is capable of ranking models' outputs in a way which aligns reasonably closely to human judgement despite task-specific variations, with a lower alignment in the GEC task.

Submitted 20 October, 2023; originally announced October 2023.

Comments: Accepted at EMNLP 2023

arXiv:2310.02572 [pdf, other]

Improving Knowledge Distillation with Teacher's Explanation

Authors: Sayantan Chowdhury, Ben Liang, Ali Tizghadam, Ilijc Albanese

Abstract: Knowledge distillation (KD) improves the performance of a low-complexity student model with the help of a more powerful teacher. The teacher in KD is a black-box model, imparting knowledge to the student only through its predictions. This limits the amount of transferred knowledge. In this work, we introduce a novel Knowledge Explaining Distillation (KED) framework, which allows the student to learn not only from the teacher's predictions but also from the teacher's explanations. We propose a class of superfeature-explaining teachers that provide explanation over groups of features, along with the corresponding student model. We also present a method for constructing the superfeatures. We then extend KED to reduce complexity in convolutional neural networks, to allow augmentation with hidden-representation distillation methods, and to work with a limited amount of training data using chimeric sets. Our experiments over a variety of datasets show that KED students can substantially outperform KD students of similar complexity.

Submitted 4 October, 2023; originally announced October 2023.

arXiv:2309.15183 [pdf, other]

doi 10.1145/3618334

The Shortest Route Is Not Always the Fastest: Probability-Modeled Stereoscopic Eye Movement Completion Time in VR

Authors: Budmonde Duinkharjav, Benjamin Liang, Anjul Patney, Rachel Brown, Qi Sun

Abstract: Speed and consistency of target-shifting play a crucial role in human ability to perform complex tasks. Shifting our gaze between objects of interest quickly and consistently requires changes both in depth and direction. Gaze changes in depth are driven by slow, inconsistent vergence movements which rotate the eyes in opposite directions, while changes in direction are driven by ballistic, consistent movements called saccades, which rotate the eyes in the same direction. In the natural world, most of our eye movements are a combination of both types. While scientific consensus on the nature of saccades exists, vergence and combined movements remain less understood and agreed upon. We eschew the lack of scientific consensus in favor of proposing an operationalized computational model which predicts the speed of any type of gaze movement during target-shifting in 3D. To this end, we conduct a psychophysical study in a stereo VR environment to collect more than 12,000 gaze movement trials, analyze the temporal distribution of the observed gaze movements, and fit a probabilistic model to the data. We perform a series of objective measurements and user studies to validate the model. The results demonstrate its predictive accuracy, generalization, as well as applications for optimizing visual performance by altering content placement. Lastly, we leverage the model to measure differences in human target-changing time relative to the natural world, as well as suggest scene-aware projection depth. By incorporating the complexities and randomness of human oculomotor control, we hope this research will support new behavior-aware metrics for VR/AR display design, interface layout, and gaze-contingent rendering.

Submitted 3 October, 2023; v1 submitted 26 September, 2023; originally announced September 2023.

arXiv:2309.14720 [pdf, other]

Learning to Assist Different Wearers in Multitasks: Efficient and Individualized Human-In-the-Loop Adaption Framework for Exoskeleton Robots

Authors: Yu Chen, Gong Chen, Jing Ye, Chenglong Fu, Bin Liang, Xiang Li

Abstract: One of the typical purposes of using lower-limb exoskeleton robots is to provide assistance to the wearer by supporting their weight and augmenting their physical capabilities according to a given task and human motion intentions. The generalizability of robots across different wearers in multiple tasks is important to ensure that the robot can provide correct and effective assistance in actual implementation. However, most lower-limb exoskeleton robots exhibit only limited generalizability. Therefore, this paper proposes a human-in-the-loop learning and adaptation framework for exoskeleton robots to improve their performance in various tasks and for different wearers. To suit different wearers, an individualized walking trajectory is generated online using dynamic movement primitives and Bayes optimization. To accommodate various tasks, a task translator is constructed using a neural network to generalize a trajectory to more complex scenarios. These generalization techniques are integrated into a unified variable impedance model, which regulates the exoskeleton to provide assistance while ensuring safety. In addition, an anomaly detection network is developed to quantitatively evaluate the wearer's comfort, which is considered in the trajectory learning procedure and contributes to the relaxation of conflicts in impedance control. The proposed framework is easy to implement, because it requires only proprioceptive sensors to perform and deploy data-efficient learning schemes. This makes the exoskeleton practical for deployment in complex scenarios, accommodating different walking patterns, habits, tasks, and conflicts. Experiments and comparative studies on a lower-limb exoskeleton robot are performed to demonstrate the effectiveness of the proposed framework.

Submitted 26 September, 2023; originally announced September 2023.

Comments: 16 pages journal article

arXiv:2309.09167 [pdf]

From Knowing to Doing: Learning Diverse Motor Skills through Instruction Learning

Authors: Linqi Ye, Jiayi Li, Yi Cheng, Xianhao Wang, Bin Liang, Yan Peng

Abstract: Recent years have witnessed many successful trials in the robot learning field. For contact-rich robotic tasks, it is challenging to learn coordinated motor skills by reinforcement learning. Imitation learning solves this problem by using a mimic reward to encourage the robot to track a given reference trajectory. However, imitation learning is not so efficient and may constrain the learned motion. In this paper, we propose instruction learning, which is inspired by the human learning process and is highly efficient, flexible, and versatile for robot motion learning. Instead of using a reference signal in the reward, instruction learning applies a reference signal directly as a feedforward action, and it is combined with a feedback action learned by reinforcement learning to control the robot. Besides, we propose the action bounding technique and remove the mimic reward, which is shown to be crucial for efficient and flexible learning. We compare the performance of instruction learning with imitation learning, indicating that instruction learning can greatly speed up the training process and guarantee learning the desired motion correctly. The effectiveness of instruction learning is validated through a range of motion learning examples for a biped robot and a quadruped robot, where skills can typically be learned within several million steps. Besides, we also conduct sim-to-real transfer and online learning experiments on a real quadruped robot. Instruction learning has shown great merits and potential, making it a promising alternative to imitation learning.

Submitted 1 November, 2023; v1 submitted 17 September, 2023; originally announced September 2023.
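
The feedforward-plus-feedback structure described in the abstract above can be sketched in a few lines of Python. This is a hedged illustration only: the function names are hypothetical, and symmetric clipping is assumed as the form of "action bounding", which the abstract does not specify.

```python
def bounded(value, limit):
    """Clip a scalar to [-limit, limit] (assumed form of action bounding)."""
    return max(-limit, min(limit, value))


def instruction_action(reference, feedback, bound):
    """Combine a feedforward reference with a bounded feedback correction.

    reference: per-joint reference signal applied directly as an action
    feedback:  per-joint output of a learned policy (placeholder values here)
    bound:     hypothetical limit on how far feedback may shift the reference
    """
    if len(reference) != len(feedback):
        raise ValueError("reference and feedback must have the same length")
    return [r + bounded(f, bound) for r, f in zip(reference, feedback)]


if __name__ == "__main__":
    # The first joint's correction is clipped to 0.1; the second passes through.
    print(instruction_action([0.5, -0.2], [2.0, -0.05], bound=0.1))
```

Because the reference enters the action directly, no tracking term is needed in the reward in this sketch; the policy only has to learn a small correction around the reference.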

arXiv:2309.04989 [pdf]

Non-zero Integral Spin of Acoustic Vortices and Spin-orbit Interaction in Longitudinal Acoustics

Authors: Wei Wang, Yang Tan, Jingjing Liu, Bin Liang, Jianchun Cheng

Abstract: Spin and orbital angular momenta (AM) are of fundamental interest in wave physics. The acoustic wave, as a typical longitudinal wave, has been well studied in terms of orbital AM, but is still considered unable to carry non-zero integral spin AM or spin-orbital interaction in homogeneous media due to its spin-0 nature. Here we give the first self-consistent analytical calculations of spin, orbital and total AM of guided vortices under different boundary conditions, revealing that a vortex field can carry non-zero integral spin AM. We also introduce for acoustic waves the canonical-Minkowski and kinetic-Abraham AM, which have aroused long-lasting debate in optics, and prove that only the former is conserved with the corresponding symmetries. Furthermore, we present the theoretical and experimental observation of the spin-orbit interaction of vortices in longitudinal acoustics, which was thought to be unattainable in longitudinal waves in the absence of a spin degree of freedom. Our work provides a solid platform for future studies of the spin and orbital AM of guided acoustic waves and may open up a new dimension for acoustic vortex-based applications such as underwater communications and object manipulations.

Submitted 10 September, 2023; originally announced September 2023.

arXiv:2309.04056 [pdf, ps, other]

Multi-discontinuous Functional based Sliding Mode Cascade Observer for Estimation and Closed-loop Compensation Controller

Authors: Yiyong Sun, Zhang Chen, Guang Zhai, Bin Liang

Abstract: The sliding mode observer is a useful method for estimating the system state and the unknown disturbance. However, the traditional single-layer observer might still suffer from high pulse when the output measurement is mixed with noise. To improve the estimation quality, a new cascade sliding mode observer containing multiple discontinuous functions is proposed in this letter. The proposed observer consists of two layers: the first layer is a traditional sliding mode observer, and the second layer is a cascade observer. The measurement noise issue is considered in the source system model. A method for designing the observer gains of the two layers, together with a way to examine the effectiveness of the compensator-based closed-loop system, is offered. A numerical example is provided to demonstrate the effectiveness of the proposed method. The observation structure proposed in this letter not only smooths the estimated state but also reduces the control consumption.

Submitted 26 October, 2023; v1 submitted 7 September, 2023; originally announced September 2023.

arXiv:2309.01148 [pdf, other]

doi 10.1038/s41467-024-48701-7

Orbital-Dependent Electron Correlation in Double-Layer Nickelate La3Ni2O7

Authors: Jiangang Yang, Hualei Sun, Xunwu Hu, Yuyang Xie, Taimin Miao, Hailan Luo, Hao Chen, Bo Liang, Wenpei Zhu, Gexing Qu, Cui-Qun Chen, Mengwu Huo, Yaobo Huang, Shenjin Zhang, Fengfeng Zhang, Feng Yang, Zhimin Wang, Qinjun Peng, Hanqing Mao, Guodong Liu, Zuyan Xu, Tian Qian, Dao-Xin Yao, Meng Wang, Lin Zhao, et al. (1 additional author not shown)

Abstract: The latest discovery of high temperature superconductivity near 80 K in La3Ni2O7 under high pressure has attracted much attention. Many proposals are put forth to understand the origin of superconductivity. The determination of electronic structures is a prerequisite to establish theories to understand superconductivity in nickelates but is still lacking. Here we report our direct measurement of the electronic structures of La3Ni2O7 by high-resolution angle-resolved photoemission spectroscopy. The Fermi surface and band structures of La3Ni2O7 are observed and compared with the band structure calculations. Strong electron correlations are revealed which are orbital- and momentum-dependent. A flat band is formed from the Ni-3dz2 orbitals around the zone corner which is ~50 meV below the Fermi level and exhibits the strongest electron correlation. In many theoretical proposals, this band is expected to play the dominant role in generating superconductivity in La3Ni2O7. Our observations provide key experimental information to understand the electronic structure and origin of high temperature superconductivity in La3Ni2O7.

Submitted 2 June, 2024; v1 submitted 3 September, 2023; originally announced September 2023.

Comments: 20 pages, 5 figures

Journal ref: Nature Communications 15, 4373 (2024)

arXiv:2308.16079 [pdf, ps, other]

Entanglement Dynamics of two Non-Hermitian Qubits

Authors: Yi-Xi Zhang, Zhen-Tao Zhang, Xiao-Zhi Wei, Bao-Long Liang, Feng Mei, Zhen-Shan Yang

Abstract: The evolution of entanglement in a non-Hermitian quantum system may behave differently compared to its Hermitian counterpart. In this paper, we investigate the entanglement dynamics of two coupled and driven non-Hermitian qubits. By calculating the concurrence of the system, we find that the evolution of the bipartite entanglement manifests two distinct patterns in the parameter space. In the low non-Hermiticity regime, the concurrence oscillates significantly, while in the opposite regime the same quantity tends toward a stable value. We attribute this phenomenon to the parity-time ($\mathcal{PT}$) symmetry phase transition. In addition, we have also studied the effect of decoherence on the entanglement dynamics. Our research provides a method to stabilize entanglement by exploiting non-Hermiticity.

Submitted 30 August, 2023; originally announced August 2023.

Comments: 6 pages, 5 figures

arXiv:2308.05510 [pdf, other]

doi 10.1007/s11433-023-2270-7

Advancing Space-Based Gravitational Wave Astronomy: Rapid Parameter Estimation via Normalizing Flows

Authors: Minghui Du, Bo Liang, He Wang, Peng Xu, Ziren Luo, Yueliang Wu

Abstract: Gravitational wave (GW) astronomy is witnessing a transformative shift from terrestrial to space-based detection, with missions like Taiji at the forefront. While the transition brings unprecedented opportunities for exploring massive black hole binaries (MBHBs), it also imposes complex challenges in data analysis, particularly in parameter estimation amidst confusion noise. Addressing this gap, we utilize scalable normalizing flow models to achieve rapid and accurate inference within the Taiji environment. Innovatively, our approach simplifies the data's complexity, employs a transformation mapping to overcome the year-period time-dependent response function, and unveils additional multimodality in the arrival time parameter. Our method estimates MBHBs several orders of magnitude faster than conventional techniques, maintaining high accuracy even in complex backgrounds. These findings significantly enhance the efficiency of GW data analysis, paving the way for rapid detection and alerting systems and enriching our ability to explore the universe through space-based GW observation.

Submitted 20 February, 2024; v1 submitted 10 August, 2023; originally announced August 2023.

Comments: 14 pages, 7 figures. Published version

Journal ref: SCIENCE CHINA Physics, Mechanics & Astronomy, Volume 67, Issue 3: 230412 (2024)

Showing 1–50 of 242 results for author: Liang, B